How to download the model with the transformers library

#6 by Rick10 - opened

Hi,
How can we download the model with the transformers library? The script provided here does not work.

Script:

# Load model directly

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("neuralmagic/Meta-Llama-3-8B-Instruct-FP8")
model = AutoModelForCausalLM.from_pretrained("neuralmagic/Meta-Llama-3-8B-Instruct-FP8")

Error:
ValueError: Unknown quantization type, got fp8 - supported types are: ['awq', 'bitsandbytes_4bit', 'bitsandbytes_8bit', 'gptq', 'aqlm', 'quanto', 'eetq', 'hqq']

Neural Magic org

Hi @Rick10 - what version of transformers are you using?
To use the AutoModelForCausalLM.from_pretrained pathway, you'll need v4.45.
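
You can check what's installed with something like this, and upgrade through pip if needed:

import transformers

# the AutoModelForCausalLM.from_pretrained pathway needs v4.45;
# upgrade with: pip install -U transformers
print(transformers.__version__)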

Hi @dsikka - Thanks for the info. I updated to 4.45.2, but it still raises ValueError: Unknown quantization type, got fp8 - supported types are: ['awq', 'bitsandbytes_4bit', 'bitsandbytes_8bit', 'gptq', 'aqlm', 'quanto', 'eetq', 'hqq', 'compressed-tensors', 'fbgemm_fp8', 'torchao']

Neural Magic org

@Rick10 The model is not quantized through compressed-tensors, which is why it is throwing an error.
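
You can confirm this by inspecting the checkpoint's config.json - the quant_method there is the "fp8" value from your traceback, not compressed-tensors. A minimal sketch using huggingface_hub:

from huggingface_hub import hf_hub_download
import json

# fetch just the config file, not the full checkpoint
config_path = hf_hub_download(
    repo_id="neuralmagic/Meta-Llama-3-8B-Instruct-FP8",
    filename="config.json",
)
with open(config_path) as f:
    config = json.load(f)

# prints "fp8" - the experimental AutoFP8 format transformers can't load
print(config["quantization_config"]["quant_method"])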

Neural Magic org

@Rick10 - this is an older checkpoint with an experimental format (AutoFP8)

Can you use neuralmagic/Meta-Llama-3.1-8B-Instruct-FP8-dynamic?
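
With transformers v4.45+ that checkpoint should load through the standard pathway; roughly (untested sketch):

from transformers import AutoTokenizer, AutoModelForCausalLM

# requires transformers >= 4.45 for compressed-tensors checkpoints
model_id = "neuralmagic/Meta-Llama-3.1-8B-Instruct-FP8-dynamic"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)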

Thanks, I think I can just use snapshot_download to download the model.
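
For anyone landing here, that looks roughly like this (local_dir is just an example path):

from huggingface_hub import snapshot_download

# downloads every file in the repo to a local folder;
# adjust local_dir as needed
snapshot_download(
    repo_id="neuralmagic/Meta-Llama-3-8B-Instruct-FP8",
    local_dir="./Meta-Llama-3-8B-Instruct-FP8",
)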

Rick10 changed discussion status to closed
