How to download the model with the transformers library

#6 by Rick10 - opened

Hi,
How can we download the model with the transformers library? The script provided here does not work.

Script:

# Load model directly

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("neuralmagic/Meta-Llama-3-8B-Instruct-FP8")
model = AutoModelForCausalLM.from_pretrained("neuralmagic/Meta-Llama-3-8B-Instruct-FP8")

Error:
ValueError: Unknown quantization type, got fp8 - supported types are: ['awq', 'bitsandbytes_4bit', 'bitsandbytes_8bit', 'gptq', 'aqlm', 'quanto', 'eetq', 'hqq']

Neural Magic org

Hi @Rick10 - what version of transformers are you using?
To use the AutoModelForCausalLM.from_pretrained pathway, you'll need v4.45.
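
You can check what's installed with something like this, and upgrade through pip if needed:

import transformers

# the AutoModelForCausalLM.from_pretrained pathway needs v4.45;
# upgrade with: pip install -U transformers
print(transformers.__version__)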

Hi @dsikka - Thanks for the info. I updated to 4.45.2, but it still raises ValueError: Unknown quantization type, got fp8 - supported types are: ['awq', 'bitsandbytes_4bit', 'bitsandbytes_8bit', 'gptq', 'aqlm', 'quanto', 'eetq', 'hqq', 'compressed-tensors', 'fbgemm_fp8', 'torchao']

Neural Magic org

@Rick10 The model is not quantized through compressed-tensors, which is why it is throwing an error.
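
You can confirm this by inspecting the checkpoint's config.json - the quant_method there is the "fp8" value from your traceback, not compressed-tensors. A minimal sketch using huggingface_hub:

from huggingface_hub import hf_hub_download
import json

# fetch just the config file, not the full checkpoint
config_path = hf_hub_download(
    repo_id="neuralmagic/Meta-Llama-3-8B-Instruct-FP8",
    filename="config.json",
)
with open(config_path) as f:
    config = json.load(f)

# prints "fp8" - the experimental AutoFP8 format transformers can't load
print(config["quantization_config"]["quant_method"])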

Neural Magic org

@Rick10 - this is an older checkpoint with an experimental format (AutoFP8)

Can you use neuralmagic/Meta-Llama-3.1-8B-Instruct-FP8-dynamic?
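
With transformers v4.45+ that checkpoint should load through the standard pathway; roughly (untested sketch):

from transformers import AutoTokenizer, AutoModelForCausalLM

# requires transformers >= 4.45 for compressed-tensors checkpoints
model_id = "neuralmagic/Meta-Llama-3.1-8B-Instruct-FP8-dynamic"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)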

Thanks, I think I can just use snapshot_download to download the model.
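
For anyone landing here, that looks roughly like this (local_dir is just an example path):

from huggingface_hub import snapshot_download

# downloads every file in the repo to a local folder;
# adjust local_dir as needed
snapshot_download(
    repo_id="neuralmagic/Meta-Llama-3-8B-Instruct-FP8",
    local_dir="./Meta-Llama-3-8B-Instruct-FP8",
)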

Rick10 changed discussion status to closed
