AWQ models with transformers pipeline
#3
by RaviNaik
Hi, is there any way to use AWQ quantized models with the transformers pipeline? I get this error when I try to use an AWQ model with pipeline:
AttributeError: 'MistralAWQForCausalLM' object has no attribute 'config'
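Roughly, the kind of call that hits this (a sketch, assuming the model was loaded with autoawq's AutoAWQForCausalLM.from_quantized rather than the exact code I ran):

from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer, pipeline

model_name = "TheBloke/Mistral-7B-Instruct-v0.1-AWQ"
model = AutoAWQForCausalLM.from_quantized(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Passing the AWQ wrapper object itself is what raises the error,
# since the wrapper has no .config attribute.
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)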
Thanks
I am having the same problem
@Lord-Goku Please use model.model instead of just model for the model argument of the pipeline method.
Ex:

from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer, pipeline

# Assumed setup: load the AWQ-quantized model with autoawq and the matching tokenizer.
model_name = "TheBloke/Mistral-7B-Instruct-v0.1-AWQ"
model = AutoAWQForCausalLM.from_quantized(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Pass the underlying transformers model (model.model), not the AWQ wrapper.
pipe = pipeline(
    "text-generation",
    model=model.model,
    tokenizer=tokenizer,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
    top_k=40,
    repetition_penalty=1.1,
)

prompt = "Tell me about AI"  # example prompt
print(pipe(prompt)[0]["generated_text"])
Yeah it works @TheBloke, tested it out with TheBloke/Mistral-7B-Instruct-v0.1-AWQ.
One more issue I noticed is that AutoAWQForCausalLM.from_quantized loads the model onto cuda:0 (device 0) by default, and there doesn't seem to be a device_map-style argument we can pass to specify the device index. Hopefully these issues will be resolved once AWQ is natively supported in Transformers.
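In the meantime, a possible workaround (not autoawq-specific, just standard CUDA behaviour) is to restrict which GPU the process can see before anything initialises CUDA, so that cuda:0 inside the library maps to the physical GPU you actually want:

import os

# Make only physical GPU 1 visible to this process; it then shows up as
# cuda:0, which is where from_quantized places the model by default.
# (Must be set before torch/awq initialise CUDA.)
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_name = "TheBloke/Mistral-7B-Instruct-v0.1-AWQ"
model = AutoAWQForCausalLM.from_quantized(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

The same effect can be had by launching the script as CUDA_VISIBLE_DEVICES=1 python script.py.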