# How to use

Until https://github.com/PanQiWei/AutoGPTQ/pull/189 is merged, install AutoGPTQ from the fork: `pip install git+https://github.com/cczhong11/AutoGPTQ`.

```
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

quantized_model_dir = "cczhong/internlm-chat-7b-4bit-gptq"

# trust_remote_code=True is needed because the model ships its own tokenizer and modeling code.
tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir, trust_remote_code=True)

# Load the 4-bit GPTQ weights onto the first GPU.
model = AutoGPTQForCausalLM.from_quantized(quantized_model_dir, device="cuda:0", trust_remote_code=True)

# "你好" means "Hello"; history starts empty for a new conversation.
response, history = model.chat(tokenizer, "你好", history=[])
print(response)
```
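
For a multi-turn conversation, the `history` returned by one `chat` call can be passed into the next. The following is a minimal sketch, not part of the original instructions: `run_conversation` is a hypothetical helper, and it assumes `model.chat` keeps returning a `(response, history)` pair as in the snippet above.

```python
from typing import Any, List, Tuple

# Assumed shape of the chat history: a list of (query, response) pairs.
History = List[Tuple[str, str]]


def run_conversation(model: Any, tokenizer: Any, prompts: List[str]) -> Tuple[List[str], History]:
    """Send prompts in order, feeding each turn's history into the next turn.

    Hypothetical helper; assumes model.chat(tokenizer, prompt, history=...)
    returns (response, updated_history).
    """
    history: History = []
    responses: List[str] = []
    for prompt in prompts:
        response, history = model.chat(tokenizer, prompt, history=history)
        responses.append(response)
    return responses, history
```

With the `model` and `tokenizer` loaded as above, it could be called as `responses, history = run_conversation(model, tokenizer, ["你好", "Please introduce yourself."])`.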