# How to use
Until https://github.com/PanQiWei/AutoGPTQ/pull/189 is merged, install AutoGPTQ from the fork first: `pip install git+https://github.com/cczhong11/AutoGPTQ`
```
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

quantized_model_dir = "cczhong/internlm-chat-7b-4bit-gptq"

# trust_remote_code=True lets the model repository's custom code be loaded.
tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir, trust_remote_code=True)
model = AutoGPTQForCausalLM.from_quantized(quantized_model_dir, device="cuda:0", trust_remote_code=True)

# "你好" means "Hello"; chat() returns the reply and the updated history.
response, history = model.chat(tokenizer, "你好", history=[])
```
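
The `chat` call above returns a `history` value alongside the response, which suggests that multi-turn use works by passing that value back into the next call. A minimal sketch of that pattern follows. `run_conversation` is a hypothetical helper, not part of AutoGPTQ or InternLM, and it assumes only that `model.chat(tokenizer, prompt, history=...)` returns a `(response, history)` pair as in the snippet above.

```python
def run_conversation(model, tokenizer, prompts):
    """Send prompts in order, feeding each returned history into the next call.

    Hypothetical helper: assumes model.chat(tokenizer, prompt, history=...)
    returns a (response, history) pair, as in the usage snippet.
    """
    history = []
    responses = []
    for prompt in prompts:
        # Pass the history from the previous turn so the model sees the context.
        response, history = model.chat(tokenizer, prompt, history=history)
        responses.append(response)
    return responses, history
```

With `model` and `tokenizer` loaded as shown earlier, `run_conversation(model, tokenizer, ["Hello", "What can you do?"])` would return the list of replies together with the final history.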