wizcat/internlm-chat-7b-v1_1-gptq

Feature Extraction

4-bit precision


wizcat committed on Nov 12, 2023

Commit 593e89f (1 parent: 2bc8f05)

Update README.md

Files changed (1)

README.md +30 -0

@@ -3,3 +3,33 @@ license: other
 license_name: internlm-license
 license_link: https://huggingface.co/internlm/internlm-chat-7b-v1_1
 ---

+This model is a GPTQ-quantized conversion of internlm-chat-7b-v1_1.<br>
+Please follow the license of the original model (https://huggingface.co/internlm/internlm-chat-7b-v1_1) when using it.<br>
+<br>
+Inference code<br>
+```python
+import time
+
+from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig
+
+# Local directory holding the GPTQ-quantized checkpoint (Windows-style relative path).
+model_path = r".\internlm-chat-7b-v1_1-gptq"
+tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
+
+# 4-bit GPTQ with the ExLlama kernels disabled.
+gptq_config = GPTQConfig(bits=4, disable_exllama=True)
+model = AutoModelForCausalLM.from_pretrained(
+    model_path,
+    device_map="auto",
+    quantization_config=gptq_config,
+    trust_remote_code=True,
+)
+model = model.eval()
+
+# Interactive chat loop; `history` carries the conversation across turns.
+history = []
+while True:
+    txt = input("msg:")
+    start_time = time.perf_counter()
+    response, history = model.chat(tokenizer, txt, history=history)
+    # Stop the timer before printing so only the model call is measured.
+    elapsed_time = time.perf_counter() - start_time
+    print(response)
+    print(f"worktime:{elapsed_time}")
+```
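
Note (not part of the commit): the inference code above times each `model.chat` call with `time.perf_counter()`. A minimal sketch of how that timing could be factored into a reusable helper is shown below. The names `timed_call` and `fake_chat` are hypothetical and do not appear in the README; `fake_chat` is only a stand-in so the sketch runs without loading the model.

```python
import time
from typing import Any, Callable, Tuple


def timed_call(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


if __name__ == "__main__":
    # Hypothetical stand-in with the same call shape as model.chat in the README:
    # it takes (tokenizer, text, history=...) and returns (response, history).
    def fake_chat(tokenizer, text, history):
        reply = text.upper()
        return reply, history + [(text, reply)]

    (response, history), seconds = timed_call(fake_chat, None, "hello", history=[])
    print(response)
    print(f"worktime:{seconds:.3f}")
```

With the real model, the call would presumably become `timed_call(model.chat, tokenizer, txt, history=history)`, which keeps the printing outside the measured interval.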