seonglae committed commit 563d486 (1 parent: 8003135)

docs: update model card get started

Files changed (1): README.md (+32 -1)
````diff
@@ -2,6 +2,37 @@
 license: mit
 tags:
 - auto-gptq
+- opt
+- gptq
+- 4bit
 ---
 
-This model should use [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ) so inference from Hugging face might not working
+This model should use [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ), so you need to load it with `auto-gptq`:
+
+```py
+from transformers import AutoTokenizer, pipeline
+from auto_gptq import AutoGPTQForCausalLM
+
+model_id = 'seonglae/opt-125m-4bit-gptq'
+tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
+# Load the GPTQ-quantized weights with auto-gptq
+model = AutoGPTQForCausalLM.from_quantized(
+    model_id,
+    trust_remote_code=True,
+    device='cuda:0',
+    use_triton=False,
+    use_safetensors=True,
+)
+
+pipe = pipeline(
+    "text-generation",
+    model=model,
+    tokenizer=tokenizer,
+    temperature=0.5,
+    top_p=0.95,
+    max_new_tokens=100,
+    repetition_penalty=1.15,
+)
+prompt = "USER: Are you AI?\nASSISTANT:"
+pipe(prompt)
+```
````
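The prompt at the end of the added snippet uses a plain `USER:`/`ASSISTANT:` single-turn chat format. A minimal, hypothetical helper (not part of the model card) for building prompts in that style:

```python
def build_prompt(user_message: str) -> str:
    """Format a single-turn prompt in the USER:/ASSISTANT: style used by the example."""
    return f"USER: {user_message}\nASSISTANT:"

# Reproduces the prompt from the snippet above
print(build_prompt("Are you AI?"))
```

The model's completion is expected to continue after the trailing `ASSISTANT:` marker.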