Update README.md
README.md
CHANGED

It is the result of quantising to 4bit using [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ).

* [4-bit GPTQ models for GPU inference](https://huggingface.co/TheBloke/CAMEL-33B-Combined-Data-GPTQ)
* [Unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/baichuan-inc/baichuan-7B)

## Experimental first GPTQ, requires latest AutoGPTQ code

This is a first quantisation of a brand new model type.

It will only work with AutoGPTQ, and only using the latest version of AutoGPTQ, compiled from source.

To merge this PR, please follow these steps to install the latest AutoGPTQ from source:

**Linux**
```
pip uninstall -y auto-gptq
git clone https://github.com/PanQiWei/AutoGPTQ
cd AutoGPTQ
GITHUB_ACTIONS=true pip install .
```

**Windows (command prompt)**
```
pip uninstall -y auto-gptq
git clone https://github.com/PanQiWei/AutoGPTQ
cd AutoGPTQ
set GITHUB_ACTIONS=true
pip install .
```
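
If the build succeeded, AutoGPTQ should now import cleanly. A quick sanity check you can run (a convenience, not part of the original instructions):

```
# Confirm the source build of auto_gptq is importable, and from where.
import auto_gptq

print("auto_gptq imported OK from:", auto_gptq.__file__)
```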

## Trust Remote Code

As this is a new model type, not yet supported by Transformers, you must run inference with Trust Remote Code set.
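
Concretely, that means passing `trust_remote_code=True` whenever you load the tokenizer or model, for example (a minimal sketch; the repo id is this model card's repo):

```
# The model's code ships with the checkpoint rather than inside
# Transformers, so Transformers must be told to trust and execute it.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("TheBloke/baichuan-7B-GPTQ", trust_remote_code=True)
```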

The example given in the README is a 1-shot categorisation:

```
Hamlet->Shakespeare\nOne Hundred Years of Solitude->
```

## How to easily download and use this model in text-generation-webui

Please make sure you're using the latest version of text-generation-webui.
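
If you'd rather fetch the files manually, a minimal sketch using `huggingface_hub` (standard library usage, not from the original instructions):

```
# Download all model files, then point text-generation-webui's
# models directory at the result.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="TheBloke/baichuan-7B-GPTQ")
print("Downloaded to:", local_dir)
```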

## How to use this GPTQ model from Python code

First make sure you have the latest [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ) installed from source as mentioned above.

Then try the following example code:

```
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_name_or_path = 'TheBloke/baichuan-7B-GPTQ'
# Or you can clone the model locally and reference it on disk, eg with:
# model_name_or_path = "/path/to/TheBloke_baichuan-7B-GPTQ"

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, trust_remote_code=True)

# Standard AutoGPTQ loading; the generation settings below are illustrative.
model = AutoGPTQForCausalLM.from_quantized(model_name_or_path,
    use_safetensors=True, trust_remote_code=True, device="cuda:0")

prompt = "Hamlet->Shakespeare\nOne Hundred Years of Solitude->"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
pred = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
```

## Provided files

**gptq_model-4bit-128g.safetensors**

This will currently only work with the latest [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ), compiled from source.

* `gptq_model-4bit-128g.safetensors`
* Works only with the latest AutoGPTQ, compiled from source.
* Requires `trust_remote_code`.
* Works with text-generation-webui, but not yet with one-click-installers unless you manually re-compile AutoGPTQ.
* Parameters: Groupsize = 128. Act Order / desc_act = False.
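
For reference, those parameters map onto AutoGPTQ's quantisation config roughly as follows (a sketch of the standard `BaseQuantizeConfig`, not taken from the original quantisation script):

```
# The parameters listed above, expressed as an AutoGPTQ quantise config.
from auto_gptq import BaseQuantizeConfig

quantize_config = BaseQuantizeConfig(
    bits=4,          # 4-bit quantisation
    group_size=128,  # Groupsize = 128
    desc_act=False,  # Act Order / desc_act = False
)
```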