An 8-bit version of the model
#12, opened by varun500
- README.md +9 -41
- quantize_config.json +0 -5
- tokenizer.json +0 -0
- tokenizer_config.json +1 -1
- vicuna-13B-1.1-GPTQ-4bit-128g.latest.safetensors +3 -0
README.md
CHANGED
@@ -1,21 +1,7 @@
|
|
1 |
---
|
2 |
license: other
|
3 |
inference: false
|
4 |
-
pipeline_tag: conversational
|
5 |
---
|
6 |
-
<!-- header start -->
|
7 |
-
<div style="width: 100%;">
|
8 |
-
<img src="https://i.imgur.com/EBdldam.jpg" alt="TheBlokeAI" style="width: 100%; min-width: 400px; display: block; margin: auto;">
|
9 |
-
</div>
|
10 |
-
<div style="display: flex; justify-content: space-between; width: 100%;">
|
11 |
-
<div style="display: flex; flex-direction: column; align-items: flex-start;">
|
12 |
-
<p><a href="https://discord.gg/Jq4vkcDakD">Chat & support: my new Discord server</a></p>
|
13 |
-
</div>
|
14 |
-
<div style="display: flex; flex-direction: column; align-items: flex-end;">
|
15 |
-
<p><a href="https://www.patreon.com/TheBlokeAI">Want to contribute? TheBloke's Patreon page</a></p>
|
16 |
-
</div>
|
17 |
-
</div>
|
18 |
-
<!-- header end -->
|
19 |
# Vicuna 13B 1.1 GPTQ 4bit 128g
|
20 |
|
21 |
This is a 4-bit GPTQ version of the [Vicuna 13B 1.1 model](https://huggingface.co/lmsys/vicuna-13b-delta-v1.1).
|
@@ -35,12 +21,18 @@ I have the following Vicuna 1.1 repositories available:
|
|
35 |
**13B models:**
|
36 |
* [Unquantized 13B 1.1 model for GPU - HF format](https://huggingface.co/TheBloke/vicuna-13B-1.1-HF)
|
37 |
* [GPTQ quantized 4bit 13B 1.1 for GPU - `safetensors` and `pt` formats](https://huggingface.co/TheBloke/vicuna-13B-1.1-GPTQ-4bit-128g)
|
38 |
-
|
39 |
-
|
40 |
**7B models:**
|
41 |
* [Unquantized 7B 1.1 model for GPU - HF format](https://huggingface.co/TheBloke/vicuna-7B-1.1-HF)
|
42 |
* [GPTQ quantized 4bit 7B 1.1 for GPU - `safetensors` and `pt` formats](https://huggingface.co/TheBloke/vicuna-7B-1.1-GPTQ-4bit-128g)
|
43 |
-
|
44 |
|
45 |
## How to easily download and use this model in text-generation-webui
|
46 |
|
@@ -122,30 +114,6 @@ Then link that into `text-generation-webui/repositories` as described above.
|
|
122 |
|
123 |
Or just use `vicuna-13B-1.1-GPTQ-4bit-128g.compat.no-act-order.pt` as mentioned above, which should work without any upgrades to text-generation-webui.
|
124 |
|
125 |
-
<!-- footer start -->
|
126 |
-
## Discord
|
127 |
-
|
128 |
-
For further support, and discussions on these models and AI in general, join us at:
|
129 |
-
|
130 |
-
[TheBloke AI's Discord server](https://discord.gg/Jq4vkcDakD)
|
131 |
-
|
132 |
-
## Thanks, and how to contribute.
|
133 |
-
|
134 |
-
Thanks to the [chirper.ai](https://chirper.ai) team!
|
135 |
-
|
136 |
-
I've had a lot of people ask if they can contribute. I enjoy providing models and helping people, and would love to be able to spend even more time doing it, as well as expanding into new projects like fine tuning/training.
|
137 |
-
|
138 |
-
If you're able and willing to contribute it will be most gratefully received and will help me to keep providing more models, and to start work on new AI projects.
|
139 |
-
|
140 |
-
Donaters will get priority support on any and all AI/LLM/model questions and requests, access to a private Discord room, plus other benefits.
|
141 |
-
|
142 |
-
* Patreon: https://patreon.com/TheBlokeAI
|
143 |
-
* Ko-Fi: https://ko-fi.com/TheBlokeAI
|
144 |
-
|
145 |
-
**Patreon special mentions**: Aemon Algiz, Dmitriy Samsonov, Nathan LeClaire, Trenton Dambrowitz, Mano Prime, David Flickinger, vamX, Nikolai Manek, senxiiz, Khalefa Al-Ahmad, Illia Dulskyi, Jonathan Leane, Talal Aujan, V. Lukas, Joseph William Delisle, Pyrater, Oscar Rangel, Lone Striker, Luke Pendergrass, Eugene Pentland, Sebastain Graf, Johann-Peter Hartman.
|
146 |
-
|
147 |
-
Thank you to all my generous patrons and donaters!
|
148 |
-
<!-- footer end -->
|
149 |
# Vicuna Model Card
|
150 |
|
151 |
## Model details
|
|
|
1 |
---
|
2 |
license: other
|
3 |
inference: false
|
|
|
4 |
---
|
5 |
# Vicuna 13B 1.1 GPTQ 4bit 128g
|
6 |
|
7 |
This is a 4-bit GPTQ version of the [Vicuna 13B 1.1 model](https://huggingface.co/lmsys/vicuna-13b-delta-v1.1).
|
|
|
21 |
**13B models:**
|
22 |
* [Unquantized 13B 1.1 model for GPU - HF format](https://huggingface.co/TheBloke/vicuna-13B-1.1-HF)
|
23 |
* [GPTQ quantized 4bit 13B 1.1 for GPU - `safetensors` and `pt` formats](https://huggingface.co/TheBloke/vicuna-13B-1.1-GPTQ-4bit-128g)
|
24 |
+
|
|
|
25 |
**7B models:**
|
26 |
* [Unquantized 7B 1.1 model for GPU - HF format](https://huggingface.co/TheBloke/vicuna-7B-1.1-HF)
|
27 |
* [GPTQ quantized 4bit 7B 1.1 for GPU - `safetensors` and `pt` formats](https://huggingface.co/TheBloke/vicuna-7B-1.1-GPTQ-4bit-128g)
|
28 |
+
|
29 |
+
**GGMLs for CPU inference**
|
30 |
+
|
31 |
+
I removed the GGMLs I originally made for Vicuna 1.1 because they were directly converted GPTQ -> GGML and this seemed to give poor results
|
32 |
+
|
33 |
+
Instead I recommend you use eachadea's GGMLs:
|
34 |
+
* [eachadea's Vicuna 13B 1.1 GGML format for `llama.cpp`](https://huggingface.co/eachadea/ggml-vicuna-13b-1.1)
|
35 |
+
* [eachadea's Vicuna 7B 1.1 GGML format for `llama.cpp`](https://huggingface.co/eachadea/ggml-vicuna-7b-1.1)
|
36 |
|
37 |
## How to easily download and use this model in text-generation-webui
|
38 |
|
|
|
114 |
|
115 |
Or just use `vicuna-13B-1.1-GPTQ-4bit-128g.compat.no-act-order.pt` as mentioned above, which should work without any upgrades to text-generation-webui.
|
116 |
|
117 |
# Vicuna Model Card
|
118 |
|
119 |
## Model details
|
quantize_config.json
DELETED
@@ -1,5 +0,0 @@
|
|
1 |
-
{
|
2 |
-
"bits": 4,
|
3 |
-
"desc_act": false,
|
4 |
-
"group_size": 128
|
5 |
-
}
|
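For context on the `quantize_config.json` shown above, here is a minimal sketch of what its three fields imply for storage cost. It assumes, and this is not stated in the PR, that each group of `group_size` weights stores one 16-bit scale and one zero point packed at `bits` bits; the helper name `effective_bits_per_weight` is hypothetical.

```python
import json

# The three fields from the quantize_config.json in this diff.
QUANTIZE_CONFIG = '{"bits": 4, "desc_act": false, "group_size": 128}'


def effective_bits_per_weight(config, scale_bits=16):
    """Estimate the average storage per weight, including per-group metadata.

    Assumption (not from the PR): every group of `group_size` weights carries
    one scale of `scale_bits` bits and one zero point of `bits` bits.
    """
    bits = config["bits"]
    group_size = config["group_size"]
    overhead = (scale_bits + bits) / group_size  # metadata bits spread over the group
    return bits + overhead


config = json.loads(QUANTIZE_CONFIG)
print(effective_bits_per_weight(config))  # 4.15625
```

Under these assumptions a group size of 128 adds roughly 0.16 bits per weight on top of the nominal 4 bits. Layers that are left unquantized are not covered by this estimate.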
tokenizer.json
DELETED
The diff for this file is too large to render.
|
|
tokenizer_config.json
CHANGED
@@ -30,4 +30,4 @@
|
|
30 |
"rstrip": false,
|
31 |
"single_word": false
|
32 |
}
|
33 |
-
}
|
|
|
30 |
"rstrip": false,
|
31 |
"single_word": false
|
32 |
}
|
33 |
+
}
|
vicuna-13B-1.1-GPTQ-4bit-128g.latest.safetensors
ADDED
@@ -0,0 +1,3 @@
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:e47a7a68ed4230004e08e83730247625a55cd7493cebadc7be9abf9c3a7275ea
|
3 |
+
size 7255159218
|
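The three added lines above are a Git LFS pointer, not the weights themselves: a spec version, a SHA-256 object ID, and the file size in bytes. A minimal sketch of how a downloaded copy could be checked against that pointer follows; the function names `parse_lfs_pointer` and `verify_download` are hypothetical helpers, not part of any library.

```python
import hashlib
from pathlib import Path

# Pointer contents copied from the diff above.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:e47a7a68ed4230004e08e83730247625a55cd7493cebadc7be9abf9c3a7275ea
size 7255159218
"""


def parse_lfs_pointer(text):
    """Split a Git LFS pointer into its version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def verify_download(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` matches the pointer's size and digest."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False  # cheap check first; avoids hashing a truncated file
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Read in chunks so a multi-gigabyte file is never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
```

After downloading, `verify_download("vicuna-13B-1.1-GPTQ-4bit-128g.latest.safetensors", pointer)` would be expected to return `True` for an intact file, assuming the file sits in the current directory.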