Initial GPTQ model commit
README.md CHANGED
````diff
@@ -23,20 +23,11 @@ These files are GPTQ 4bit model files for [WizardLM's WizardLM 30B v1.0](https:/
 
 It is the result of quantising to 4bit using [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa).
 
-##
+## Other repositories available
 
 * [4-bit GPTQ models for GPU inference](https://huggingface.co/TheBloke/WizardLM-30B-GPTQ)
 * [4-bit, 5-bit and 8-bit GGML models for CPU(+GPU) inference](https://huggingface.co/TheBloke/WizardLM-30B-GGML)
-* [Unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/
-
-## Prompt template
-
-```
-A chat between a curious user and an artificial intelligence assistant.
-The assistant gives helpful, detailed, and polite answers to the user's questions.
-USER: prompt goes here
-ASSISTANT:
-```
+* [Unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/WizardLM/WizardLM-30B-V1.0)
 
 ## How to easily download and use this model in text-generation-webui
 
````
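For reference, the prompt template that appears in the removed lines above is the Vicuna-style format. A minimal sketch of building that prompt string in Python is below; the `build_prompt` helper and the example question are illustrative, not part of the repository.

```python
# Illustrative helper for the Vicuna-style prompt template shown in the
# diff above. Only the template text itself comes from the README.
def build_prompt(user_message: str) -> str:
    """Wrap a user message in the documented USER/ASSISTANT template."""
    return (
        "A chat between a curious user and an artificial intelligence assistant. "
        "The assistant gives helpful, detailed, and polite answers to the user's questions.\n"
        f"USER: {user_message}\nASSISTANT:"
    )

print(build_prompt("What is GPTQ quantisation?"))
```

The model's reply is whatever the model generates after the trailing `ASSISTANT:`, so generation should stop at the next `USER:` turn or an end-of-sequence token.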
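The README's own usage section covers text-generation-webui. For loading the 4-bit GPTQ files directly from Python, a sketch using the AutoGPTQ library follows; note this is an assumption on my part (the README names GPTQ-for-LLaMa as the quantisation tool, not a loading method), and the device string and safetensors flag depend on your hardware and on which file format the repo actually ships.

```python
# A sketch, not the repo's documented method: load the quantised model
# with AutoGPTQ and run one generation using the prompt format above.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_id = "TheBloke/WizardLM-30B-GPTQ"

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
# use_safetensors=True assumes the repo provides a .safetensors file.
model = AutoGPTQForCausalLM.from_quantized(
    model_id, device="cuda:0", use_safetensors=True
)

prompt = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions.\n"
    "USER: What is GPTQ quantisation?\nASSISTANT:"
)
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda:0")
output = model.generate(input_ids=input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```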