Experimental model: a LimaRP QLoRA trained at 10k context length (longer than the longest LimaRP sample when tokenized with Mistral's tokenizer) on [mistralai/Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1) using [Charles Goddard](https://huggingface.co/chargoddard)'s ZLoss and Megablocks-based fork of transformers, then fused into [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) at 0.5 weight.
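
The fused weights in this repo are what you want for inference. Purely as an illustration of what "fused at 0.5 weight" means, here is a minimal sketch (not the author's actual merge script) of a half-strength merge of the adapter linked below, assuming a plain linear scaling of the LoRA delta; the output path is hypothetical:

```
# Illustrative sketch only: halve the LoRA delta (W += 0.5 * scaling * B @ A),
# then merge it into the Instruct base and save the result.
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x7B-Instruct-v0.1",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "Doctor-Shotgun/limarp-zloss-mixtral-8x7b-qlora")

# Scaling lora_B by 0.5 scales the whole delta, since delta = scaling * B @ A.
for module in model.modules():
    if hasattr(module, "lora_B"):
        for linear in module.lora_B.values():
            linear.weight.data *= 0.5

merged = model.merge_and_unload()
merged.save_pretrained("./mixtral-limarp-zloss-0.5")  # hypothetical output path
```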

My current generation settings are:
```
Temperature: 1.25
Min-p: 0.05
Repetition penalty: 1.05
Repetition penalty range: 1024
```
These settings seem to avoid the Mixtral looping pitfalls for me so far. Play around with them and see what works well for you.
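
As a concrete example, here is a minimal sketch (illustrative, not an official snippet) of applying these settings via llama-cpp-python with one of TheBloke's GGUF quants linked below; the local filename, context size, and generation length are placeholders:

```
# Illustrative sketch: the sampler settings above, applied with llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="mixtral-8x7b-instruct-v0.1-limarp-zloss.Q4_K_M.gguf",  # placeholder filename
    n_ctx=10240,              # the QLoRA was trained at 10k context
    last_n_tokens_size=1024,  # repetition penalty range
)

prompt = "### Instruction:\n...\n### Response:\nCharacter:"  # see Usage below for the full format

out = llm(
    prompt,
    temperature=1.25,
    min_p=0.05,
    repeat_penalty=1.05,
    max_tokens=300,  # placeholder generation length
)
print(out["choices"][0]["text"])
```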

[Peft Adapter](https://huggingface.co/Doctor-Shotgun/limarp-zloss-mixtral-8x7b-qlora)

Quants courtesy of TheBloke:
- [GPTQ](https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-GPTQ)
- [GGUF](https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-GGUF)
- [AWQ](https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-AWQ)

Exl2 Quants courtesy of LoneStriker:
- [2.4bpw](https://huggingface.co/LoneStriker/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-2.4bpw-h6-exl2)
- [3.0bpw](https://huggingface.co/LoneStriker/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-3.0bpw-h6-exl2)
- [3.5bpw](https://huggingface.co/LoneStriker/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-3.5bpw-h6-exl2)
- [3.75bpw](https://huggingface.co/LoneStriker/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-3.75bpw-h6-exl2)
- [4.0bpw](https://huggingface.co/LoneStriker/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-4.0bpw-h6-exl2)
- [5.0bpw](https://huggingface.co/LoneStriker/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-5.0bpw-h6-exl2)
- [6.0bpw](https://huggingface.co/LoneStriker/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-6.0bpw-h6-exl2)

## Usage:
The intended prompt format is the Alpaca instruction format of LimaRP v3:
```
### Instruction:
Character's Persona: {bot character description}

User's Persona: {user character description}

Scenario: {what happens in the story}

Play the role of Character. Taking the above information into consideration, you must engage in a roleplaying chat with User below this line. Do not write dialogues and narration for User.

### Response:
Character: {utterance}

### Input:
User: {utterance}

### Response:
Character: {utterance}

(etc.)
```
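
If you are templating the prompt in code rather than through a frontend, a tiny hypothetical helper (not part of this repo) that assembles the format above might look like:

```
# Hypothetical helper: builds the LimaRP v3 Alpaca prompt shown above
# from personas, a scenario, and a running chat history.
def build_prompt(bot_persona, user_persona, scenario, turns):
    header = (
        "### Instruction:\n"
        f"Character's Persona: {bot_persona}\n\n"
        f"User's Persona: {user_persona}\n\n"
        f"Scenario: {scenario}\n\n"
        "Play the role of Character. Taking the above information into "
        "consideration, you must engage in a roleplaying chat with User "
        "below this line. Do not write dialogues and narration for User.\n"
    )
    body = ""
    for speaker, text in turns:  # speaker is "user" or "character"
        if speaker == "user":
            body += f"\n### Input:\nUser: {text}\n"
        else:
            body += f"\n### Response:\nCharacter: {text}\n"
    # End with an open response tag so the model continues as Character.
    return header + body + "\n### Response:\nCharacter:"
```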

My current SillyTavern templates have been uploaded to a [folder](https://huggingface.co/Doctor-Shotgun/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss/tree/main/ST%20Templates) in this repo.

## Message length control
Due to the inclusion of LimaRP v3, it is possible to append a length modifier to the response instruction sequence, like this: