Update README.md
README.md
CHANGED
@@ -84,21 +84,23 @@ One version is not stronger than the other, they are different and result in dif
 
 This chart shows the order in terms of "BPW" for each quant with "IQ1_S" with the least, and "Q8_0" with the most:
 
+<small>
 <PRE>
 IQ1_S | IQ1_M
 
-IQ2_XXS | IQ2_XS
+IQ2_XXS | IQ2_XS | Q2_K_S | IQ2_S | Q2_K | IQ2_M
 
-IQ3_XXS | Q3_K_S
+IQ3_XXS | Q3_K_S| IQ3_XS | IQ3_S | IQ3_M | Q3_K_M | Q3_K_L
 
-Q4_K_S | IQ4_XS
+Q4_K_S | IQ4_XS | IQ4_NL | Q4_K_M
 
 Q5_K_S | Q5_K_M
 
 Q6_K
 
-Q8_0
+Q8_0
 </pre>
+</small>
 
 More BPW mean better quality, but higher VRAM requirements (and larger file size) and lower tokens per second.
 The larger the model in terms of parameters the lower the size of quant you can run with less quality losses.
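
The README text in this hunk says that more BPW (bits per weight) means a larger file and higher VRAM use. As a rough sketch of that arithmetic, weight storage scales approximately as parameter count × BPW / 8 bytes. The function name `estimate_size_gb`, the 7-billion parameter count, and the BPW figures below are hypothetical values chosen for illustration; they are not taken from the README and do not correspond to any specific quant. The estimate ignores file metadata, the KV cache, and runtime overhead.

```python
def estimate_size_gb(n_params: float, bpw: float) -> float:
    """Approximate size of quantized weights in gigabytes (1 GB = 1e9 bytes).

    Assumes every weight is stored at the same average bits-per-weight,
    which is a simplification: real quant formats mix precisions.
    """
    if n_params <= 0 or bpw <= 0:
        raise ValueError("n_params and bpw must be positive")
    total_bits = n_params * bpw
    return total_bits / 8 / 1e9


if __name__ == "__main__":
    # Hypothetical model size and BPW values, for illustration only.
    n_params = 7e9
    for bpw in (2.0, 4.0, 8.0):
        size = estimate_size_gb(n_params, bpw)
        print(f"{bpw:.1f} BPW -> about {size:.2f} GB of weights")
```

Doubling the BPW doubles the estimated weight size, which is the file-size and VRAM side of the trade-off the README describes.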