Update README.md
README.md
CHANGED
@@ -18,7 +18,7 @@ tags:
 
 # Quant Infos
 
-- 128k context is not fully supported by llama.cpp yet, but in my testing this model works fine up to 50k+ already
+- The 128k context is not fully supported by llama.cpp yet, but in my testing this model works fine up to 50k+ already
 - quants done with an importance matrix for improved quantization loss
 - quantized & generated imatrix from the f32 as f16 is inaccurate when converting from bf16
 - K & IQ quants in basically all variants from Q6_K down to IQ1_S
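The workflow the bullets describe (convert to f32, build an importance matrix from it, then quantize) can be sketched with llama.cpp's command-line tools roughly as follows; the model directory, calibration file, and output file names are placeholders, not taken from this repo:

```shell
# Convert the original weights directly to f32 GGUF
# (per the note above, converting bf16 via f16 loses precision).
python convert_hf_to_gguf.py ./model-dir --outtype f32 --outfile model-f32.gguf

# Generate the importance matrix from the f32 model,
# using some calibration text (placeholder file name).
./llama-imatrix -m model-f32.gguf -f calibration.txt -o imatrix.dat

# Quantize with the imatrix; repeat for each K/IQ variant,
# e.g. from Q6_K down to IQ1_S as listed above.
./llama-quantize --imatrix imatrix.dat model-f32.gguf model-IQ1_S.gguf IQ1_S
```

Which calibration data and exact flags were used here is not stated in the diff; the above is only the generic llama.cpp imatrix quantization flow.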