|
GGUF quants for: https://huggingface.co/alchemonaut/QuartetAnemoi-70B-t0.0001 |
|
|
|
Available: Q3_K_M, IQ3_XXS. |
|
On the way: IQ2_XS |
|
|
|
I recommend you folks try this model, because it's quite an efficient merge of Miqu, WinterGoddess, AuroraNights, and XWin. |
|
|
|
Miqu's RoPE theta of 1,000,000, which is what gives it the 32k context, is functional up to 16k according to my tests, and probably beyond (I need a smaller quant to test that, and it's on the way). |
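
For a rough intuition of why a larger RoPE theta helps with long context, here is a minimal sketch comparing the rotation wavelengths for a base of 10,000 (the usual Llama 2 default) and 1,000,000. The head dimension of 128 and the helper name `rope_wavelengths` are assumptions made for illustration; they are not read from this model's config.

```python
import math

HEAD_DIM = 128  # assumed per-head dimension (Llama-2-70B style)


def rope_wavelengths(base, head_dim=HEAD_DIM):
    """Wavelength, in tokens, of each rotated dimension pair.

    RoPE rotates pair i with angular frequency base ** (-2 * i / head_dim),
    so its wavelength is 2 * pi * base ** (2 * i / head_dim).
    """
    return [2 * math.pi * base ** (2 * i / head_dim) for i in range(head_dim // 2)]


for base in (10_000, 1_000_000):
    longest = rope_wavelengths(base)[-1]
    print(f"base={base:>9,}: longest wavelength ~ {longest:,.0f} tokens")
```

The slowest-rotating pairs have much longer wavelengths with the larger base, which is the usual explanation for why positions far apart stay distinguishable. It says nothing about actual quality at a given context length, which still has to be tested.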
|
|
|
To use it with a quantized KV cache for a higher context, here's a KoboldCPP Frankenstein version with several different KV cache quantization levels to choose from: |
|
https://github.com/Nexesenex/kobold.cpp/releases |
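
To get an idea of how much memory a quantized KV cache can save, here is a back-of-the-envelope sketch. It assumes a Llama-2-70B-style layout (80 layers, 8 KV heads with GQA, head dimension 128) and uses the ggml `f16`, `q8_0` and `q4_0` block sizes purely as examples; the cache types actually offered by the build linked above may differ, and `kv_cache_gib` is a hypothetical helper.

```python
# Rough KV-cache size estimate. Architecture numbers are assumptions
# (Llama-2-70B style), not values read from the GGUF file.
N_LAYERS = 80
N_KV_HEADS = 8
HEAD_DIM = 128

# Average bytes per element for some ggml types (blocks of 32 values).
BYTES_PER_ELEM = {
    "f16": 2.0,
    "q8_0": 34 / 32,  # 32 int8 values + one fp16 scale
    "q4_0": 18 / 32,  # 32 4-bit values + one fp16 scale
}


def kv_cache_gib(n_ctx, cache_type):
    """Approximate size in GiB of the K and V caches for n_ctx tokens."""
    elems_per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM  # K and V
    return n_ctx * elems_per_token * BYTES_PER_ELEM[cache_type] / 2**30


for n_ctx in (16384, 32768):
    for cache_type in BYTES_PER_ELEM:
        size = kv_cache_gib(n_ctx, cache_type)
        print(f"{n_ctx:>6} tokens, {cache_type:>4}: {size:.2f} GiB")
```

Under these assumptions, an f16 cache at 32k context takes about 10 GiB, against roughly 5.3 GiB for `q8_0`. This only counts the cache itself, not the model weights or compute buffers.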