InferenceIllusionist committed
Commit 61b79de
Parent(s): 9e8a02d
Update README.md

README.md CHANGED
@@ -15,10 +15,9 @@ After testing the new importance matrix quants for 11b and 8x7b models and being
 
 <b>❗❗Need a different quantization/model? Please open a community post and I'll get back to you - thanks ❗❗ </b>
 
-Newer quants (
+<i>UPDATE 3/4/24: Newer quants ([IQ4_XS](https://github.com/ggerganov/llama.cpp/pull/5747), IQ2_S, etc) are confirmed working in Koboldcpp as of version <b>[1.60](https://github.com/LostRuins/koboldcpp/releases/tag/v1.60)</b> - if you run into any issues kindly let me know.</i>
 
-
-This should provide a significant speed boost even if you are offloading to CPU.
+IQ3_S has been generated after PR [#5829](https://github.com/ggerganov/llama.cpp/pull/5829) was merged. This should provide a significant speed boost even if you are offloading to CPU.
 
 (Credits to [TeeZee](https://huggingface.co/TeeZee/) for the original model and [ikawrakow](https://github.com/ikawrakow) for the stellar work on IQ quants)
 