Update README.md
README.md
CHANGED
@@ -18,7 +18,7 @@ tags:
 
 # Quant Infos
 
-- 128k context is not fully supported by llama.cpp yet, but in my testing this model works fine up to 50k+ already
+- The 128k context is not fully supported by llama.cpp yet, but in my testing this model works fine up to 50k+ already
 - quants done with an importance matrix for improved quantization loss
 - quantized & generated imatrix from the f32 as f16 is inaccurate when converting from bf16
 - K & IQ quants in basically all variants from Q6_K down to IQ1_S
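The workflow the bullets describe (convert to f32, build an importance matrix from it, then quantize) can be sketched with llama.cpp's command-line tools roughly as follows; the model directory, calibration file, and output file names are placeholders, not taken from this repo:

```shell
# Convert the original weights directly to f32 GGUF
# (per the note above, converting bf16 via f16 loses precision).
python convert_hf_to_gguf.py ./model-dir --outtype f32 --outfile model-f32.gguf

# Generate the importance matrix from the f32 model,
# using some calibration text (placeholder file name).
./llama-imatrix -m model-f32.gguf -f calibration.txt -o imatrix.dat

# Quantize with the imatrix; repeat for each K/IQ variant,
# e.g. from Q6_K down to IQ1_S as listed above.
./llama-quantize --imatrix imatrix.dat model-f32.gguf model-IQ1_S.gguf IQ1_S
```

Which calibration data and exact flags were used here is not stated in the diff; the above is only the generic llama.cpp imatrix quantization flow.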