Thank you for providing the imatrix file

#1
by froggeric - opened

Thank you for doing such great work on producing those files. I especially appreciate you uploading your imatrix file as well, which makes it possible for anyone (impatient) to quantise the original file themselves.

I will be evaluating both the q4_k_s and q5_k_s variants. Based on my previous tests with imatrix quants, I found q4_k_s benefits a lot from it. However, to be honest, the smaller quants, although a nice technological achievement, lose too much smartness to be useful to me.

I'd love to hear about your results. And yes, the q4 quants are generally very good, with very little loss, while q3 or even iq3 degrades quickly. Although I admit I often run iq3 models when they just fit into my setup's VRAM. In my experience, the model itself makes a far bigger difference than q4 vs. q3 quants (and I think the larger the model, the better the lower quants work, almost certainly because the optimum for these larger models hasn't been reached yet).

That's why I generally try to provide all (or most) quants, so people can choose and possibly run some models they otherwise simply couldn't run.
