Missing info
Hey, thanks for using the new, better quants on this model AND providing the imat :)
However, you don't say how the imat was generated: on which dataset, with how many chunks, etc.
Also, it would be nice to know whether you used q8_0 or full f16 for that process.
Also, more iquants were added very recently that would be very interesting to me. I am specifically looking at replacing q3_k_m with iq3_m.
See https://github.com/ggerganov/llama.cpp/pull/5676 for iq3_s and iq3_m.
I'm currently in the process of adding some more quants to existing repos, but my pipeline is very deep, so it can take a long time (1-2 weeks). All my recent quants already have iq3_s/m, and most of those are in the process of being uploaded.
Unless otherwise stated, the imatrix for each model was made using 164k semi-random English-only tokens from a non-public data set, usually from a Q8 quant.
Thanks for clarifying :)