k-quants?

#1
by speedorama - opened

After llama.cpp PR #2148 (https://github.com/ggerganov/llama.cpp/pull/2148), k-quants should be possible on this model.

Yes, they are. I'm making k-quants for 32001-vocab models now, such as WizardLM 13B v1.1. I just haven't gone back to add them to models I've already uploaded.

I'll try to do it soon.

I tried making my own k-quants from the f16 version of this model that you provided, but when I quantize it to Q5_K_S, the resulting model's output quickly becomes incoherent.
