GPTQ/AWQ quant that is runnable in vLLM?
#4
by Light4Bear
@LoneStriker Can you please make a GPTQ or AWQ 4-bit 128g quant of this?
Unfortunately, I do not believe my machines have enough resources to generate GPTQ or AWQ versions of this model. If I get access to a bigger box, I'll add these to the list.
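For reference, here is a minimal sketch of how a 4-bit, group-size-128 GPTQ quant of this model might be produced with AutoGPTQ. The source repo name, output directory, and single calibration sample are placeholders (a real run needs a few hundred calibration samples and a large GPU); this is not the exact recipe anyone in this thread used.

```python
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
from transformers import AutoTokenizer

base_model = "abacusai/Liberated-Qwen1.5-72B"          # assumed source repo
out_dir = "Liberated-Qwen1.5-72B-GPTQ-4bit-128g"        # hypothetical output dir

tokenizer = AutoTokenizer.from_pretrained(base_model)

# 4-bit weights with group size 128, as requested above.
quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)

# Calibration data: a single toy sample for brevity; use hundreds in practice.
examples = [tokenizer("vLLM is a fast inference engine for large language models.")]

model = AutoGPTQForCausalLM.from_pretrained(base_model, quantize_config)
model.quantize(examples)           # quantizes layer by layer; this is the VRAM-hungry step
model.save_quantized(out_dir, use_safetensors=True)
tokenizer.save_pretrained(out_dir)
```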
@LoneStriker I tried to do a GPTQ quant myself, but my 22 GB VRAM card came a few GB short and OOMed at layer 52 of 80, so a 24 GB card might be just enough. Anyway, @titan087 has made one: https://huggingface.co/titan087/Liberated-Qwen1.5-72B-4bit. Thanks!
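To answer the title question, a minimal sketch of loading a quant like the linked one in vLLM. The tensor-parallel and context-length settings are assumptions to fit a 72B 4-bit checkpoint in VRAM; vLLM normally detects the quantization method from the checkpoint's config.

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="titan087/Liberated-Qwen1.5-72B-4bit",
    # quantization="gptq" (or "awq") can be passed explicitly if auto-detection fails
    tensor_parallel_size=2,   # assumption: split the 72B quant across two GPUs
    max_model_len=4096,       # assumption: cap context so the KV cache fits
)

outputs = llm.generate(
    ["Explain what group size 128 means in GPTQ quantization."],
    SamplingParams(temperature=0.7, max_tokens=128),
)
print(outputs[0].outputs[0].text)
```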