May I ask why the GPTQ version is slow?
#20, opened by lynngao815
Thank you very much for your work on this GPTQ version! It makes the model more convenient to run on a consumer GPU and more affordable to fine-tune. However, I am a little bit confused about why the 4-bit version is slower. I am not very familiar with the computer science fundamentals, but isn't lower precision supposed to be faster? I would really appreciate it if someone could explain this to me!