May I ask why the GPTQ version is slow?

#20

by lynngao815 - opened Jul 31, 2023

Thank you very much for your work on this GPTQ version! It makes the model more convenient to use on a consumer GPU and more affordable to fine-tune. However, I am a little bit confused about why the 4-bit version is slower. I am not very familiar with the computer science fundamentals, but isn't lower precision supposed to be faster? I would really appreciate it if someone could explain this to me!
