Want to ask a general question
Hi, can I ask a general question? What is the main difference between GGUF models and their original models? What are the pros and cons?
Pros - the GGUF versions are smaller than the original models, so they fit in less VRAM.
Cons - speed might be slower in some cases, and the lower you go in the Q (quantization) level, the lower the quality.
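To put rough numbers on the size difference, here is a minimal Python sketch that estimates the weight size for a few quantization types. The bits-per-weight figures follow from the block layouts of the classic GGUF types (for example, Q8_0 stores 32 int8 weights plus one 16-bit scale, so 8.5 bits per weight). The 12 billion parameter count and the function name `estimate_size_gb` are assumptions for illustration only, not figures for any model in this repository, and real files differ somewhat because of metadata and tensors kept at higher precision.

```python
# Approximate bits per weight, derived from the block layouts of these GGUF types.
BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q8_0": 8.5,   # 32 x int8 + one fp16 scale per block
    "Q5_0": 5.5,   # 32 x 5-bit + one fp16 scale per block
    "Q4_0": 4.5,   # 32 x 4-bit + one fp16 scale per block
}


def estimate_size_gb(num_params: float, quant: str) -> float:
    """Rough weight size in GB (10^9 bytes); ignores metadata and unquantized tensors."""
    return num_params * BITS_PER_WEIGHT[quant] / 8 / 1e9


if __name__ == "__main__":
    params = 12e9  # hypothetical parameter count, for illustration only
    for quant in BITS_PER_WEIGHT:
        print(f"{quant}: ~{estimate_size_gb(params, quant):.2f} GB")
```

This only covers the weights; actual VRAM use during generation will be higher because of activations and any text encoders or VAE loaded alongside the model.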
And GGUF vs TURBO ones?
Different implementation. The turbo model will generate an image in 5-8 steps, but the quality will be lower. Overall inference will be faster.
Note there are GGUF Turbo models too (as in this repository).
GGUF will be slower overall compared to an NF4 checkpoint (which includes the CLIP encoders) or the original checkpoint.