What about the GGUF quants?
#1 by BernardH - opened
I was surprised to see GGUF quants from 7 months ago, considering that llama.cpp support for T5 only just landed. Are these supposed to work with llama.cpp?
Are there any evaluations of the performance loss incurred by quantization?
Thanks for the models!
Best regards