Lower quants
#1, opened by Desm0nt
Hello.
Is there any chance of a 2.25 bpw quant? The 2.4 bpw one is too large to leave room for RoPE scaling on a single 24 GB GPU, even with cache_4bit.