Part of the "Mid-range models exl2 quants" collection: EXL2 quants of mid-range (20-40B) LLM models, usually around 4-5 bpw.
This is a 4.5 bpw EXL2 quant of TheDrummer/Star-Command-R-32B-v1.
This quant was made using exllamav2 0.2.0 with the default calibration dataset.
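For reference, a quant like this is typically produced with exllamav2's `convert.py` script; the sketch below uses placeholder paths, and omitting `-c` falls back to the default calibration dataset mentioned above.

```shell
# Convert an FP16 model directory to a 4.5 bpw EXL2 quant.
# -i: input model dir, -o: scratch/working dir, -cf: final output dir, -b: target bpw
python convert.py \
    -i /path/to/Star-Command-R-32B-v1 \
    -o /path/to/workdir \
    -cf /path/to/Star-Command-R-32B-v1-exl2-4.5bpw \
    -b 4.5
```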
I briefly tested this quant in a few random RPs (including 8k+ context RPs where remembering and understanding specific facts from the context is needed), and it seems to work fine. In these short tests it seemed better than a GGUF quant of similar size (Q4_K_M).
This quant fits nicely in 24 GB of VRAM, especially with the Q4 cache.
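A minimal loading sketch with the exllamav2 Python API, assuming exllamav2 0.2.x and a placeholder model path; `ExLlamaV2Cache_Q4` is what enables the Q4 KV cache that helps the model fit in 24 GB. (This requires a GPU and the downloaded weights to actually run.)

```python
from exllamav2 import (
    ExLlamaV2,
    ExLlamaV2Config,
    ExLlamaV2Cache_Q4,
    ExLlamaV2Tokenizer,
)

# Placeholder path to the downloaded 4.5 bpw quant directory.
config = ExLlamaV2Config("/path/to/Star-Command-R-32B-v1-exl2-4.5bpw")

model = ExLlamaV2(config)

# Q4-quantized KV cache: roughly a quarter of the FP16 cache's VRAM footprint.
# lazy=True defers allocation so load_autosplit can place layers across GPUs.
cache = ExLlamaV2Cache_Q4(model, lazy=True)
model.load_autosplit(cache)

tokenizer = ExLlamaV2Tokenizer(config)
```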
Uses Command-R format.
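For frontends that need the template spelled out, a sketch of the Command-R turn format follows, using the special token names from Cohere's Command-R tokenizer; the exact template your frontend emits may differ slightly.

```python
def format_command_r(system: str, user: str) -> str:
    """Build a single-turn Command-R style prompt string."""
    return (
        "<BOS_TOKEN>"
        "<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>" + system + "<|END_OF_TURN_TOKEN|>"
        "<|START_OF_TURN_TOKEN|><|USER_TOKEN|>" + user + "<|END_OF_TURN_TOKEN|>"
        "<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>"  # model generates from here
    )

prompt = format_command_r("You are a helpful assistant.", "Hello!")
print(prompt)
```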
BeaverAI proudly presents...
An RP finetune of Command-R-08-2024