When asked what I use locally on a 24GB card, this is what I point to. I favor exl2 quants for long context and GGUF for very short context.
- Downtown-Case/Qwen_Qwen2.5-32B-Base-exl2-3.92bpw (Text Generation)
- Downtown-Case/Qwen_Qwen2.5-32B-Base-exl2-3.62bpw (Text Generation)
- Downtown-Case/Qwen_Qwen2.5-32B-Base-exl2-3.75bpw (Text Generation)
- Downtown-Case/Star-Command-R-Lite-32B-v1-exl2-4bpw (Text Generation)
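
For a rough sense of why these bits-per-weight (bpw) values suit a 24GB card, here is a back-of-the-envelope sketch. It assumes a nominal 32 billion parameters, treats the card as 24 GiB, and applies the bpw figure uniformly to every weight; none of those numbers come from the model cards. It ignores runtime overhead and activations, so whatever it reports as "left over" is only an upper bound on the room available for context (KV cache).

```python
def weight_gib(params_billion: float, bpw: float) -> float:
    """Approximate size of the quantized weights in GiB."""
    total_bits = params_billion * 1e9 * bpw
    return total_bits / 8 / 2**30

CARD_GIB = 24.0          # assumption: a "24GB" card exposes about 24 GiB
NOMINAL_PARAMS_B = 32.0  # assumption: nominal size of a "32B" model

for bpw in (3.62, 3.75, 3.92, 4.0):
    weights = weight_gib(NOMINAL_PARAMS_B, bpw)
    print(f"{bpw:.2f} bpw: ~{weights:.1f} GiB weights, "
          f"~{CARD_GIB - weights:.1f} GiB left before overhead")
```

Lower bpw leaves more headroom for long context at some cost in quality, which is presumably why the same base model appears here at several sizes.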