Can this version be loaded with vLLM?
#1 opened by wawoshashi
It doesn't look like vLLM supports EXL2 (ExLlamaV2) quantization yet: https://github.com/vllm-project/vllm/issues/3203
Dracones changed discussion status to closed
It can be loaded in TabbyAPI on an A100 80GB.
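For reference, TabbyAPI reads its model settings from a `config.yml`. A minimal sketch, assuming the key names follow TabbyAPI's sample config (the model name below is a placeholder, not this repo's actual folder name):

```yaml
# Hedged sketch of a TabbyAPI config.yml model section.
# Assumption: key names match TabbyAPI's sample config;
# "your-exl2-model" is a placeholder directory name.
model:
  model_dir: models            # parent directory holding downloaded models
  model_name: your-exl2-model  # folder containing the EXL2 quant weights
  max_seq_len: 4096            # context length; adjust to the model card
  gpu_split_auto: true         # let ExLlamaV2 auto-split layers across GPUs
  cache_mode: FP16             # KV cache precision (lower modes save VRAM)
```

On a single A100 80GB the auto GPU split is effectively a no-op, but leaving it enabled keeps the same config working on multi-GPU hosts.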