ExLlamaV2 looks very cool!
#1, opened by julien-c (HF staff)
Re. pushing different quant configs to different branches: I'm wondering if it might not be better to push them all to main under different filenames? WDYT?
There are advantages to having them on different branches. For instance, you can download a single version with something like:
huggingface-cli download turboderp/Gemma-7B-exl2 --revision=4.0bpw --local-dir .
You also don't risk people downloading all the weights and confusing the loader, since it treats every .safetensors file in the model dir as part of the model.
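For anyone scripting this, here is a rough Python sketch of the same idea, assuming the huggingface_hub package is installed. snapshot_download and its repo_id, revision and local_dir parameters are the real library API. The helper names download_quant and weight_files are made up for illustration, and weight_files is only a stand-in for the loader behaviour described above, not ExLlamaV2's actual code.

```python
from pathlib import Path
from typing import Callable, List, Optional

REPO_ID = "turboderp/Gemma-7B-exl2"  # repo named in this thread


def download_quant(
    revision: str,
    local_dir: str,
    downloader: Optional[Callable[..., str]] = None,
) -> str:
    """Fetch only the files on one branch (one quant config) of the repo.

    `downloader` is a hypothetical hook so the helper can be tried offline;
    by default it uses huggingface_hub.snapshot_download.
    """
    if downloader is None:
        # Imported lazily so the module loads even without the package.
        from huggingface_hub import snapshot_download

        downloader = snapshot_download
    return downloader(repo_id=REPO_ID, revision=revision, local_dir=local_dir)


def weight_files(model_dir: str) -> List[str]:
    """Illustrative stand-in: list every .safetensors file in the directory,
    the way a loader that globs the model dir would see them."""
    return sorted(p.name for p in Path(model_dir).glob("*.safetensors"))
```

Calling download_quant("4.0bpw", "./Gemma-7B-exl2-4.0bpw") should pull just the 4.0bpw branch into its own directory, so weight_files on that directory only sees one set of weights. With every quant pushed to main, a plain download would drop all of them into the same directory and the glob would pick up every one.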
i see! makes sense.