EXL2 quants of Mistral-7B-OpenOrca
Converted from Mistral-7B-OpenOrca. This is a
straight conversion, except that I have modified config.json
to set the default context size to 7168 tokens, since in
initial testing the model becomes unstable somewhere past that point. It's possible that sliding-window attention will
allow the model to use its advertised 32k-token context, but this hasn't been tested yet.
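For reference, the change amounts to a single field in config.json. A minimal sketch, assuming the context limit is exposed through the max_position_embeddings key that Mistral-style configs use (the exact field read may vary by loader, and the sliding_window value shown is the stock Mistral-7B default, not something this conversion changes):

```json
{
  "max_position_embeddings": 7168,
  "sliding_window": 4096
}
```

Loaders that take an explicit sequence-length argument will override this default, so the edit only affects tools that fall back to the value in config.json.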
2.50 bits per weight
2.70 bits per weight
3.00 bits per weight
3.50 bits per weight
4.00 bits per weight
5.00 bits per weight
6.00 bits per weight