Performance of this Quant

by ernestr - opened Aug 13

Aug 13

Hey,

Thanks very much for quantizing this model! Downloading tonight. Can you share any feedback on its performance compared with the GGUF? Did you see any issues with performance or coherence?

denru

Owner Aug 13

If you can load the entire model onto your GPUs, EXL2 has always been much faster than GGUF in my limited experience. This model also seems slightly better than the original Mistral model. That is not surprising, because Tess comes from the legendary creator of the Synthia models, whom I respect a great deal.

However, this Tess model is extremely sensitive to the prompt format. Make sure you are using the one provided in the model card. Otherwise, it will generate gibberish.
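One way to avoid format drift is to build every prompt from a single template string. The snippet below is only a rough sketch: the `PROMPT_TEMPLATE` value and the `build_prompt` helper are hypothetical placeholders, not taken from the model card, so swap in the exact format the card gives.

```python
# Placeholder template (an assumption for illustration only).
# Replace it with the exact prompt format from the model card.
PROMPT_TEMPLATE = "SYSTEM: {system}\nUSER: {user}\nASSISTANT:"


def build_prompt(user: str, system: str = "", template: str = PROMPT_TEMPLATE) -> str:
    """Fill the template so every request uses the same format."""
    return template.format(system=system.strip(), user=user.strip())


if __name__ == "__main__":
    print(build_prompt("Summarize this thread.", system="Be brief."))
```

Keeping the template in one place means a mismatch with the model card only has to be fixed once.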

Enjoy!

denru changed discussion status to closed Aug 18
