Thank you!
Thanks for uploading quants of my model! I've updated my model card to include links.
I even tried it out immediately once I had some imatrix quants! Your model card really sold it to me :) In the end... I still found it too horny, to be honest, and it often refused by giving me "end of story"-style wrap-up comments, but as far as Llama-3-8B-based models go, this one is really good. Please continue making models :)
For refusals, I found that most of the censorship behavior relates to the predefined roles in both ChatML and Llama-3-Instruct. You can find a bit more information in this PR I made on SillyTavern: https://github.com/SillyTavern/SillyTavern/pull/2148
Essentially, all you do is replace the predefined roles with the user and character names, and the model opens up a lot more. It also comes with the added benefit of a clean way of handling names without messing with the prompts. If you have the latest stable release of SillyTavern, you can just select the -Names variant. I haven't had the model try to disengage when using these variants, though I have been using the Q8_0, so I don't know how the imatrix quants behave with it.
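For anyone curious what the substitution looks like in practice, here is a minimal sketch of the idea in ChatML-style delimiters. The helper function and the participant name "Alice" are hypothetical, just to illustrate the difference between the predefined roles and the -Names variants:

```python
def chatml_turn(role: str, text: str) -> str:
    """Wrap one message in ChatML-style delimiters."""
    return f"<|im_start|>{role}\n{text}<|im_end|>\n"

# Predefined role, as in standard ChatML (more likely to trigger
# refusals or "end of story" wrap-ups in my experience):
default_prompt = chatml_turn("user", "Continue the story.")

# Same turn with the participant's name in place of the role
# ("Alice" is a made-up name here, not anything from the template):
named_prompt = chatml_turn("Alice", "Continue the story.")

print(default_prompt)
print(named_prompt)
```

The rest of the template stays the same; only the role token changes.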
I'll definitely keep experimenting with more merges in the future.
I'm not using it in chat mode; I'm using it to generate stories with a somewhat complex prompt, which might explain the difference in performance.