Model Feedback
Your recipe seems to be working - this model produces a different, much more human-like response, in both eRP and RP - and it also tries to adhere to card instructions. There is still a perceived safety bias (using SillyTavern with multiple Chat Completion presets and different jailbreaks), but that might just be me or my approach. I've done only limited testing... but this model approach is working! Ofc, with 12 GB VRAM available I would welcome an increase in parameters - some older 13B models produced a much more nuanced response. But still, very good overall for a "mere" 8B one. =)
I'm getting strong Athena-v4 vibes, in terms of creativity and "lucidness". It's much more kinky, though.
Thank you for putting up with this. I hope my anecdotal evaluation of this model is of some use to you. Oh, I was using the 8-bit quantized variant from mradermacher (L3-Nymeria-v2-8B-GGUF), WITHOUT imatrix. I suspect the imatrix variant will be slightly more coherent and sane. Have a good one!
Feedback is always nice. I felt the original Nymeria was difficult to make, though there I just needed to find the right balance between two different models to get the output I wanted. But oh boy, try that with 5 different models... v2 involved a ridiculous amount of work and testing - it took 300+ SLERP configs to get it right.
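For readers unfamiliar with the term: a SLERP merge blends the weights of two models by spherical linear interpolation instead of a straight average. A minimal sketch of the core formula (not the actual implementation used by merge tools, and plain lists instead of real weight tensors):

```python
import math

def slerp(t, v0, v1):
    """Spherically interpolate between weight vectors v0 and v1 at fraction t."""
    dot = sum(a * b for a, b in zip(v0, v1))
    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    # clamp to avoid domain errors from floating-point drift
    cos_theta = max(-1.0, min(1.0, dot / (norm0 * norm1)))
    theta = math.acos(cos_theta)
    if theta < 1e-6:
        # nearly parallel vectors: fall back to plain linear interpolation
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]

# halfway between two orthogonal unit vectors stays on the unit sphere
mid = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
```

In practice a merge config assigns a different t per layer (or per weight type), which is why finding a good balance can take hundreds of configs.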
> Ofc, with 12 GB VRAM available I would welcome an increase in parameters - as some older 13b models produced a much more nuanced response. But still, very good overall for a "mere" 8b one. =)
There is this 15B Nymeria v1, made by interleaving layers of the 8B model: https://huggingface.co/mradermacher/L3-Nymeria-15B-i1-GGUF
The 15B is probably broken - it's just zeroed and cloned layers. There's nothing in those extra layers that isn't already in the 8B. The new layers need to be retrained on new data to make it work.
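To illustrate why the extra layers add nothing new: upscales like that are typically built by repeating a slice of the base model's layer stack (a passthrough-style merge). A toy sketch of such a layer schedule, with hypothetical slice indices:

```python
def interleaved_schedule(n_layers, repeat_start, repeat_end):
    """Return a layer order where layers [repeat_start, repeat_end) appear twice.

    Every entry is an index into the ORIGINAL model's layers, so the
    'new' layers are exact clones carrying no new information.
    """
    order = list(range(n_layers))
    return order[:repeat_end] + order[repeat_start:]

# hypothetical example: a 32-layer model stretched to 48 layers
schedule = interleaved_schedule(32, 8, 24)
```

The duplicated blocks re-apply transformations the model already performs, so without further fine-tuning the larger model can't express anything the 8B couldn't.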