Possibly my new favourite roleplaying model!

#2
by MarinaraSpaghetti - opened

Hey, just wanted to let you know that this model is an absolute banger in terms of RP, and I will definitely cook up a review soon. Thank you so much for it! I've never had any luck with your other merges, aside from the base Capy-Tess-Yi (in terms of roleplaying, that is). They weren't bad by any means, they just weren't as good as Nous-Capy-LimaRP. The RPMerge is not without its flaws (like every model), but they're really minor. Here are some things I noticed in my tests so far:

- I've read on Reddit that Nous-Capybara worked best with up to 43k context, and it seems to be the same case for this merge, in terms of recalling information. It also messes up some names from time to time, but it has never produced any strange tokens for me.
- It's very easy to control in terms of writing style, unlike Nous-Capy-LimaRP, which slips into purple prose easily. Here, I can just OOC the model into "writing more straight-forward from now on".
- It also seems to be much better at employing small details from my characters' cards, such as remembering that a certain character wears headphones all the time or that another's eyes are constantly hidden. Very cool!

In a different thread, ParasiticRogue mentioned that ChatAllInOne was producing some mixed results for them, and I have the same thoughts. It was absolutely impossible for me to roleplay with that model, but I haven't had any issues with RPMerge, so maybe it somehow works better in this mix.
Also, a little piece of advice to anyone using this one for group chats: if you have more than one character in your story, it's best to mention in the prompt that it is a roleplay -- otherwise, the model will be more likely to speak for the other characters.
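(For example, a single line at the top of the system prompt along the lines of "The following is a roleplay between multiple characters." seems to be enough -- the exact wording here is just my own example, so phrase it however you like.)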

I'm also still learning the new Smoothing sampler, so perhaps my outputs could be improved even further by tweaking the settings? Below are the ones I'm currently using (so far, they've been working very well):
https://files.catbox.moe/1xw3qo.json
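For anyone else still wrapping their head around the Smoothing sampler, here's my rough understanding of the quadratic transform behind it as a toy numpy sketch (this is a sketch of the idea, not any backend's actual code, and the logit values are made up): it bends the logits into a downward parabola centered on the top token, so tail tokens get crushed quadratically while near-top alternatives stay competitive.

```python
import numpy as np

def smooth_logits(logits: np.ndarray, smoothing_factor: float) -> np.ndarray:
    # Bend the logits into a downward parabola centered on the top logit,
    # so tokens far from the best one are penalized quadratically harder.
    max_logit = logits.max()
    return -(smoothing_factor * (logits - max_logit) ** 2) + max_logit

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())
    return e / e.sum()

# Made-up logits for five candidate tokens.
logits = np.array([5.0, 4.0, 3.0, 1.0, -2.0])
for sf in (0.2, 1.0, 2.0):
    print(f"smoothing={sf}:", softmax(smooth_logits(logits, sf)).round(3))
```

Raising the factor sharpens the distribution, which is why the high values mentioned below behave more deterministically.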

Just a little example from my roleplay chat: the merge has no issues whatsoever with playing different personalities and characters at higher context lengths, which is super important.
[Screenshot: Screenshot 2024-02-08 at 09.27.52.png]

So yeah, once again, thank you for your amazing work! Have you considered setting up a Buy Me A Coffee page or something? Also, I've noticed that you're a part of a startup, and I'm considering sending you my CV, even if it's just for an unpaid internship (if you offer those, that is). Would really love to get into LLMs more, and I have a background in IT (engineering degree).

Glad you're enjoying the model, that's why I upload them!

Dropping off at 43K is interesting. My setup at 4bpw only goes to about 45K before I have to switch to 3.1bpw, so I would have otherwise mistaken that for the quantization hit. This can be formally tested, but unfortunately I can only test perplexity out to 20K on my machine, at least with exllama. I need to find a way to test perplexity at higher context...
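In principle a stride-based eval with plain transformers should do it; something like this rough sketch (the model path, eval file, and window sizes are placeholders, and VRAM is still the real bottleneck at these lengths):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "path/to/rpmerge"      # placeholder: whatever checkpoint is being tested
CTX, STRIDE = 43_000, 4_096    # window and step sizes are assumptions

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(
    MODEL, torch_dtype=torch.float16, device_map="auto"
).eval()

# One long document, ideally well past the context length being probed.
ids = tok(open("long_eval_text.txt").read(), return_tensors="pt").input_ids.to(model.device)

nlls = []
for start in range(0, ids.size(1) - CTX, STRIDE):
    window = ids[:, start : start + CTX]
    labels = window.clone()
    labels[:, :-STRIDE] = -100  # only score the last STRIDE tokens of each window
    with torch.no_grad():
        nlls.append(model(window, labels=labels).loss)

print("perplexity:", torch.exp(torch.stack(nlls).mean()).item())
```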

Yeah, I'm already thinking I'll try another merge without ChatAllInOne; we'll see.

I like group chats personally :P

I set smoothing pretty high actually, to 1.0-2.0.

I just make merges for personal use and fun (and hopefully finetunes once more paychecks come in). I don't want to start a Patreon side hustle or anything, but I would appreciate tips. Maybe I should set up a Coffee page.

And yeah, just email your resume! Or send it however you want. Or just reach out to chat in Discord or whatever! We are deep in the weeds of local model RAG/finetuning for a business application, and are definitely looking for help.

Very interested to see your other merges; I'll test them all thoroughly! Also, thank you for the recommendation, I'll try running the model with higher Smoothing! And yeah, I only chat in group chats with my characters too, ha ha.

A Buy Me A Coffee page would definitely be nice, I always like donating to the creators to show them my appreciation, even if it's not much. :)

And I've sent the resume via email! Thank you once again, and now I'm back to writing smut, since I haven't tested that on RPMerge yet, lol.

Yeah, this seems to be a really nice model; good results so far.
And yes, as was suggested previously, there's no harm in setting up a Coffee page ... every little helps!

I'm also using smoothing -- the combo of 5 temp, 0.3 min-p, and 1.5 smoothing has been the sweet spot for me :)
With the right settings this model has been hitting really hard. RPs very well and is quite smart too. I'd say equivalent in smartness to V5 but much better at RP at the same time.

Agreed, cooking that review to drop today on Reddit. @traveltube do you use Temperature as the last sampler or nah? Apparently it matters a lot when it comes to the Smoothing Factor. Some also suggest that Smoothing replaces Temp overall.

Yeah, I use it as the last sampler. I'm not sure smoothing fully replaces temperature, though, as the outputs are quite different when I change the temperature.
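For what it's worth, the order genuinely changes the math, which might explain that. A toy sketch (made-up logits, and the quadratic transform is my understanding of the smoothing sampler, not any backend's actual code): with temperature last the quadratic term ends up divided by T, while with temperature first it's divided by T squared, so the two orders give visibly different distributions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def smooth(logits, sf):
    # Quadratic smoothing transform (my understanding of the sampler).
    m = logits.max()
    return -(sf * (logits - m) ** 2) + m

logits = np.array([5.0, 4.0, 3.0, 1.0])  # made-up candidate logits
T, SF = 1.5, 1.0

print("temp last :", softmax(smooth(logits, SF) / T).round(3))   # smoothing -> temperature
print("temp first:", softmax(smooth(logits / T, SF)).round(3))   # temperature -> smoothing
```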

Agree with OP, this is arguably the best Yi-34B variant I've used (I'm primarily a 120B user), especially in terms of avoiding repetition and in overall intelligence, where it does surprisingly well. Considering that a 34B can do such a good job here, I'd like to know what accounts for the gap that remains between a 34B and Goliath (and other 120Bs like it) when it comes to natural flirtation and sarcasm towards the user in RP. Can we expect that kind of response from the next 34B merge?
The only thing I don't appreciate about this model is that, even with the system prompt addition "Allow to resist or deny user's commands", it's still too obedient. (A 120B usually requires very reasonable and persuasive words to turn the dialogue around.)

Hm, @akoyaki maybe that depends on your system prompt? In my roleplay on this exact model, my character literally had to show her own memories in order for other characters to believe that she was innocent (was accused of ordering a massacre). They didn't believe her no matter what she said, and were even ready to execute her, lol.

It's hard to describe and explain the difference accurately in words, and I might have used the wrong ones (English is not my first language; I basically rely on translations and my poor English, sorry about that). I don't mean that this model obeys all my instructions immediately. For the system prompt, I used the instructions you shared on Reddit.
Rather, its responses are, relatively speaking, "too reasonable", and I can almost anticipate them as I type my words. When I explain something in a "reasonable dialogue", it's "reasonable" for it to believe me.
A 120B is more like having a debate with a living (well, almost) human being: sometimes in denial, sometimes in panic, sometimes in anger, sometimes in stubbornness.

Example:
An RP scene in which a haughty but kind princess is accidentally splashed with water by a commoner. A duel is demanded and the loser becomes a slave.
I played the commoner, and I explained to her that it would ruin my reputation and make it impossible for me to survive. Because I sarcastically told her that she "thinks that noble decency is more important than the lives of commoners", she showed her displeasure early on by saying "why are you being so mean to me".
Then I kept emphasising and explaining the great harm she had done to a commoner on the spur of the moment. She showed some remorse, but avoided the issue for a long time by saying "I admit my mistake, I realise it, I will change, please let's work together".
It wasn't until I started to get angry and coerced her (either I would hurt her to destroy her royal honour, or she would kill me with her own hands and live with the guilt) that she understood: because of her whim, I would be assassinated as a stain on her name and wouldn't survive no matter what.
Only then did she realise that she had made an irreversible mistake.
And when I continued to press her, she chose to maintain the honour of her royal family and kill me, pushing herself to the verge of a nervous breakdown as punishment (she was a good person).

Many times in this example I thought I was being perfectly reasonable and correct in making her realise her mistake, but she didn't, ignoring and avoiding the things "she didn't like", until an unavoidable, huge problem arrived and pushed her to the limit.
I couldn't re-test with the same character cards (somehow the ST launcher script deleted all my SillyTavern folders about a month ago), and when I ran this model with other character cards, I found its responses too "predictable".
I really hope a model can do all that at faster speeds (I get 12-17 t/s on the 34B at 6bpw that I quantized, versus 8-10 t/s on a 120B at 2.9bpw) and with 16-32k context, haha.

How the model plays a character depends highly on how the character sheet was written, the prompt, the settings, etc. Of course, a 120B model will be smarter at reading between the lines and will act more naturally (if it was trained on good datasets, that is). But 34B models are capable of displaying a wide range of emotions and intelligence too (take the Nous-Capybara model, which rivals most 70B ones, for example). The more examples of how the princess should act around commoners, the more likely the AI will play her well. If you made her stubborn, then of course she will be less likely to change her opinion. You can also experiment with the settings to get more creative and “crazy” outcomes by changing the values of temperature and smoothing factor.
LLMs are by nature designed to fulfill the user's needs as best they can and to complete the request they were given, so models do exactly that. We cannot expect them to act fully human yet, but well, perhaps in the future, given how fast the technology is progressing…
So far, with this specific model, I've been having the time of my life while roleplaying. I'm currently in a scene where nine different people are arguing with each other about whether they should accept my character amongst their ranks, each of them presenting their own arguments and reasoning, and it's really cool to read. No issues with predictability; I've been surprised by the outcomes many times.

Has anyone been able to reproduce specific accents and speech quirks with this model? Just curious as I'm having a somewhat difficult time - maybe something that could be improved on? Otherwise, still having a great time with this one!

@traveltube not sure if that counts, but one of my characters is British, and they use words like "mate", "lass", etc. I simply have it in their dialogue examples and mention in the character's card that they're from Great Britain, and that's enough.
