TheDrummer/MS-Interleaved-Upscale-39B · Format + some feedback

2 days ago

Hi!

First of all, I would like to thank you for the hard work you put into these models. It is truly appreciated.
I tried this model a little last night, and my first thought was... it's interesting.

The good: It seems to portray characters very well, adding quite a bit of emotional depth to them. It tried to create tension on its own, really driving home the tumultuous inner struggle of the character, even responding with sobs (not your usual "I have to...sobs...say that...sobs...", but actually with word splits and hyphens, "I-just-wah-nt to-say", typical of someone blurting out words while sobbing, all the while describing the body language of someone in tears). Also, the model seems to have improved logic, at least from my short test. The test character went into a hallway, bumping into a small lamp. After a short interaction (2 messages back and forth) between my character and the model character, it actually remembered to kneel down, pick up the lamp, and place it correctly (without me mentioning it).

The bad: As impressed as I was that the character remembered the lamp, I was slightly confused when it asked the same question twice, just worded differently. Example: The test character said that she likes books and wanted to show me a library. I agreed and said I also love books and have my own library at home. She then asked me again how I feel about books in general. I said I like them and told her which genre I read. The character then said that she also likes that genre and asked me, "But I don't know if you like the genre. What genres do you normally read?" Something was off there. However, the character description notes that she is neurotic, so I am not sure how much of that is the model and how much is the card. Still, while there is suspense and drama buildup, it seems the model gets a bit stuck in a pattern. A regeneration or a swipe usually helps break it, but I wanted to mention it.

One more thing I wanted to mention: I am not sure which format to use in ST for this. I used the V2&V3 format, as well as V3 alone. Sometimes I noticed mistakes in the generated tokens (e.g., a space appears after the double quotes, random letters get uppercased). I think I had that issue with the Mistral V3 setting, but I cannot say for sure whether it's the format or a setting on my end. So, the question I have for you is: What format do you suggest for this?

TheDrummer

Owner 2 days ago

Hi, I can't recommend or comment on this model since it's not ready for use yet.

If you'd like to try a similar model that's ready for use, try:

GhostGate

2 days ago

So I take it this is more experimental to test the upscaling of the model?

TheDrummer

Owner 2 days ago

This model will serve as a base for finetuning, just like how Tunguska was a finetune of https://huggingface.co/TheSkullery/BA-Zephyria-39b

Zero'd out upscaling requires additional training to repair the layers.
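
For readers unfamiliar with the idea, here is a toy sketch of what a zeroed-out interleaved upscale could look like. It is only an illustration under stated assumptions: the residual block, the names `w_in` and `w_out`, the helper `interleaved_upscale`, and the layer range are all hypothetical, and the thread does not say which layers or projections were actually duplicated or zeroed in this model. The sketch copies each block in a range, places the copy right after its original, and zeroes the copy's output projection so the copy initially adds nothing to the residual stream.

```python
import copy
import random


def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def block_forward(block, x):
    """Toy residual block: x + W_out @ relu(W_in @ x)."""
    hidden = [max(0.0, h) for h in matvec(block["w_in"], x)]
    update = matvec(block["w_out"], hidden)
    return [xi + ui for xi, ui in zip(x, update)]


def forward(blocks, x):
    """Run the input through every block in order."""
    for block in blocks:
        x = block_forward(block, x)
    return x


def interleaved_upscale(blocks, start, end):
    """Hypothetical helper: duplicate blocks[start:end], interleaving each
    copy right after its original, with the copy's output projection zeroed."""
    upscaled = []
    for i, block in enumerate(blocks):
        upscaled.append(block)
        if start <= i < end:
            dup = copy.deepcopy(block)
            dup["w_out"] = [[0.0] * len(row) for row in dup["w_out"]]
            upscaled.append(dup)
    return upscaled


def random_block(dim, hidden, rng):
    """Build a toy block with random weights (illustration only)."""
    return {
        "w_in": [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(hidden)],
        "w_out": [[rng.uniform(-1, 1) for _ in range(hidden)] for _ in range(dim)],
    }


rng = random.Random(0)
base = [random_block(4, 8, rng) for _ in range(6)]   # 6-block toy "model"
big = interleaved_upscale(base, 1, 5)                # 10 blocks after upscaling
x = [rng.uniform(-1, 1) for _ in range(4)]

# In this toy setup the zeroed copies are pass-through, so the outputs match.
print(len(base), len(big), forward(big, x) == forward(base, x))
```

In this toy version the added blocks contribute nothing until they are trained, which is one way to read the point above that the new layers need additional training before the extra depth is useful.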

GhostGate

2 days ago

Interesting! Thank you for letting me know.