Feedback/Review
I'll do a more complete review later; these are just first impressions. So far, the model's decent. Personality's good, but responses occasionally stop before reaching the max token limit despite banning the EOS token: my max tokens was set to 300 and responses kept ending at 200-220 tokens. There's also advanced formatting leakage. I think there are traces of word or response repetition too, but I'm not sure about that part.
Preset: Temp: 1.02, smoothing: 0.06, min_p: 0.11-0.12 (Creativity focused, experimental):
Yeah, it seems decent enough. About the bleeding: I've also observed it with models from the Noromaid family. For me, adding '### Instruction (....)' to stop strings works in SillyTavern (Alpaca presets). BTW, I've added quants for 24GB: https://huggingface.co/TeeZee/NEBULA-23.8B-v1.0-bpw7.5-h8-exl2, Colab: https://huggingface.co/TeeZee/NEBULA-23.8B-v1.0-bpw4.5-h8-exl2 and 12GB: https://huggingface.co/TeeZee/NEBULA-23.8B-bpw3.65-h8-exl2. It's a 23.8B model based on Mistral, using just a little bit more VRAM than 20B Llama-based models. I'm still testing it (and having a blast with a Lovecraftian RPG story).
Despite the very small NSFW datasets used for finetuning, it's doing well, so at least I've now got working hyperparameters for small datasets, ready for future projects ;). Thanks again for the feedback, it's greatly appreciated :).
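For anyone wondering why those three quants map to those VRAM sizes, here is a rough back-of-the-envelope sketch. It assumes the weights take about parameters × bits-per-weight / 8 bytes, with the bpw values read off the repo names above. It ignores the KV cache, activations and any loader overhead, so treat the numbers as a lower bound rather than a measurement. The function name is made up for illustration.

```python
def weight_size_gb(params_billion, bits_per_weight):
    """Approximate size of the quantized weights in GB (10^9 bytes).

    Assumption: every weight costs `bits_per_weight` bits on average.
    Context/KV cache and runtime overhead are NOT included.
    """
    return params_billion * bits_per_weight / 8


# bpw values taken from the repo names in the post above
for label, bpw in [("24GB", 7.5), ("Colab", 4.5), ("12GB", 3.65)]:
    print(f"{label}: ~{weight_size_gb(23.8, bpw):.1f} GB of weights")
```

That works out to roughly 22.3, 13.4 and 10.9 GB of weights, which leaves only a small margin for context on each target card.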
For me, adding '### Instruction (....)' to stop strings works in SillyTavern (Alpaca presets)
You could use logit bias instead of stop strings, so gens don't stop prematurely. Anyway, thanks for making the additional quants. Also, I've noticed the model heavily follows the example dialogue in the description. My pet peeve with this is that if the example dialogue is short, responses will be short regardless of a high token limit.
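To make the difference between the two approaches concrete, here is a minimal sketch in plain Python. It is not SillyTavern's or any backend's actual code. Both function names are hypothetical, and the logits are keyed by token text for readability, whereas real backends bias numeric token IDs. A stop string cuts the reply at the point where the unwanted text appears, while a negative logit bias makes the unwanted token unlikely to be picked in the first place, so generation can carry on.

```python
def truncate_at_stop_strings(text, stop_strings):
    """Cut the reply at the earliest occurrence of any stop string."""
    cut = len(text)
    for stop in stop_strings:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


def apply_logit_bias(logits, bias):
    """Add a per-token bias to the raw scores before sampling.

    `logits` and `bias` map token -> score (illustrative only).
    """
    return {tok: score + bias.get(tok, 0.0) for tok, score in logits.items()}


# Stop string: the reply ends as soon as the leaked header shows up.
reply = "She nods slowly.\n### Instruction: continue the scene"
print(repr(truncate_at_stop_strings(reply, ["### Instruction"])))

# Logit bias: the leaked token is pushed down, so another token wins instead.
next_token_logits = {"###": 3.0, "She": 2.5, ".": 1.0}
biased = apply_logit_bias(next_token_logits, {"###": -100.0})
print(max(biased, key=biased.get))
```

The trade-off is that a bias applies to every occurrence of the token, so it can also suppress legitimate uses of it.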
Hello, in a moment I'll give a short but complete review of what I think of this model so far. Here's my rentry page for setup info and other tidbits:
https://rentry.co/Clevyby
I'll use exllama_hf and the model from here
Usually I'd use my Pandora's Box sampler settings, but after many hours of testing to find the right variant, I just couldn't, because the model was somehow wildly unstable in its responses. That's strange, because those settings worked fine with the DarkForest series. At some point it got too deterministic for my tastes, so I just went with a simple Temp: 1 and min_p: 0.06 to see what responses would normally look like.
During testing, I found many issues quite similar to the ones I mentioned for DarkForest V1, including but not limited to:
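For readers unfamiliar with these two settings, here is a minimal sketch of how min_p filtering and a "temperature last" ordering are commonly described. It is not the backend's actual implementation, and the function names are made up for illustration. The assumption is that min_p keeps every token whose probability is at least min_p times the top token's probability, and that temperature is then applied only to the survivors.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities, optionally scaled by temperature."""
    scaled = [x / temperature for x in logits]
    top = max(scaled)
    exps = [math.exp(x - top) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def min_p_keep(logits, min_p):
    """Indices of tokens with probability >= min_p * (top probability)."""
    probs = softmax(logits)
    threshold = min_p * max(probs)
    return [i for i, p in enumerate(probs) if p >= threshold]


def temp_last_distribution(logits, temperature, min_p):
    """Filter with min_p first, then apply temperature to what is left."""
    keep = min_p_keep(logits, min_p)
    probs = softmax([logits[i] for i in keep], temperature)
    return dict(zip(keep, probs))


# Four made-up token scores; only the first two survive min_p = 0.06.
dist = temp_last_distribution([5.0, 4.0, 1.0, -2.0], temperature=1.0, min_p=0.06)
print(dist)
```

At Temp: 1 the ordering makes no difference, since dividing by 1 leaves the scores unchanged; it only matters once the temperature moves away from 1.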
- Author Note leakage. Brackets in responses made a comeback.
- XML tag leakage from the Description. The XML tags just inform the model that there are example messages.
- Advanced Formatting leakage. Also made a comeback, very annoying when continuing messages. I suspect this comes from the instruct part, as disabling it makes the leakage disappear.
- Short responses: Probably a one-time thing; they range from two sentences to one and a half paragraphs.
- OOC tendencies: The model likes to inform me, the user, of a lot of things about {{char}}.
- Second person confusion: In some responses, the model gets confused about who 'you' refers to.
- Samey word tendencies: Even though I use smoothing to give the model many tokens to consider, it still falls back on the same words when in the low temp range.
Besides that, here are the responses using Temp: 1, Min_P: 0.06 and Temp Last:
Verdict: Overall, the model's quite decent in all aspects; however, it needs more fine-tuning and improvements before it can be in a league of its own.