Model breaks at full context
This model holds so much potential, but sadly, it breaks for me. I'm using the 6.5 quant at full 32k context and it spews nonsense (repeating a single letter, for example). This happens with basically all MoE models aside from the basic Mixtral Instruct. Has anyone else faced the same issue? I'm using Oobabooga for loading, SillyTavern as the frontend, and I only use Temperature and Min P to control the output. Thank you in advance for any help!
Saw what I assume was your post on Reddit. It looks like even though they set the config.json value to 32k, that might have been a stretch; the base models all have much lower context, so it's odd they'd push it so far with the merge. Shame it's not working out for huge context! Wish I could provide further help, but I think the model just isn't meant for it.
Yes, that was my post! The authors of the model reached out to me on Reddit and let me know that the model's context is actually 8k; they will add this to their model card. Thank you for your reply regardless, super sweet of you!
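For anyone else who runs into this, here's a minimal sketch of how to check what context length a repo's config.json actually advertises before loading it. It assumes the `transformers` library and a Hugging Face-hosted repo; `author/model-name` is a placeholder, not this model's actual id:

```python
# Minimal sketch: read the advertised context window from a repo's config.json
# via the transformers library. "author/model-name" is a placeholder id.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("author/model-name")

# Most decoder-only configs expose the context length as max_position_embeddings;
# a few architectures use a different field (e.g. n_positions), so fall back to that.
ctx = getattr(config, "max_position_embeddings", None) or getattr(config, "n_positions", None)
print(f"Advertised context length: {ctx}")
```

If that number is much higher than what the base models were trained on (like 32k here vs. an actual 8k), capping the sequence-length setting in the loader to the lower value is probably safer than trusting the config.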