Is it possible to make Midnight-Mixtral?
Hi!
I think Midnight-Miqu is the absolute best RP model around. Would it be possible to make a similar model, but based on Mixtral 8x7B? This would be good because Mixtral is still very smart but also much faster and easier to run than a 70B dense model. Thoughts?
I don't think it's possible to make a Midnight-Mixtral via merging, because the architectures are too different: Llama 2 / Miqu 70B is a dense model, while Mixtral 8x7B is a sparse mixture-of-experts model, so the weights don't line up. Instead, it might be possible for someone to finetune Mixtral on a high-quality synthetic dataset produced with help from Midnight-Miqu, but no such dataset exists yet to my knowledge. I've been kicking around the idea of trying to produce such a dataset myself. If I manage to do it, I'll release the dataset on my HF page and maybe someone will run with it.
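To make the architecture mismatch concrete, here is a rough sketch of the check a weight merge implicitly relies on: every tensor has to exist under the same name, with the same shape, in both models. The parameter names and shapes below are my assumption of what the public Hugging Face configs imply for the first layer only; nothing is loaded from a real checkpoint, and `merge_report` is a hypothetical helper, not part of any merging tool.

```python
# Illustrative first-layer tensors only. Names and shapes are assumptions
# based on the public configs (hidden size 8192 vs 4096, dense MLP vs
# 8 routed experts); they are not read from actual checkpoints.
llama2_70b = {
    "model.layers.0.self_attn.q_proj.weight": (8192, 8192),
    "model.layers.0.mlp.gate_proj.weight": (28672, 8192),
}
mixtral_8x7b = {
    "model.layers.0.self_attn.q_proj.weight": (4096, 4096),
    "model.layers.0.block_sparse_moe.gate.weight": (8, 4096),
    "model.layers.0.block_sparse_moe.experts.0.w1.weight": (14336, 4096),
}


def merge_report(a, b):
    """Hypothetical helper: sort tensor names by whether they could be merged."""
    shared = set(a) & set(b)
    return {
        "only_in_a": sorted(set(a) - shared),
        "only_in_b": sorted(set(b) - shared),
        "shape_mismatch": sorted(n for n in shared if a[n] != b[n]),
        "mergeable": sorted(n for n in shared if a[n] == b[n]),
    }


report = merge_report(llama2_70b, mixtral_8x7b)
for category, names in report.items():
    print(category, names)
```

In this toy comparison nothing ends up in the `mergeable` bucket: the attention projection differs in shape, and the MLP tensors don't even share names, since Mixtral replaces the single dense MLP with a router plus per-expert weights.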
That dataset could be amazing, looking forward to it.