Repetitiveness
Sorry if my English is bad.
I am using the Q5_K_M version and this model does extremely well with low context, but it repeats itself more and more as the context approaches 6k tokens. Once it reaches that limit it starts repeating itself a lot, even out of context (it copy-pastes previous responses even when they don't relate to the last input), and it's no longer creative.
Even with a repetition penalty of 1.5, I cannot get rid of the repetitions.
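For context, this is roughly the kind of call I mean (a minimal sketch assuming llama-cpp-python; the file name and prompt are just illustrative):

```python
from llama_cpp import Llama

# Illustrative file name for the quant mentioned above.
llm = Llama(
    model_path="Noromaid-v0.1-mixtral-8x7b.q5_k_m.gguf",
    n_ctx=8192,  # room past the ~6k tokens where the looping starts
)

out = llm(
    "### Instruction:\nContinue the story.\n### Response:\n",
    max_tokens=256,
    temperature=0.8,
    repeat_penalty=1.5,  # even this high, the repetitions remain
)
print(out["choices"][0]["text"])
```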
Is this a problem of Mixtral models in general?
Hello, it's a problem of Mixtral in general, AND our model can make it worse in some ways if you don't adapt your settings/prompting.
I'm also still not sure the K quants work correctly; could you please try Q5_0 and see if the problem is still as pronounced?
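Something like this would let you compare the two quants on the same prompt and settings (a minimal sketch assuming llama-cpp-python; the file names are illustrative and assume local downloads):

```python
from llama_cpp import Llama

PROMPT = "### Instruction:\nWrite a short tavern scene.\n### Response:\n"

# Run the same prompt through both quants with identical sampler settings.
for path in (
    "Noromaid-v0.1-mixtral-8x7b.q5_k_m.gguf",
    "Noromaid-v0.1-mixtral-8x7b.q5_0.gguf",
):
    llm = Llama(model_path=path, n_ctx=8192, verbose=False)
    out = llm(PROMPT, max_tokens=256, temperature=0.8, repeat_penalty=1.1)
    print(f"--- {path} ---")
    print(out["choices"][0]["text"])
    del llm  # release the weights before loading the next quant
```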
If other users were able to use it properly, you will be able to as well haha, let's see why it's doing that.
I'm so stupid, I just realised that I was using "Noromaid-v0.1-mixtral-8x7b.q5_k_m.gguf" and not "Noromaid-v0.1-mixtral-8x7b-Instruct-v3.q5_k_m.gguf". I downloaded the "Instruct" version and it doesn't repeat itself, or at least not in the few tests I made.
Glad the issue was somewhat fixed then, report back if you see other issues with the Instruct model!