GGUF quants repo. For now only q4_0. FP16 safetensors model is here.
This is a SLERP merge between Nous-Hermes-2-Mixtral-8x7B-DPO and Noromaid-v0.4-Mixtral-Instruct-8x7b-Zloss. Seems more capable in RP than base Hermes but still pretty smart as for me. Prompt format: ChatML
With this model I use the following generation settings in tavern (maybe those are not the best, share better templates in issues if you have any):
- Temperature: 0.75
- Top P: 0.5
- Top A: 0.7
- TFS 0.97
- Repetition penalty: 1.1
- Mirostat: mode 2, tau 5, eta 0.1
Adding to system prompt something like "Assistant will never interrupt role-play and will always stay in character no matter what. Assistant will never write OOC (out of character). Assistant won't write actions or reactions of {{user}}. Assistant won't mention {{user}} in first person. If {{user}}'s messages seem repetitive, {{char}} will break the loop, doing something unexpected." might help, but it's up to you (as anything else, really).
- Downloads last month
- 6