Love this model
I am having great success with this model, using it for things like simulating function calls whose output I parse and execute (for drawing pictures, doing math and remembering stuff). But one issue I often have is that once it makes a mistake it keeps making it over and over, which suggests that the last messages and its own replies (which I also send back to it) get so much weight in its completions that it repeats the same mistake even when I try to correct it. I have to remove all past dialogue to "break it free". I have been considering detecting when I correct it and at least auto-zapping its wrong reply from my history, but perhaps there is some parameter I can adjust to make it pay more attention to my message about its mistake instead of repeating it?
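A minimal sketch of the auto-zap idea, assuming the history is a list of OpenAI-style `{"role": ..., "content": ...}` dicts. The marker phrases and the helper names (`is_correction`, `zap_corrected_reply`) are made up for illustration; real correction detection would likely need to be smarter than a substring check.

```python
# Hypothetical phrases that signal the user is correcting the last reply.
CORRECTION_MARKERS = ("that's wrong", "that is wrong", "incorrect", "mistake")


def is_correction(text: str) -> bool:
    """Crude check: does the user message contain a correction phrase?"""
    lowered = text.lower()
    return any(marker in lowered for marker in CORRECTION_MARKERS)


def zap_corrected_reply(history: list, new_user_text: str) -> list:
    """Return a copy of history without the last assistant reply
    when the new user message looks like a correction."""
    pruned = list(history)
    if is_correction(new_user_text) and pruned and pruned[-1]["role"] == "assistant":
        pruned.pop()  # the wrong reply is not fed back to the model
    return pruned


history = [
    {"role": "user", "content": "Draw a circle."},
    {"role": "assistant", "content": "draw_square(10)"},
]
cleaned = zap_corrected_reply(history, "That's wrong, I asked for a circle.")
```

After zapping, the correction would be appended as the next user message. Some chat templates expect strict user/assistant alternation, in which case the correction could be merged into the previous user message instead.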
Also I have noticed it has a tendency to end messages with "Let me know if there's anything else I can help you with!" (and many variations of this) a lot. I have added things like "Do not repeat that you can help" in my system message with varying success. I am currently playing a bit with increasing the frequency_penalty which was default 0. Perhaps that can help. :)
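For reference, a rough sketch of how `frequency_penalty` could be set per request, assuming an OpenAI-style chat completions body. The `build_chat_request` helper, the `"local-model"` placeholder and the value 0.5 are illustrative assumptions, not recommended settings; the -2.0 to 2.0 range is the one the OpenAI API documents.

```python
import json


def build_chat_request(messages, frequency_penalty=0.5, model="local-model"):
    """Build an OpenAI-style chat completions body (hypothetical helper).

    "local-model" is a placeholder name; 0.5 is just a starting point to tune.
    """
    if not -2.0 <= frequency_penalty <= 2.0:
        raise ValueError("frequency_penalty must be between -2.0 and 2.0")
    return {
        "model": model,
        "messages": messages,
        # Values above 0 penalise tokens the more often they have already appeared.
        "frequency_penalty": frequency_penalty,
    }


payload = build_chat_request(
    [
        {"role": "system", "content": "Do not repeat that you can help."},
        {"role": "user", "content": "What is 2 + 2?"},
    ]
)
body = json.dumps(payload)  # this JSON string would be POSTed to the local server
```

Whether the penalty actually suppresses the stock closing sentence is something to test empirically, since the phrase has many variations.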
Since it seems to weigh the last messages so highly, I have started moving stuff from the system block further down so that the important stuff is closer to where the completion starts, as it tended to ignore the system message after a while. I am also experimenting with trimming the message stack down to a minimum, although I keep some "dummy" messages at the start of the list where the assistant supposedly answers an example user input perfectly in terms of how it should output function calls. It seems to help, although I see that prompt crafting is a skill of its own.
I am considering playing with system contexts to add and remove info in it depending on where dialogue is going so that I can give it info about other functions available in a given context.
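One way this could look, as a sketch only: the system message is rebuilt on every turn from a base prompt plus the functions relevant to the current context. The context names, function signatures and `build_system_message` helper are all hypothetical.

```python
BASE_PROMPT = "You are an assistant that can call functions. Output one call per line."

# Hypothetical function catalogue, grouped by dialogue context.
FUNCTIONS_BY_CONTEXT = {
    "drawing": ["draw_circle(x, y, radius)", "draw_line(x1, y1, x2, y2)"],
    "math": ["calculate(expression)"],
    "memory": ["remember(key, value)", "recall(key)"],
}


def build_system_message(active_contexts):
    """Assemble a system message listing only the functions for the active contexts."""
    lines = [BASE_PROMPT, "Functions available right now:"]
    for context in active_contexts:
        for signature in FUNCTIONS_BY_CONTEXT.get(context, []):
            lines.append(f"- {signature}")
    return {"role": "system", "content": "\n".join(lines)}


system_message = build_system_message(["math", "memory"])
```

How the active contexts get chosen (keywords in the user message, the last function called, and so on) is left open here.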
The ability to run these models locally and play with them for free is fantastic though. I am using LM Studio, which has a server mode that is excellent for this.
I do not think you will be able to reason the model out of its current responses with the same settings. Have you tried changing the 'system' message to give the model more instruction before asking it questions? The one it is giving sounds very generic. Here is an example from the MS Azure docs: https://github.com/MicrosoftDocs/azure-docs/blob/main/articles/ai-services/openai/includes/chat-markup-language.md
This one specifically:
<|im_start|>system
Assistant is an intelligent chatbot designed to help users answer their tax related questions.
Instructions:
- Only answer questions related to taxes.
- If you're unsure of an answer, you can say "I don't know" or "I'm not sure" and recommend users go to the IRS website for more information.
<|im_end|>
<|im_start|>user
When are my taxes due?
<|im_end|>
<|im_start|>assistant
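If you are building the raw prompt yourself rather than going through a chat API, the layout above could be produced with something like the sketch below. `to_chatml` is a hypothetical helper that mirrors the whitespace of the quoted example; the exact placement of `<|im_end|>` can differ between models, so check the model card.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render role/content dicts in the ChatML layout quoted above."""
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}\n<|im_end|>"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant")
    return "\n".join(parts)


prompt = to_chatml(
    [
        {"role": "system", "content": "Only answer questions related to taxes."},
        {"role": "user", "content": "When are my taxes due?"},
    ]
)
```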
If you need to give explicit instructions, you may need an instruct-tuned model. I really love the Dolphin models, but I decided to try https://huggingface.co/jondurbin/airoboros-m-7b-3.1.2 so I could pass in /INST commands that I couldn't figure out in ChatML, and it works great as one of my agents. Best of luck.
You might wanna try passing a system message plus user message each time
Yes, I use the OpenAI-compatible server hosted in LM Studio and pass the system message, a few user/assistant mock messages (as if it had replied perfectly with regard to outputting function calls), and then a few of the last actual user/assistant pairs (no more than 5). After that I pass it a generated user message, as if I were reminding it of all the things it is supposed to remember (property:value pairs), along with a made-up assistant message where it confirms that this is data it can use in its replies. Finally comes the actual user message for the assistant to complete.
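Roughly, the stack described above could be assembled like this. It is a sketch only: `build_messages`, the wording of the reminder and acknowledgement, and the sample data are assumptions, not the actual code in use.

```python
def build_messages(system_text, few_shot_pairs, history, memory, user_text, max_pairs=5):
    """Assemble: system, mock examples, recent history, memory reminder, new user message."""
    messages = [{"role": "system", "content": system_text}]
    # Mock exchanges showing the ideal function-call output.
    for user_example, assistant_example in few_shot_pairs:
        messages.append({"role": "user", "content": user_example})
        messages.append({"role": "assistant", "content": assistant_example})
    # Keep only the most recent real user/assistant pairs.
    recent = history[-2 * max_pairs:] if max_pairs > 0 else []
    messages.extend(recent)
    # Generated reminder of remembered property:value pairs, plus a made-up acknowledgement.
    if memory:
        facts = "\n".join(f"{key}: {value}" for key, value in memory.items())
        messages.append({"role": "user", "content": "Remember these facts:\n" + facts})
        messages.append({"role": "assistant", "content": "Understood, I can use this data in my replies."})
    # The actual user message the assistant should complete.
    messages.append({"role": "user", "content": user_text})
    return messages


# Illustrative data: seven past exchanges, one mock example, one remembered fact.
history = []
for i in range(7):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})

messages = build_messages(
    "You can call functions.",
    [("Draw a circle.", "draw_circle(0, 0, 10)")],
    history,
    {"name": "Ada"},
    "What is my name?",
)
```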
It works rather well actually; it's just that once it has tripped up with regard to its function output, those replies seem to be reinforced, making it repeat the same mistake, no doubt because they are part of the last user/assistant messages in the next completion. So I think I just need some way to tone those last replies down a bit.
Still, it's fun experimenting with this, and more often than not it is remarkably brilliant in its replies and uses the functions just as intended. And to Tom: yes, the system message is written as clear instructions, telling it what its role is and listing the functions it can use and what for, including a sample. I even have a lot of extra bits telling it how it should not be used, which has improved the output somewhat.
The system messages are not part of the conversation, so are you passing the same 'system' input that generated a correct response each time and it is still refusing?
System messages are the same as other messages at the API level (and in ChatML); indeed, you can send multiple system messages within a single conversation. Each message simply has its role set to "system", "user", or "assistant".
Don't know what to say, but this model is absolutely amazing. Brilliant replies, and as a newbie I am still discovering this model. Keep going with this model - I will hope and enjoy!