Will there be quantized GGUF for the instruct model as well?
Having briefly tested the warm model, I found it to be a bit all over the place (but very funny!). It would be great if the instruct model were released as GGUF as well.
I believe Lucas is also working on that :) We found the default hyperparameters of llama-cpp a bit strange: it was better to turn off the repetition penalty (set it to 1.0) and to lower the temperature; the model behaved very chaotically otherwise. I'm sure there's more that can be done about these (and other) hyperparameters; they influence the outputs more than I'd like.
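For anyone applying these settings from Python, here is a minimal sketch of how they might be passed to llama-cpp-python. The temperature of 0.3 is an assumed example value (the advice above only says "lower"), and `sampling_kwargs` and `generate` are hypothetical helper names, not part of any library.

```python
from typing import Any, Dict


def sampling_kwargs(temperature: float = 0.3) -> Dict[str, Any]:
    """Return sampling settings along the lines suggested above."""
    return {
        # Assumed example value; the thread only recommends a lower temperature.
        "temperature": temperature,
        # A penalty of 1.0 effectively turns the repetition penalty off.
        "repeat_penalty": 1.0,
    }


def generate(model_path: str, prompt: str, max_tokens: int = 256) -> str:
    """Run one completion against a local GGUF file (path supplied by the caller)."""
    # Imported lazily so sampling_kwargs() works without llama-cpp-python installed.
    from llama_cpp import Llama

    llm = Llama(model_path=model_path)
    result = llm.create_completion(prompt, max_tokens=max_tokens, **sampling_kwargs())
    return result["choices"][0]["text"]
```

Calling `generate` requires a downloaded GGUF file; `sampling_kwargs` alone can be reused with other front ends that accept the same option names.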
First: thank you very much for providing GGUF files for the instruct model. It made life a little easier for us amateurs. 🤩
I am a total beginner and am experimenting a little with the instruct model on Ollama. Does anyone have some tips for parameter settings that work well?
Currently my Ollama-modelfile looks like this:
TEMPLATE """
{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>{{ end }}
{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|>assistant
"""
SYSTEM """Du er en vennlig assistent som skal svare på spørsmål? Svar kort og konkret på spørsmål"""
PARAMETER num_ctx 4096
PARAMETER stop "<|im_end|>"
PARAMETER temperature 1
The responses are ok-ish, but I wonder if the settings/modelfile can be improved.
Here is my Modelfile using the suggestions from @davda54 and the README. I made the following changes:
- added a space before system, user and assistant (as suggested in the README)
- set the repeat penalty to 1.0
- set the temperature to 0.3
- added extra stop parameters
TEMPLATE """
{{ if .System }}<|im_start|> system
{{ .System }}<|im_end|>{{ end }}
{{ if .Prompt }}<|im_start|> user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|> assistant
"""
SYSTEM """Du er Kari Nordmann, en jovial og hjelpsom assistent som svarer kort og konsist på spørsmål."""
PARAMETER num_ctx 4096
PARAMETER temperature 0.3
PARAMETER repeat_penalty 1.0
PARAMETER stop "<|im_end|>"
PARAMETER stop "<|im_start|>"
PARAMETER stop " user"
PARAMETER stop " assistant"
It behaves quite well, but it is still a bit too verbose for my taste, so I will experiment further with different system messages.
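To check what a template like the one above actually sends to the model, it can help to render it by hand. The following is a rough Python sketch, not Ollama's own code: `render_prompt` approximates the template's layout (ignoring the newline right after the opening quotes), and `apply_stops` illustrates the general idea of stop sequences by cutting the text at the earliest match. Both function names are hypothetical.

```python
from typing import List


def render_prompt(prompt: str, system: str = "") -> str:
    """Approximate the text the TEMPLATE above produces for a single turn."""
    parts = []
    if system:
        # Note the space between <|im_start|> and the role name.
        parts.append(f"<|im_start|> system\n{system}<|im_end|>")
    # The template has a line break between the system and user blocks.
    parts.append("\n")
    if prompt:
        parts.append(f"<|im_start|> user\n{prompt}<|im_end|>\n")
    parts.append("<|im_start|> assistant\n")
    return "".join(parts)


def apply_stops(text: str, stops: List[str]) -> str:
    """Cut text at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stops:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


STOPS = ["<|im_end|>", "<|im_start|>", " user", " assistant"]

if __name__ == "__main__":
    print(render_prompt("Hva er hovedstaden i Norge?", "Svar kort."))
    print(apply_stops("Oslo er hovedstaden.<|im_end|>\n<|im_start|> user", STOPS))
```

One thing the sketch makes visible: the stop strings " user" and " assistant" match anywhere in the output, so an answer that happens to contain one of those words after a space would be cut short at that point.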
Thanks for the response @rsolva! Verbosity is definitely a problem. We tried to augment the open instruction datasets with step-by-step reasoning, detailed descriptions, etc., and apparently we overdid it :) It's on our list of things to improve in the next release. By the way, if you notice any other recurring unwanted patterns in the outputs, please let us know!