Add chat_template to tokenizer_config.json


Manually tested with:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('mosaicml/mpt-7b-chat', revision='ed874721')

chat = [
    {"role": "system", "content": "This is a system prompt!"},
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
    {"role": "user", "content": "I'd like to show off how chat templating works!"},
]

print(tokenizer.apply_chat_template(chat, tokenize=False))

# Remove system prompt
chat = chat[1:]

print("\nUsing default system prompt!\n")

print(tokenizer.apply_chat_template(chat, tokenize=False))

Output:

This is a system prompt!
Hello, how are you?<|im_end|>
I'm doing great. How can I help you today?<|im_end|><|endoftext|>
I'd like to show off how chat templating works!<|im_end|>

Using default system prompt!

A conversation between a user and an LLM-based AI assistant. The assistant gives helpful and honest answers.
Hello, how are you?<|im_end|>
I'm doing great. How can I help you today?<|im_end|><|endoftext|>
I'd like to show off how chat templating works!<|im_end|>

LGTM, please get @sam-mosaic to sign off as well.

The default system prompt should be one of the two that the model saw during training (these differ from the default used for the 7b-8k and 30b models), either:

You are Assistant. You were made to answer questions and be helpful.
- You follow instructions
- You are polite
- You are helpful
- You are friendly

or

- You are a helpful assistant chatbot trained by MosaicML.
- You answer questions.
- You are excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.
- You are more than just an information source, you are also able to write poetry, short stories, and make jokes.
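
For illustration, the fallback behaviour discussed here (use the caller's system message if present, otherwise prepend a default) can be sketched in plain Python. This is only a sketch, not the template added in this PR: the helper names `with_default_system` and `render_chatml` are hypothetical, the ChatML-style `<|im_start|>`/`<|im_end|>` layout is an assumption, and the default shown is simply the first of the two prompts quoted above.

```python
# Hypothetical default: the first of the two training-time prompts quoted above.
DEFAULT_SYSTEM_PROMPT = (
    "You are Assistant. You were made to answer questions and be helpful.\n"
    "- You follow instructions\n"
    "- You are polite\n"
    "- You are helpful\n"
    "- You are friendly"
)


def with_default_system(messages, default=DEFAULT_SYSTEM_PROMPT):
    """Return a copy of messages that always starts with a system message."""
    if messages and messages[0]["role"] == "system":
        return list(messages)
    # No system message supplied: prepend the default one.
    return [{"role": "system", "content": default}] + list(messages)


def render_chatml(messages):
    """Render messages in an assumed ChatML-style layout (illustrative only)."""
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in with_default_system(messages)
    )


# Usage: a chat without a system message picks up the default prompt.
chat = [{"role": "user", "content": "Hello, how are you?"}]
print(render_chatml(chat))
```

Keeping the default in one named constant makes it easy to swap in the second prompt if that is the one preferred.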