
Lack of chat_template in tokenizer_config.json

#4 by Annorita - opened

There is no chat_template in tokenizer_config.json, which is what tokenizer.apply_chat_template() relies on.
Since this model is based on DeepSeek, I assume its chat_template is similar to DeepSeek's template with slight modifications.
I have tried to write a chat_template for this model based on the DeepSeek template and the Magicoder README. Could you help me confirm whether this chat template is correct?

chat_template = "{%- set ns = namespace(found=false) -%}\n{%- for message in messages -%}\n    {%- if message['role'] == 'system' -%}\n        {%- set ns.found = true -%}\n    {%- endif -%}\n{%- endfor -%}\n{{bos_token}}{%- if not ns.found -%}\n{%- endif %}\n{%- for message in messages %}\n    {%- if message['role'] == 'system' %}\n{{ message['content'] + '\\n\\n' }}\n    {%- else %}\n        {%- if message['role'] == 'user' %}\n{{'@@ Instruction\\n' + message['content'] + '\\n\\n'}}\n        {%- else %}\n{{'@@ Response\\n' + message['content'] + '\\n' + eos_token + '\\n'}}\n        {%- endif %}\n    {%- endif %}\n{%- endfor %}{% if add_generation_prompt %}{{ '@@ Response\\n' }}{% endif %}"
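As a quick sanity check, the template can also be rendered stand-alone with jinja2 (a minimal sketch of my own, not from the model repo; transformers compiles chat templates with trim_blocks=True and lstrip_blocks=True, so the same settings are used here):

from jinja2 import Environment

env = Environment(trim_blocks=True, lstrip_blocks=True)
template = env.from_string(chat_template)

# the bos/eos strings here are placeholders; the real values come from the tokenizer
print(template.render(
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write hello world in Python."},
    ],
    bos_token="<s>",
    eos_token="</s>",
    add_generation_prompt=True,
))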

Usage:

from transformers import AutoTokenizer

model_path = "ise-uiuc/Magicoder-S-DS-6.7B"  # assumed checkpoint; substitute the actual model path
tokenizer = AutoTokenizer.from_pretrained(model_path)
tokenizer.chat_template = chat_template
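Once the template is confirmed, it can also be persisted so that apply_chat_template() works out of the box: with a recent transformers release, save_pretrained() writes a set chat_template attribute into tokenizer_config.json (a sketch; the output directory name is arbitrary):

# saving the tokenizer writes chat_template into tokenizer_config.json
tokenizer.save_pretrained("magicoder-with-chat-template")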

conversation_inference = [
    {"role": "system", "content": "You are an exceptionally intelligent coding assistant that consistently delivers accurate and reliable responses to user instructions."},
    {"role": "user", "content": "Please write a sorting algorithm in Python"},
]

# only show the templated result
inputs = tokenizer.apply_chat_template(conversation_inference, tokenize=False, add_generation_prompt=True)
print(inputs)
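For the conversation above, the printed prompt should look roughly like this (assuming the DeepSeek tokenizer's bos token string; because transformers renders chat templates with trim_blocks=True, the literal newlines after the block tags do not leak into the prompt):

<｜begin▁of▁sentence｜>You are an exceptionally intelligent coding assistant that consistently delivers accurate and reliable responses to user instructions.

@@ Instruction
Please write a sorting algorithm in Python

@@ Response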

# use it for inference
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(model_path)
inputs = tokenizer.apply_chat_template(conversation_inference, add_generation_prompt=True, return_tensors="pt")
# note: top_k/top_p only take effect when do_sample=True; with do_sample=False decoding is greedy
outputs = model.generate(inputs, max_new_tokens=2048, do_sample=False, top_k=50, top_p=0.95, num_return_sequences=1)
print(tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True))

References:

  1. DeepSeek: https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct/blob/main/tokenizer_config.json
  2. Magicoder README: https://github.com/ise-uiuc/magicoder#-quick-start
Intelligent Software Engineering (iSE) org

Hi @Annorita, thanks for raising this issue. The chat_template you created seems to work very well! Would you like to create a pull request?

Sure! I'm glad to do that.

Annorita changed discussion status to closed
