Strange behaviour of the apply_chat_template function

#14 · opened by rstaruch

messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "User example message"},
    {"role": "assistant", "content": "Assistant example message"},
]
tokenizer.apply_chat_template(messages, tokenize=False)

The code above returns:
<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nCutting Knowledge Date: December 2023\nToday Date: 26 Jul 2024\n\nYou are a helpful assistant<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nUser example message<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\nAssistant example message<|eot_id|>

Is it intended to include the Cutting Knowledge Date and Today Date in the chat template?
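For context, the injected lines come from the Jinja chat template shipped with the tokenizer, which can be printed directly to confirm where they originate (assuming a transformers release recent enough to expose the chat_template attribute):

# The "Cutting Knowledge Date" / "Today Date" strings are part of the
# bundled Jinja chat template itself, not of the messages passed in.
print(tokenizer.chat_template)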

pcuenq (Meta Llama org)

Yes! This is the same behaviour as in the 3.1 instruct models, except the Today Date is automatically taken from the current date unless you override it.

messages = [
    {"role": "system", "content": "Cutting Knowledge Date: December 2023. Today Date: 26 August 2024. You are a helpful assistant"},
    {"role": "user", "content": "User example message"},
    {"role": "assistant", "content": "Assistant example message"},
]
tokenizer.apply_chat_template(messages, tokenize=False)
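Alternatively, recent transformers releases forward extra keyword arguments from apply_chat_template to the template renderer, so the date can be overridden without editing the system message. A minimal sketch, assuming the template defines date_string the way the 3.1 templates do:

# date_string is consumed by the Llama chat template; passing it here
# replaces the default value used for the "Today Date" line.
tokenizer.apply_chat_template(messages, tokenize=False, date_string="26 August 2024")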

Hope that helps.

@pcuenq would you mind elaborating on this behavior? I don't remember seeing this documented anywhere, and am curious whether removing the date & knowledge cutoff is known to impact downstream performance. Thanks!

(this was brought up in the 3.1 8B repo but never addressed: https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct/discussions/74)

I found this behavior in the Llama 3.1 70B model as well, and I'm also curious about the reasons for contaminating the system prompt with this. It is not well documented, and it results in wrong dates being included by default.

In the case of the 70B model, the date was added in a commit that was about tool use:
https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct/commit/1c865c05dba6e05484ef2cfe03f0b336e377bb0b

The commit message makes no mention of forcing a Cutting Knowledge Date into the prompt. I had to roll back to an earlier version in the meantime...
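One way to stay on a known-good template while this gets sorted out is to pin the tokenizer to a fixed repo revision. A minimal sketch, where the hash is a placeholder for whichever commit you want to stay on (i.e. one before the template change linked above):

from transformers import AutoTokenizer

# revision pins the download to a specific commit on the Hub, so later
# chat-template changes in the repo do not affect this tokenizer.
tokenizer = AutoTokenizer.from_pretrained(
    "meta-llama/Llama-3.1-70B-Instruct",
    revision="<hash-of-the-commit-before-the-template-change>",
)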
