Prompt format
Was this trained with a specific prompt format?
<|system|>
You are a helpful assistant.</s>
<|user|>
Hello, how are you?</s>
<|assistant|>
I'm doing great. How can I help you today?</s>
<|user|>
Show me how to build a website in 10 simple steps</s>
<|assistant|>
Thanks!
The Jinja chat template is also part of the tokenizer if you need it: https://huggingface.co/HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1/blob/4e5568b3b7428916cc30b38c94b282707ee5a48e/tokenizer_config.json#L32
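For anyone who wants to see the layout spelled out, here is a rough sketch in plain Python that rebuilds the format from the example above. The helper name `render_zephyr_prompt` is made up for illustration, and the exact whitespace is an assumption read off that example; the Jinja template in tokenizer_config.json is the authoritative version.

```python
# Hypothetical helper, not part of transformers: it mirrors the layout shown in
# the example prompt above. The real formatting is defined by the Jinja chat
# template shipped in tokenizer_config.json.
EOS = "</s>"  # end-of-sequence marker as it appears in the example


def render_zephyr_prompt(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts into one prompt string."""
    parts = []
    for message in messages:
        # Each turn: role tag on its own line, then the content closed by EOS.
        parts.append(f"<|{message['role']}|>\n{message['content']}{EOS}\n")
    if add_generation_prompt:
        # Leave an open assistant tag so the model writes the next turn.
        parts.append("<|assistant|>\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello, how are you?"},
]
prompt = render_zephyr_prompt(messages)
print(prompt)
```

In practice you would probably call tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True) instead, so the string comes from the real template rather than a hand-written copy.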
Does the text-generation pipeline automatically apply the tokenizer's chat template when used as per the example code?
I thought it needed to be applied with tokenizer.apply_chat_template, but maybe I missed the memo.
Yeah, the pipeline now does this automatically! https://github.com/huggingface/transformers/blob/caa5c65db1f4db617cdac2ad667ba62edf94dd98/src/transformers/pipelines/text_generation.py#L253
To be specific, a chat template is applied if the input looks like a chat in the style of the OpenAI API (i.e. a list of dicts with role and content keys). If you pass a single string, the pipeline won't try to apply a chat template to it.
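To make the distinction concrete, here is a minimal sketch of that kind of check. It is not the pipeline's actual code (the linked source is the reference), and `looks_like_chat` is a hypothetical name; it just illustrates "a list of dicts with role and content keys" versus a plain string.

```python
def looks_like_chat(inputs):
    """Return True for a non-empty list of dicts that each carry 'role' and 'content'.

    Illustrative only: the real pipeline's detection logic may differ in detail.
    """
    return (
        isinstance(inputs, list)
        and len(inputs) > 0
        and all(
            isinstance(message, dict) and "role" in message and "content" in message
            for message in inputs
        )
    )


chat_input = [{"role": "user", "content": "Hello, how are you?"}]
string_input = "Hello, how are you?"

print(looks_like_chat(chat_input))    # chat-style input: template would be applied
print(looks_like_chat(string_input))  # plain string: passed through untouched
```

So if you want the template applied, pass the messages list to the pipeline; if you pass a pre-formatted string, you are responsible for the formatting yourself.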