Prompt format

#4
by sam-paech - opened

Was this trained with a specific prompt format?

<|system|>
You are a helpful assistant.</s>
<|user|>
Hello, how are you?</s>
<|assistant|>
I'm doing great. How can I help you today?</s>
<|user|>
Show me how to build a website in 10 simple steps</s>
<|assistant|>

Thanks!

Hugging Face H4 org

The Jinja chat template is also part of the tokenizer if you need it: https://huggingface.co/HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1/blob/4e5568b3b7428916cc30b38c94b282707ee5a48e/tokenizer_config.json#L32
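If you want to build the prompt yourself, here is a minimal sketch using the standard apply_chat_template API (the messages are just the example from your post, not anything model-specific):

```python
from transformers import AutoTokenizer

# The chat template ships in tokenizer_config.json and is picked up automatically
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Show me how to build a website in 10 simple steps"},
]

# add_generation_prompt=True appends the trailing <|assistant|> turn for generation
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```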

Does the text-generation pipeline automatically apply the tokenizer's chat template when used as per the example code?

I thought it needed to be applied with tokenizer.apply_chat_template, but maybe I missed the memo.

Hugging Face H4 org

Yes. To be specific, the chat template is applied if the input looks like a chat in the style of the OpenAI API (i.e. a list of dicts with "role" and "content" keys). If you pass a single string, the pipeline won't try to apply a chat template to it.
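For example, a rough sketch (not the model card's exact snippet; it assumes a recent transformers version where the text-generation pipeline accepts chat-style input):

```python
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1",
    device_map="auto",  # requires accelerate; illustrative only for a model this size
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Show me how to build a website in 10 simple steps"},
]

# Because the input is a list of role/content dicts, the pipeline applies the
# tokenizer's chat template before generating; a plain string would be used as-is.
outputs = pipe(messages, max_new_tokens=256)
print(outputs[0]["generated_text"])
```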
