chat prompt?

#4
by apepkuss79 - opened

What is the chat prompt in plain text? Thanks!

I'm not sure what you mean by plain text. The chat template is not plain text per se, since it uses special tokens to mark message turns.
You can see the actual template in: https://huggingface.co/ai21labs/AI21-Jamba-1.5-Mini/blob/main/tokenizer_config.json#L185
To convert a messages object to a text representation (which also contains the special tokens), you can call apply_chat_template() with tokenize=False on the Jamba tokenizer:

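from transformers import AutoTokenizer

# Example conversation; replace with your own messages.
messages = [{"role": "user", "content": "Hello!"}]

tokenizer = AutoTokenizer.from_pretrained("ai21labs/AI21-Jamba-1.5-Mini")
# tokenize=False returns the rendered prompt string instead of token IDs.
print(tokenizer.apply_chat_template(messages, tokenize=False))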

More info: https://huggingface.co/docs/transformers/main/en/chat_templating

I mean what the prompt string looks like. For example, the llama-3 prompt string looks like this:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{{ system_prompt }}<|eot_id|><|start_header_id|>user<|end_header_id|>

{{ user_message_1 }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{{ model_answer_1 }}<|eot_id|><|start_header_id|>user<|end_header_id|>

{{ user_message_2 }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

You can replicate this chat example for Jamba as:

<|startoftext|><|bom|><|system|> {{ system_prompt }}<|eom|><|bom|><|user|> {{ user_message_1 }}<|eom|><|bom|><|assistant|> {{ model_answer_1 }}<|eom|><|bom|><|user|> {{ user_message_2 }}<|eom|><|bom|><|assistant|>

by calling:

msgs = [
    dict(role="system", content="{{ system_prompt }}"),
    dict(role="user", content="{{ user_message_1 }}"),
    dict(role="assistant", content="{{ model_answer_1 }}"),
    dict(role="user", content="{{ user_message_2 }}"),
]
txt = tokenizer.apply_chat_template(msgs, add_generation_prompt=True, tokenize=False)
print(txt)

However, the full chat template is richer than that: it supports extra features such as tool calls, reference documents, citations and more, as can be seen in: https://huggingface.co/ai21labs/AI21-Jamba-1.5-Mini/blob/main/tokenizer_config.json#L185
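
If it helps to see the plain-turn layout without loading the tokenizer, here is a minimal sketch that rebuilds the Jamba string shown above by hand. The helper name render_jamba_prompt is hypothetical (it is not part of transformers or any AI21 library), and the sketch only covers simple system/user/assistant turns as they appear in the example string. It does not handle tool calls, documents or citations, so treat tokenizer.apply_chat_template() as the authoritative source.

```python
# Special tokens as they appear in the example string in this thread.
BOS = "<|startoftext|>"
BOM = "<|bom|>"  # begin of message
EOM = "<|eom|>"  # end of message


def render_jamba_prompt(messages, add_generation_prompt=True):
    """Hypothetical helper: render plain chat turns in the layout shown above.

    Assumption: only simple role/content messages are supported; the real
    chat template handles more cases than this sketch does.
    """
    parts = [BOS]
    for msg in messages:
        # Each turn is wrapped as <|bom|><|role|> content<|eom|>
        parts.append(f"{BOM}<|{msg['role']}|> {msg['content']}{EOM}")
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append(f"{BOM}<|assistant|>")
    return "".join(parts)


msgs = [
    dict(role="system", content="{{ system_prompt }}"),
    dict(role="user", content="{{ user_message_1 }}"),
    dict(role="assistant", content="{{ model_answer_1 }}"),
    dict(role="user", content="{{ user_message_2 }}"),
]
prompt = render_jamba_prompt(msgs)
print(prompt)
```

Comparing this string against the output of apply_chat_template(msgs, add_generation_prompt=True, tokenize=False) is a quick way to check whether the simplified layout still matches the template for your messages.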

The chat example is helpful. I'll try it, thanks a lot!
