Unify chat templates

#10
by Rocketknight1 HF staff - opened
Files changed (2) hide show
  1. README.md +44 -0
  2. tokenizer_config.json +1 -10
README.md CHANGED
@@ -150,6 +150,50 @@ The stock fundamentals data for Tesla (TSLA) are as follows:
150
  This information provides a snapshot of Tesla's financial position and performance based on the fundamental data obtained from the yfinance API. It shows that Tesla has a substantial market capitalization and a relatively high P/E and P/B ratio compared to other stocks in its industry. The company does not pay a dividend at the moment, which is reflected by a 'Dividend Yield' of 'None'. The Beta value indicates that Tesla's stock has a moderate level of volatility relative to the market. The 52-week high and low prices give an idea of the stock's range over the past year. This data can be useful when assessing investment opportunities and making investment decisions.<|im_end|>
151
  ```
152
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
153
  ## Prompt Format for JSON Mode / Structured Outputs
154
 
155
  Our model was also trained on a specific system prompt for Structured Outputs, which should respond with **only** a json object response, in a specific json schema.
 
150
  This information provides a snapshot of Tesla's financial position and performance based on the fundamental data obtained from the yfinance API. It shows that Tesla has a substantial market capitalization and a relatively high P/E and P/B ratio compared to other stocks in its industry. The company does not pay a dividend at the moment, which is reflected by a 'Dividend Yield' of 'None'. The Beta value indicates that Tesla's stock has a moderate level of volatility relative to the market. The 52-week high and low prices give an idea of the stock's range over the past year. This data can be useful when assessing investment opportunities and making investment decisions.<|im_end|>
151
  ```
152
 
153
+ ## Chat Templates for function calling
154
+
155
+ You can also use chat templates for function calling. For more information, please see the relevant section of the [chat template documentation](https://huggingface.co/docs/transformers/en/chat_templating#advanced-tool-use--function-calling).
156
+
157
+ Here is a brief example of this approach:
158
+
159
+ ```python
160
+ def multiply(a: int, b: int):
161
+ """
162
+ A function that multiplies two numbers
163
+
164
+ Args:
165
+ a: The first number to multiply
166
+ b: The second number to multiply
167
+ """
168
+ return int(a) * int(b)
169
+
170
+ tools = [multiply] # Only one tool in this example, but you probably want multiple!
171
+
172
+ model_input = tokenizer.apply_chat_template(
173
+ messages,
174
+ tools=tools
175
+ )
176
+ ```
177
+
178
+ The docstrings and type hints of the functions will be used to generate a function schema that will be read by the chat template and passed to the model.
179
+ Please make sure you include a docstring in the same format as this example!
180
+
181
+ If the model makes a tool call, you can append the tool call to the conversation like so:
182
+
183
+ ```python
184
+ tool_call = {"name": "multiply", "arguments": {"a": "6", "b": "7"}}
185
+ messages.append({"role": "assistant", "tool_calls": [{type": "function", "function": tool_call}]})
186
+ ```
187
+
188
+ Next, call the tool function and append the tool result:
189
+
190
+ ```python
191
+ messages.append({"role": "tool", "name": "multiply", "content": "42"})
192
+ ```
193
+
194
+ And finally apply the chat template to the updated `messages` list and `generate()` text once again to continue the conversation.
195
+
196
+
197
  ## Prompt Format for JSON Mode / Structured Outputs
198
 
199
  Our model was also trained on a specific system prompt for Structured Outputs, which should respond with **only** a json object response, in a specific json schema.
tokenizer_config.json CHANGED
@@ -2058,16 +2058,7 @@
2058
  }
2059
  },
2060
  "bos_token": "<|begin_of_text|>",
2061
- "chat_template": [
2062
- {
2063
- "name": "default",
2064
- "template": "{{bos_token}}{% for message in messages %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}"
2065
- },
2066
- {
2067
- "name": "tool_use",
2068
- "template": "{%- macro json_to_python_type(json_spec) %}\n{%- set basic_type_map = {\n \"string\": \"str\",\n \"number\": \"float\",\n \"integer\": \"int\",\n \"boolean\": \"bool\"\n} %}\n\n{%- if basic_type_map[json_spec.type] is defined %}\n {{- basic_type_map[json_spec.type] }}\n{%- elif json_spec.type == \"array\" %}\n {{- \"list[\" + json_to_python_type(json_spec|items) + \"]\"}}\n{%- elif json_spec.type == \"object\" %}\n {%- if json_spec.additionalProperties is defined %}\n {{- \"dict[str, \" + json_to_python_type(json_spec.additionalProperties) + ']'}}\n {%- else %}\n {{- \"dict\" }}\n {%- endif %}\n{%- elif json_spec.type is iterable %}\n {{- \"Union[\" }}\n {%- for t in json_spec.type %}\n {{- json_to_python_type({\"type\": t}) }}\n {%- if not loop.last %}\n {{- \",\" }} \n {%- endif %}\n {%- endfor %}\n {{- \"]\" }}\n{%- else %}\n {{- \"Any\" }}\n{%- endif %}\n{%- endmacro %}\n\n\n{{- bos_token }}\n{{- \"You are a function calling AI model. You are provided with function signatures within <tools></tools> XML tags. You may call one or more functions to assist with the user query. Don't make assumptions about what values to plug into functions. Here are the available tools: <tools> \" }}\n{%- for tool in tools %}\n {%- if tool.function is defined %}\n {%- set tool = tool.function %}\n {%- endif %}\n {{- '{\"type\": \"function\", \"function\": ' }}\n {{- '{\"name\": \"' + tool.name + '\", ' }}\n {{- '\"description\": \"' + tool.name + '(' }}\n {%- for param_name, param_fields in tool.parameters.properties|items %}\n {{- param_name + \": \" + json_to_python_type(param_fields) }}\n {%- if not loop.last %}\n {{- \", \" }}\n {%- endif %}\n {%- endfor %}\n {{- \")\" }}\n {%- if tool.return is defined %}\n {{- \" -> \" + json_to_python_type(tool.return) }}\n {%- endif %}\n {{- \" - \" + tool.description + \"\\n\\n\" }}\n {%- for param_name, param_fields in tool.parameters.properties|items %}\n {%- if loop.first %}\n {{- \" Args:\\n\" }}\n {%- endif %}\n {{- \" \" + param_name + \"(\" + json_to_python_type(param_fields) + \"): \" + param_fields.description|trim }}\n {%- endfor %}\n {%- if tool.return is defined and tool.return.description is defined %}\n {{- \"\\n Returns:\\n \" + tool.return.description }}\n {%- endif %}\n {{- '\"' }}\n {{- ', \"parameters\": ' }}\n {%- if tool.parameters.properties | length == 0 %}\n {{- \"{}\" }}\n {%- else %}\n {{- tool.parameters|tojson }}\n {%- endif %}\n {{- \"}\" }}\n {%- if not loop.last %}\n {{- \"\\n\" }}\n {%- endif %}\n{%- endfor %}\n{{- \" </tools>\" }}\n{{- 'Use the following pydantic model json schema for each tool call you will make: {\"properties\": {\"name\": {\"title\": \"Name\", \"type\": \"string\"}, \"arguments\": {\"title\": \"Arguments\", \"type\": \"object\"}}, \"required\": [\"name\", \"arguments\"], \"title\": \"FunctionCall\", \"type\": \"object\"}}\n' }}\n{{- \"For each function call return a json object with function name and arguments within <tool_call></tool_call> XML tags as follows:\n\" }}\n{{- \"<tool_call>\n\" }}\n{{- '{\"name\": <function-name>, \"arguments\": <args-dict>}\n' }}\n{{- '</tool_call><|im_end|>' }}\n{%- for message in messages %}\n {%- if message.role == \"user\" or message.role == \"system\" or (message.role == \"assistant\" and message.tool_calls is not defined) %}\n {{- '<|im_start|>' + message.role + '\\n' + message.content + '<|im_end|>' + '\\n' }}\n {%- elif message.role == \"assistant\" %}\n {{- '<|im_start|>' + message.role }}\n {%- for tool_call in message.tool_calls %}\n {{- '\n<tool_call>\n' }} {%- if tool_call.function is defined %}\n {%- set tool_call = tool_call.function %}\n {%- endif %}\n {{- '{' }}\n {{- '\"name\": \"' }}\n {{- tool_call.name }}\n {{- '\"}' }}\n {{- ', '}}\n {%- if tool_call.arguments is defined %}\n {{- '\"arguments\": ' }}\n {{- tool_call.arguments|tojson }}\n {%- endif %}\n {{- '\\n</tool_call>' }}\n {%- endfor %}\n {{- '<|im_end|>\\n' }}\n {%- elif message.role == \"tool\" %}\n {%- if not message.name is defined %}\n {{- raise_exception(\"Tool response dicts require a 'name' key indicating the name of the called function!\") }}\n {%- endif %}\n {%- if loop.previtem and loop.previtem.role != \"tool\" %}\n {{- '<|im_start|>tool\\n' }}\n {%- endif %}\n {{- '<tool_response>\\n' }}\n {{- message.content }}\n {%- if not loop.last %}\n {{- '\\n</tool_response>\\n' }}\n {%- else %}\n {{- '\\n</tool_response>' }}\n {%- endif %}\n {%- if not loop.last and loop.nextitem.role != \"tool\" %}\n {{- '<|im_end|>' }}\n {%- elif loop.last %}\n {{- '<|im_end|>' }}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\\n' }}\n{%- endif %}\n"
2069
- }
2070
- ],
2071
  "clean_up_tokenization_spaces": true,
2072
  "eos_token": "<|im_end|>",
2073
  "model_input_names": [
 
2058
  }
2059
  },
2060
  "bos_token": "<|begin_of_text|>",
2061
+ "chat_template": "{%- macro json_to_python_type(json_spec) %}\n{%- set basic_type_map = {\n \"string\": \"str\",\n \"number\": \"float\",\n \"integer\": \"int\",\n \"boolean\": \"bool\"\n} %}\n\n{%- if basic_type_map[json_spec.type] is defined %}\n {{- basic_type_map[json_spec.type] }}\n{%- elif json_spec.type == \"array\" %}\n {{- \"list[\" + json_to_python_type(json_spec|items) + \"]\"}}\n{%- elif json_spec.type == \"object\" %}\n {%- if json_spec.additionalProperties is defined %}\n {{- \"dict[str, \" + json_to_python_type(json_spec.additionalProperties) + ']'}}\n {%- else %}\n {{- \"dict\" }}\n {%- endif %}\n{%- elif json_spec.type is iterable %}\n {{- \"Union[\" }}\n {%- for t in json_spec.type %}\n {{- json_to_python_type({\"type\": t}) }}\n {%- if not loop.last %}\n {{- \",\" }} \n {%- endif %}\n {%- endfor %}\n {{- \"]\" }}\n{%- else %}\n {{- \"Any\" }}\n{%- endif %}\n{%- endmacro %}\n\n\n{{- bos_token }}\n{#- This block defines tools and gives tool calling instructions if tools are present #}\n{%- if tools is defined and tools is not none %}\n {{- \"You are a function calling AI model. You are provided with function signatures within <tools></tools> XML tags. You may call one or more functions to assist with the user query. Don't make assumptions about what values to plug into functions. Here are the available tools: <tools> \" }}\n {%- for tool in tools %}\n {%- if tool.function is defined %}\n {%- set tool = tool.function %}\n {%- endif %}\n {{- '{\"type\": \"function\", \"function\": ' }}\n {{- '{\"name\": ' + tool.name + '\", ' }}\n {{- '\"description\": \"' + tool.name + '(' }}\n {%- for param_name, param_fields in tool.parameters.properties|items %}\n {{- param_name + \": \" + json_to_python_type(param_fields) }}\n {%- if not loop.last %}\n {{- \", \" }}\n {%- endif %}\n {%- endfor %}\n {{- \")\" }}\n {%- if tool.return is defined %}\n {{- \" -> \" + json_to_python_type(tool.return) }}\n {%- endif %}\n {{- \" - \" + tool.description + \"\\n\\n\" }}\n {%- for param_name, param_fields in tool.parameters.properties|items %}\n {%- if loop.first %}\n {{- \" Args:\\n\" }}\n {%- endif %}\n {{- \" \" + param_name + \"(\" + json_to_python_type(param_fields) + \"): \" + param_fields.description|trim }}\n {%- endfor %}\n {%- if tool.return is defined and tool.return.description is defined %}\n {{- \"\\n Returns:\\n \" + tool.return.description }}\n {%- endif %}\n {{- '\"' }}\n {{- ', \"parameters\": ' }}\n {%- if tool.parameters.properties | length == 0 %}\n {{- \"{}\" }}\n {%- else %}\n {{- tool.parameters|tojson }}\n {%- endif %}\n {{- \"}\" }}\n {%- if not loop.last %}\n {{- \"\\n\" }}\n {%- endif %}\n {%- endfor %}\n {{- \" </tools>\" }}\n {{- 'Use the following pydantic model json schema for each tool call you will make: {\"properties\": {\"arguments\": {\"title\": \"Arguments\", \"type\": \"object\"}, \"name\": {\"title\": \"Name\", \"type\": \"string\"}}, \"required\": [\"arguments\", \"name\"], \"title\": \"FunctionCall\", \"type\": \"object\"}\\n' }}\n {{- \"For each function call return a json object with function name and arguments within <tool_call></tool_call> XML tags as follows:\\n\" }}\n {{- \"<tool_call>\\n\" }}\n {{- '{\"arguments\": <args-dict>, \"name\": <function-name>}\\n' }}\n {{- '</tool_call><|im_end|>' }}\n{%- endif %}\n\n{%- for message in messages %}\n {%- if message.role == \"user\" or message.role == \"system\" or (message.role == \"assistant\" and message.tool_calls is not defined) %}\n {{- '<|im_start|>' + message.role + '\\n' + message.content + '<|im_end|>' + '\\n' }}\n {%- elif message.role == \"assistant\" %}\n {{- '<|im_start|>' + message.role + '\\n<tool_call>\\n' }}\n {%- for tool_call in message.tool_calls %}\n {%- if tool_call.function is defined %}\n {%- set tool_call = tool_call.function %}\n {%- endif %}\n {{- '{ ' }}\n {%- if tool_call.arguments is defined %}\n {{- '\"arguments\": ' }}\n {{- tool_call.arguments|tojson }}\n {{- ', '}}\n {%- endif %}\n {{- '\"name\": \"' }}\n {{- tool_call.name }}\n {{- '\"}' }}\n {{- '\\n</tool_call> ' }}\n {%- endfor %}\n {{- '<|im_end|>\\n' }}\n {%- elif message.role == \"tool\" %}\n {%- if not message.name is defined %}\n {{- raise_exception(\"Tool response dicts require a 'name' key indicating the name of the called function!\") }}\n {%- endif %}\n {{- '<|im_start|>' + message.role + '\\n<tool_response>\\n' }}\n {{- '{\"name\": \"' }}\n {{- message.name }}\n {{- '\", \"content\": ' }}\n {{- message.content|tojson + '}' }}\n {{- '\\n</tool_response> <|im_end|>\\n' }} \n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\\n' }}\n{%- endif %}\n",
 
 
 
 
 
 
 
 
 
2062
  "clean_up_tokenization_spaces": true,
2063
  "eos_token": "<|im_end|>",
2064
  "model_input_names": [