Finetune NousResearch/Hermes-2-Pro-Llama-3-8B dataset

#27
by Akhilraju - opened

I want to create a fine-tuning dataset so that the model outputs JSON. After looking at your jsonmode.py,
should the dataset look like one of these?
Option 1:

{
prompt:
"[INST] <<SYS>>
You are a helpful assistant that answers in JSON. Here's the JSON schema you must adhere to:
<schema>
{pydantic_schema}
</schema>
<</SYS>>
The BMW has 2 doors at the back and a bonet. The Policy Number is 2436
[/INST]"

completion:"{
  "Car_model": "BMW",
  "PolicyNumber": 2436,
  "Bonet": true
}"
}

Option 2:

{
  "prompt": "System: You are a helpful assistant that answers in JSON. Here's the json schema you must adhere to:\n
<schema>
{schema}
</schema>

\n\n
User: The BMW has 2 doors at the back and a bonet. The Policy Number is 2436
\n\n

Assistant:",

  "completion": "{\"Car_model\": \"BMW\", \"PolicyNumber\": 2436, \"Bonet\": true}"
}

Which is the suitable way to create a dataset to fine-tune the model for JSON output?

Option 2 looks close but not quite right.
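
One likely gap, offered as an assumption rather than something stated in this thread: the Hermes 2 Pro model card documents a ChatML prompt format (`<|im_start|>` / `<|im_end|>` markers) instead of plain `System:` / `User:` / `Assistant:` labels. Below is a minimal sketch of Option 2 rendered that way. `render_chatml` is a hypothetical helper, the schema is made up to match the example, and in practice the tokenizer's chat template would normally do this rendering for you.

```python
import json

# System prompt wording taken from the question above; {schema} is filled per example.
SYSTEM_TEMPLATE = (
    "You are a helpful assistant that answers in JSON. "
    "Here's the json schema you must adhere to:\n<schema>\n{schema}\n</schema>"
)

IM_END = "<|im_end|>"


def render_chatml(schema: dict, user_text: str, answer: dict) -> dict:
    """Hypothetical helper: render one prompt/completion pair in ChatML (assumed format)."""
    system = SYSTEM_TEMPLATE.format(schema=json.dumps(schema))
    prompt = (
        f"<|im_start|>system\n{system}{IM_END}\n"
        f"<|im_start|>user\n{user_text}{IM_END}\n"
        "<|im_start|>assistant\n"
    )
    # json.dumps yields valid JSON: quoted strings, lowercase true, no trailing comma.
    completion = json.dumps(answer) + IM_END
    return {"prompt": prompt, "completion": completion}


# Illustrative schema matching the example in the question (an assumption).
schema = {
    "type": "object",
    "properties": {
        "Car_model": {"type": "string"},
        "PolicyNumber": {"type": "integer"},
        "Bonet": {"type": "boolean"},
    },
    "required": ["Car_model", "PolicyNumber", "Bonet"],
}

example = render_chatml(
    schema,
    "The BMW has 2 doors at the back and a bonet. The Policy Number is 2436",
    {"Car_model": "BMW", "PolicyNumber": 2436, "Bonet": True},
)
print(example["prompt"] + example["completion"])
```

Building the completion with `json.dumps` rather than by hand avoids the unquoted `BMW`, Python-style `True`, and trailing comma that would make the target unparseable.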

However, NousResearch already has a dataset that you could either use on its own or as a template to construct your own: hermes-function-calling-v1.

For JSON specifically, you want one of the JSON-mode subsets.
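
If you construct your own data on that template, a single record could be built roughly as sketched below. The layout (a `conversations` list with `from` / `value` keys and `system`, `human`, `gpt` roles) is an assumption about how the dataset is organized, so check the dataset card before relying on it. `build_record` and the sample schema are hypothetical, for illustration only.

```python
import json


def build_record(schema: dict, user_text: str, answer: dict) -> dict:
    """Hypothetical helper: one JSON-mode training record in a conversation layout (assumed)."""
    system_prompt = (
        "You are a helpful assistant that answers in JSON. "
        "Here's the json schema you must adhere to:\n"
        f"<schema>\n{json.dumps(schema)}\n</schema>"
    )
    # Serialize the target with json.dumps and confirm it parses back unchanged.
    target = json.dumps(answer)
    assert json.loads(target) == answer
    return {
        "conversations": [
            {"from": "system", "value": system_prompt},
            {"from": "human", "value": user_text},
            {"from": "gpt", "value": target},
        ]
    }


# Illustrative schema and answer (assumptions, mirroring the question's example).
car_schema = {
    "type": "object",
    "properties": {
        "Car_model": {"type": "string"},
        "PolicyNumber": {"type": "integer"},
        "Bonet": {"type": "boolean"},
    },
}

record = build_record(
    car_schema,
    "The BMW has 2 doors at the back and a bonet. The Policy Number is 2436",
    {"Car_model": "BMW", "PolicyNumber": 2436, "Bonet": True},
)

# Each record becomes one line of a JSONL training file.
print(json.dumps(record))
```

Keeping the turns as structured messages, instead of one pre-formatted prompt string, lets the training framework apply the model's own chat template at tokenization time.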
