Finetune NousResearch/Hermes-2-Pro-Llama-3-8B dataset
#27, opened by Akhilraju
I want to create a fine-tuning dataset so that the model can output JSON. After looking into your jsonmode.py, should the dataset be:
Option 1:
{
prompt:
"[INST] <<SYS>>
You are a helpful assistant that answers in JSON. Here's the JSON schema you must adhere to:
<schema>
{pydantic_schema}
</schema>
<</SYS>>
The BMW has 2 doors at the back and a bonet. The Policy Number is 2436
[/INST]"
completion: "{
"Car_model": "BMW",
"PolicyNumber": 2436,
"Bonet": true
}"
}
Option 2:
{
"prompt": "System: You are a helpful assistant that answers in JSON. Here's the json schema you must adhere to:\n
<schema>
{schema}
</schema>
\n\n
User: The BMW has 2 doors at the back and a bonet. The Policy Number is 2436
\n\n
Assistant:",
"completion": "{\"Car_model\": \"BMW\", \"PolicyNumber\": 2436, \"Bonet\": true}"
}
Which is the suitable way to create a dataset to fine-tune the model for JSON output?
Option 2 looks close but not quite right.
However, NousResearch already have a dataset you could either use on its own or as a template to construct your own: hermes-function-calling-v1.
For JSON specifically, you want one of the JSON mode subsets.
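As a rough illustration of what a record built from that template might look like, here is a minimal sketch. It assumes a ShareGPT-style layout (a `conversations` list of turns with `from` and `value` keys), so please check the dataset card for the actual field names before relying on it. The helper names `build_record` and `to_chatml` are hypothetical, and the JSON schema is hand-written for the car example from the question. The rendering step uses ChatML, the prompt format Hermes 2 Pro is trained on; in practice your training framework's chat template would normally do this for you.

```python
import json

# System prompt wording taken from the question; {schema} is filled per example.
SYSTEM_TEMPLATE = (
    "You are a helpful assistant that answers in JSON. "
    "Here's the json schema you must adhere to:\n"
    "<schema>\n{schema}\n</schema>"
)

# Assumed mapping from ShareGPT-style speaker names to ChatML roles.
ROLE_MAP = {"system": "system", "human": "user", "gpt": "assistant"}


def build_record(schema: dict, user_text: str, answer: dict) -> dict:
    """Build one training record (hypothetical helper, assumed layout)."""
    return {
        "conversations": [
            {"from": "system",
             "value": SYSTEM_TEMPLATE.format(schema=json.dumps(schema))},
            {"from": "human", "value": user_text},
            # json.dumps guarantees the target is valid JSON:
            # quoted strings, lowercase true/false, no trailing comma.
            {"from": "gpt", "value": json.dumps(answer)},
        ]
    }


def to_chatml(record: dict) -> str:
    """Render a record as ChatML text, one <|im_start|>...<|im_end|> per turn."""
    return "".join(
        f"<|im_start|>{ROLE_MAP[turn['from']]}\n{turn['value']}<|im_end|>\n"
        for turn in record["conversations"]
    )


# Hand-written schema for the example in the question (an assumption).
schema = {
    "type": "object",
    "properties": {
        "Car_model": {"type": "string"},
        "PolicyNumber": {"type": "integer"},
        "Bonet": {"type": "boolean"},
    },
    "required": ["Car_model", "PolicyNumber", "Bonet"],
}

record = build_record(
    schema,
    "The BMW has 2 doors at the back and a bonet. The Policy Number is 2436",
    {"Car_model": "BMW", "PolicyNumber": 2436, "Bonet": True},
)

print(json.dumps(record, indent=2))
print(to_chatml(record))
```

Serializing the target with `json.dumps` rather than writing it by hand avoids the problems in both options above, where `BMW` is unquoted, `True` is capitalized, and there is a trailing comma.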