Llama-3.2-1B
Collection
8 items
•
Updated
•
1
This model is a fine-tuned version of unsloth/meta-llama-3.1-8b-instruct-bnb-4bit on the None dataset.
This model was trained on Successful episodes of the top 1 model similar to D20003 but instead of using the whole episode as input, each episode was split into conversation pieces.
e.g.
[
{
role: 'user'
content: '...'
},
{
role: 'assistant'
content: '...'
},
{
role: 'user'
content: '...'
},
{
role: 'assistant'
content: '...'
},
]
is split int:
[
{
role: 'user'
content: '...'
},
{
role: 'assistant'
content: '...'
},
and
[
{
role: 'user'
content: '...'
},
{
role: 'assistant'
content: '...'
},
{
role: 'user'
content: '...'
},
{
role: 'assistant'
content: '...'
},
]
After splitting, the dataset contains about 2908 conversation bits accross all games.
The Dataset ID is D30004
The following hyperparameters were used during training: