This model is a fine-tuned version of HuggingFaceTB/SmolLM-360M-Instruct on the generator dataset. On the evaluation set, the final logged validation loss is 1.8521 (step 20 in the training results table).
More information needed
The training hyperparameters are not listed here. Training and validation loss were logged as follows:
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
No log | 0.8889 | 2 | 2.0708 |
No log | 1.7778 | 4 | 2.0152 |
No log | 2.6667 | 6 | 1.9361 |
No log | 4.0 | 9 | 1.8851 |
1.9803 | 4.8889 | 11 | 1.8728 |
1.9803 | 5.7778 | 13 | 1.8640 |
1.9803 | 6.6667 | 15 | 1.8571 |
1.9803 | 8.0 | 18 | 1.8525 |
1.8574 | 8.8889 | 20 | 1.8521 |
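
The validation losses in the table above can be easier to compare as perplexities. The sketch below is illustrative only: it assumes the reported loss is a mean per-token cross-entropy in nats (the usual convention for causal language model training, but not stated in this card), in which case perplexity is `exp(loss)`. The `rows` list and all variable names are introduced here for illustration; the numbers are copied from the table.

```python
import math

# (epoch, step, validation loss) rows copied from the training results table.
rows = [
    (0.8889, 2, 2.0708),
    (1.7778, 4, 2.0152),
    (2.6667, 6, 1.9361),
    (4.0, 9, 1.8851),
    (4.8889, 11, 1.8728),
    (5.7778, 13, 1.8640),
    (6.6667, 15, 1.8571),
    (8.0, 18, 1.8525),
    (8.8889, 20, 1.8521),
]

# Assumption: loss is mean per-token cross-entropy in nats,
# so perplexity = exp(loss).
perplexities = [math.exp(loss) for _, _, loss in rows]

# Checkpoint with the lowest validation loss.
best_epoch, best_step, best_loss = min(rows, key=lambda r: r[2])

# Optimizer steps per epoch implied by each row (step / epoch).
steps_per_epoch = [step / epoch for epoch, step, _ in rows]

# Relative drop in validation loss from the first to the last evaluation.
relative_drop = (rows[0][2] - rows[-1][2]) / rows[0][2]

for (epoch, step, loss), ppl in zip(rows, perplexities):
    print(f"step {step:>2}  epoch {epoch:.4f}  loss {loss:.4f}  perplexity {ppl:.2f}")
print(f"best step: {best_step} (loss {best_loss:.4f})")
print(f"relative loss drop: {relative_drop:.1%}")
```

Under that assumption, the rows imply roughly 2.25 optimizer steps per epoch, and the loss curve flattens after about step 15, with the last two evaluations differing by only 0.0004.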
Base model: HuggingFaceTB/SmolLM-360M