---
base_model: nreimers/MiniLMv2-L6-H384-distilled-from-RoBERTa-Large
tags:
- generated_from_trainer
metrics:
- accuracy
- f1
model-index:
- name: MiniLMv2-L6-H384-distilled-from-RoBERTa-Large-agentflow-distil
  results: []
---
# MiniLMv2-L6-H384-distilled-from-RoBERTa-Large-agentflow-distil

This model is a fine-tuned version of [nreimers/MiniLMv2-L6-H384-distilled-from-RoBERTa-Large](https://huggingface.co/nreimers/MiniLMv2-L6-H384-distilled-from-RoBERTa-Large) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.1540
- Accuracy: 0.9616
- F1: 0.9618
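Since the card reports accuracy and F1, the checkpoint carries a sequence-classification head. Below is a minimal inference sketch using the `transformers` pipeline; the Hub repo id is a placeholder, and the example input assumes an intent-style classification task, which this card does not confirm:

```python
from transformers import pipeline

# Placeholder repo id -- substitute the actual Hub path of this checkpoint.
classifier = pipeline(
    "text-classification",
    model="your-username/MiniLMv2-L6-H384-distilled-from-RoBERTa-Large-agentflow-distil",
)

# Hypothetical input; the training data and label set are not documented here.
print(classifier("Book a table for two at 7pm tomorrow."))
# -> [{'label': '...', 'score': ...}]
```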
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 7e-05
- train_batch_size: 10
- eval_batch_size: 10
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 8
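A minimal sketch of `TrainingArguments` mirroring the list above, assuming the standard `Trainer` API from Transformers 4.37. Note that `eval_steps=30` and `logging_steps=500` are inferred from the results table below rather than stated here, and the model/dataset wiring is omitted because the card does not document it:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="MiniLMv2-L6-H384-distilled-from-RoBERTa-Large-agentflow-distil",
    learning_rate=7e-5,
    per_device_train_batch_size=10,
    per_device_eval_batch_size=10,
    seed=42,
    adam_beta1=0.9,            # Adam defaults, as listed above
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=8,
    # Inferred from the results table: eval every 30 steps,
    # training loss logged every 500 steps ("No log" before step 500).
    evaluation_strategy="steps",
    eval_steps=30,
    logging_steps=500,
)
```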
### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|
| No log        | 0.07  | 30   | 3.4249          | 0.1510   | 0.0404 |
| No log        | 0.13  | 60   | 3.3994          | 0.2779   | 0.1759 |
| No log        | 0.2   | 90   | 3.3313          | 0.3423   | 0.2154 |
| No log        | 0.27  | 120  | 3.1475          | 0.3977   | 0.3024 |
| No log        | 0.33  | 150  | 2.8961          | 0.3494   | 0.2370 |
| No log        | 0.4   | 180  | 2.6867          | 0.5147   | 0.4325 |
| No log        | 0.47  | 210  | 2.4676          | 0.5728   | 0.4955 |
| No log        | 0.54  | 240  | 2.2129          | 0.5657   | 0.4588 |
| No log        | 0.6   | 270  | 1.9712          | 0.6917   | 0.6331 |
| No log        | 0.67  | 300  | 1.8016          | 0.6533   | 0.5799 |
| No log        | 0.74  | 330  | 1.5721          | 0.7185   | 0.6524 |
| No log        | 0.8   | 360  | 1.3381          | 0.8061   | 0.7760 |
| No log        | 0.87  | 390  | 1.1876          | 0.8543   | 0.8319 |
| No log        | 0.94  | 420  | 0.9877          | 0.8722   | 0.8577 |
| No log        | 1.0   | 450  | 0.8819          | 0.8892   | 0.8850 |
| No log        | 1.07  | 480  | 0.7511          | 0.8972   | 0.8955 |
| 2.2047        | 1.14  | 510  | 0.5262          | 0.9410   | 0.9408 |
| 2.2047        | 1.21  | 540  | 0.5107          | 0.9294   | 0.9297 |
| 2.2047        | 1.27  | 570  | 0.4612          | 0.9285   | 0.9292 |
| 2.2047        | 1.34  | 600  | 0.3487          | 0.9410   | 0.9407 |
| 2.2047        | 1.41  | 630  | 0.3137          | 0.9374   | 0.9369 |
| 2.2047        | 1.47  | 660  | 0.2951          | 0.9223   | 0.9190 |
| 2.2047        | 1.54  | 690  | 0.2738          | 0.9374   | 0.9377 |
| 2.2047        | 1.61  | 720  | 0.2472          | 0.9446   | 0.9439 |
| 2.2047        | 1.67  | 750  | 0.1988          | 0.9535   | 0.9530 |
| 2.2047        | 1.74  | 780  | 0.2016          | 0.9517   | 0.9519 |
| 2.2047        | 1.81  | 810  | 0.2158          | 0.9428   | 0.9427 |
| 2.2047        | 1.88  | 840  | 0.2519          | 0.9330   | 0.9324 |
| 2.2047        | 1.94  | 870  | 0.2224          | 0.9437   | 0.9436 |
| 2.2047        | 2.01  | 900  | 0.3032          | 0.9285   | 0.9276 |
| 2.2047        | 2.08  | 930  | 0.1815          | 0.9544   | 0.9546 |
| 2.2047        | 2.14  | 960  | 0.2125          | 0.9455   | 0.9455 |
| 2.2047        | 2.21  | 990  | 0.2198          | 0.9455   | 0.9446 |
| 0.2888        | 2.28  | 1020 | 0.1869          | 0.9571   | 0.9568 |
| 0.2888        | 2.34  | 1050 | 0.1705          | 0.9571   | 0.9568 |
| 0.2888        | 2.41  | 1080 | 0.1927          | 0.9526   | 0.9523 |
| 0.2888        | 2.48  | 1110 | 0.1700          | 0.9562   | 0.9561 |
| 0.2888        | 2.54  | 1140 | 0.2162          | 0.9464   | 0.9460 |
| 0.2888        | 2.61  | 1170 | 0.1540          | 0.9616   | 0.9618 |
| 0.2888        | 2.68  | 1200 | 0.1752          | 0.9562   | 0.9561 |
| 0.2888        | 2.75  | 1230 | 0.1476          | 0.9607   | 0.9605 |
| 0.2888        | 2.81  | 1260 | 0.2575          | 0.9410   | 0.9414 |
| 0.2888        | 2.88  | 1290 | 0.1574          | 0.9616   | 0.9614 |
| 0.2888        | 2.95  | 1320 | 0.1574          | 0.9598   | 0.9596 |
| 0.2888        | 3.01  | 1350 | 0.1640          | 0.9580   | 0.9578 |
| 0.2888        | 3.08  | 1380 | 0.1627          | 0.9598   | 0.9594 |
| 0.2888        | 3.15  | 1410 | 0.1866          | 0.9544   | 0.9550 |
| 0.2888        | 3.21  | 1440 | 0.1610          | 0.9526   | 0.9526 |
| 0.2888        | 3.28  | 1470 | 0.2134          | 0.9419   | 0.9412 |
### Framework versions
- Transformers 4.37.0
- Pytorch 2.1.2
- Datasets 2.1.0
- Tokenizers 0.15.1