t5-base-p-l-akk-en-20240922-080244
This model was trained from scratch; the training dataset is not specified. It achieves the following results on the evaluation set:
- Loss: 0.8507
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3.152142797506865e-05
- train_batch_size: 12
- eval_batch_size: 12
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100
- mixed_precision_training: Native AMP
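As a rough illustration of what `lr_scheduler_type: linear` implies for the learning rate over the run, here is a minimal sketch in plain Python. It rests on several assumptions that the card does not state: no warmup steps, and a total step count estimated from the results table (2500 steps at epoch 1.1384, so roughly 2196 steps per epoch). The names `linear_lr`, `STEPS_PER_EPOCH`, and `TOTAL_STEPS` are hypothetical and used only for this example.

```python
BASE_LR = 3.152142797506865e-05  # learning_rate from the list above
NUM_EPOCHS = 100                 # num_epochs from the list above

# Assumption: steps per epoch is estimated from the results table
# (2500 steps logged at epoch 1.1384), rounded to the nearest integer.
STEPS_PER_EPOCH = round(2500 / 1.1384)
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at `step`: linear warmup, then linear decay to zero.

    Assumption: warmup_steps defaults to 0 because the card lists no warmup.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the approximate learning rate at a few logged steps.
    for step in (0, 2500, 32500, 100000, 217500):
        print(f"step {step:>6}: lr ~ {linear_lr(step):.3e}")
```

Under these assumptions the learning rate falls to about half its initial value near the midpoint of training and approaches zero by the last logged step.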
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
0.9444 | 1.1384 | 2500 | 0.8638 |
0.8431 | 2.2769 | 5000 | 0.8085 |
0.7912 | 3.4153 | 7500 | 0.7750 |
0.7434 | 4.5537 | 10000 | 0.7531 |
0.7171 | 5.6922 | 12500 | 0.7395 |
0.692 | 6.8306 | 15000 | 0.7278 |
0.6596 | 7.9690 | 17500 | 0.7165 |
0.6155 | 9.1075 | 20000 | 0.7231 |
0.61 | 10.2459 | 22500 | 0.7129 |
0.5886 | 11.3843 | 25000 | 0.7068 |
0.5718 | 12.5228 | 27500 | 0.7084 |
0.5519 | 13.6612 | 30000 | 0.7029 |
0.5412 | 14.7996 | 32500 | 0.7007 |
0.5241 | 15.9381 | 35000 | 0.7017 |
0.5026 | 17.0765 | 37500 | 0.7134 |
0.4733 | 18.2149 | 40000 | 0.7038 |
0.489 | 19.3534 | 42500 | 0.7067 |
0.4666 | 20.4918 | 45000 | 0.7083 |
0.4494 | 21.6302 | 47500 | 0.7061 |
0.4545 | 22.7687 | 50000 | 0.7092 |
0.4357 | 23.9071 | 52500 | 0.7116 |
0.4332 | 25.0455 | 55000 | 0.7189 |
0.4152 | 26.1840 | 57500 | 0.7207 |
0.3995 | 27.3224 | 60000 | 0.7196 |
0.3976 | 28.4608 | 62500 | 0.7184 |
0.3879 | 29.5993 | 65000 | 0.7210 |
0.3812 | 30.7377 | 67500 | 0.7243 |
0.3749 | 31.8761 | 70000 | 0.7241 |
0.3663 | 33.0146 | 72500 | 0.7320 |
0.3612 | 34.1530 | 75000 | 0.7344 |
0.3469 | 35.2914 | 77500 | 0.7377 |
0.3407 | 36.4299 | 80000 | 0.7388 |
0.3309 | 37.5683 | 82500 | 0.7411 |
0.3354 | 38.7067 | 85000 | 0.7354 |
0.3252 | 39.8452 | 87500 | 0.7407 |
0.3167 | 40.9836 | 90000 | 0.7435 |
0.3182 | 42.1220 | 92500 | 0.7502 |
0.2994 | 43.2605 | 95000 | 0.7547 |
0.3064 | 44.3989 | 97500 | 0.7561 |
0.2923 | 45.5373 | 100000 | 0.7529 |
0.2848 | 46.6758 | 102500 | 0.7593 |
0.2843 | 47.8142 | 105000 | 0.7600 |
0.279 | 48.9526 | 107500 | 0.7650 |
0.2781 | 50.0911 | 110000 | 0.7706 |
0.2629 | 51.2295 | 112500 | 0.7730 |
0.2639 | 52.3679 | 115000 | 0.7726 |
0.2624 | 53.5064 | 117500 | 0.7791 |
0.2547 | 54.6448 | 120000 | 0.7776 |
0.2567 | 55.7832 | 122500 | 0.7747 |
0.2484 | 56.9217 | 125000 | 0.7792 |
0.2454 | 58.0601 | 127500 | 0.7893 |
0.2398 | 59.1985 | 130000 | 0.7864 |
0.2313 | 60.3370 | 132500 | 0.7973 |
0.2362 | 61.4754 | 135000 | 0.7964 |
0.2359 | 62.6138 | 137500 | 0.7962 |
0.226 | 63.7523 | 140000 | 0.8009 |
0.2271 | 64.8907 | 142500 | 0.8027 |
0.2249 | 66.0291 | 145000 | 0.8014 |
0.2212 | 67.1676 | 147500 | 0.8077 |
0.2129 | 68.3060 | 150000 | 0.8088 |
0.2131 | 69.4444 | 152500 | 0.8108 |
0.2106 | 70.5829 | 155000 | 0.8144 |
0.2078 | 71.7213 | 157500 | 0.8163 |
0.2103 | 72.8597 | 160000 | 0.8148 |
0.2025 | 73.9982 | 162500 | 0.8215 |
0.2023 | 75.1366 | 165000 | 0.8250 |
0.197 | 76.2750 | 167500 | 0.8267 |
0.1945 | 77.4135 | 170000 | 0.8274 |
0.1919 | 78.5519 | 172500 | 0.8289 |
0.187 | 79.6903 | 175000 | 0.8308 |
0.1948 | 80.8288 | 177500 | 0.8339 |
0.1857 | 81.9672 | 180000 | 0.8346 |
0.191 | 83.1056 | 182500 | 0.8380 |
0.1796 | 84.2441 | 185000 | 0.8387 |
0.1862 | 85.3825 | 187500 | 0.8414 |
0.185 | 86.5209 | 190000 | 0.8409 |
0.1778 | 87.6594 | 192500 | 0.8434 |
0.1824 | 88.7978 | 195000 | 0.8426 |
0.1735 | 89.9362 | 197500 | 0.8443 |
0.1737 | 91.0747 | 200000 | 0.8474 |
0.1787 | 92.2131 | 202500 | 0.8462 |
0.1759 | 93.3515 | 205000 | 0.8484 |
0.1744 | 94.4900 | 207500 | 0.8487 |
0.1778 | 95.6284 | 210000 | 0.8502 |
0.1767 | 96.7668 | 212500 | 0.8507 |
0.175 | 97.9053 | 215000 | 0.8499 |
0.1723 | 99.0437 | 217500 | 0.8507 |
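In the table above, validation loss reaches its minimum of 0.7007 at step 32500 (epoch 14.7996) and then rises to 0.8507 by the final logged step, while training loss keeps falling. The sketch below shows one way to locate that point programmatically. It uses only a handful of rows copied from the table, and the names `ROWS` and `best_checkpoint` are hypothetical, introduced for illustration.

```python
# A few (step, training_loss, validation_loss) rows copied from the table above.
ROWS = [
    (2500, 0.9444, 0.8638),
    (17500, 0.6596, 0.7165),
    (32500, 0.5412, 0.7007),
    (35000, 0.5241, 0.7017),
    (100000, 0.2923, 0.7529),
    (217500, 0.1723, 0.8507),
]


def best_checkpoint(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


best_step, best_train, best_val = best_checkpoint(ROWS)
final_step, final_train, final_val = ROWS[-1]

# Gap between the final and the best validation loss; a positive gap
# alongside a falling training loss is a common sign of overfitting.
val_gap = final_val - best_val

if __name__ == "__main__":
    print(f"best validation loss {best_val} at step {best_step}")
    print(f"final validation loss {final_val} at step {final_step}")
    print(f"gap: {val_gap:.4f}")
```

If intermediate checkpoints were saved, the one nearest step 32500 would likely generalize better than the final one, although the card does not say which checkpoint the published weights correspond to.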
Framework versions
- Transformers 4.41.2
- Pytorch 2.3.1+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1