# gpt2english98
This model is a fine-tuned version of an unrecorded base model on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 4.9145
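Since causal-LM cross-entropy loss is reported in nats, it converts to perplexity as exp(loss); a quick sanity check on the final evaluation loss (plain Python, no assumptions beyond the number above):

```python
import math

eval_loss = 4.9145  # final validation loss from the table below
print(f"perplexity = {math.exp(eval_loss):.1f}")  # ≈ 136.2
```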
## Model description
More information needed
## Intended uses & limitations
More information needed
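No usage guidance was recorded. As a minimal sketch, assuming this checkpoint is published under the repo id `gpt2english98` and carries a standard GPT-2 causal-LM head, text generation would look like:

```python
from transformers import pipeline

# Assumption: "gpt2english98" is the repo id of this checkpoint on the Hub.
generator = pipeline("text-generation", model="gpt2english98")
print(generator("Once upon a time", max_new_tokens=30)[0]["generated_text"])
```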
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a reproduction sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 1
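Below is a minimal sketch of how these hyperparameters map onto a 🤗 `Trainer` setup. The base checkpoint (`gpt2`) and the tiny in-memory corpus are stand-in assumptions; the card does not record the actual base model or training data.

```python
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_model = "gpt2"  # assumption: the actual base checkpoint is not recorded
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 ships without a pad token
model = AutoModelForCausalLM.from_pretrained(base_model)

# Stand-in corpus; the real training/evaluation data is unknown.
texts = ["The quick brown fox jumps over the lazy dog."] * 8
dataset = Dataset.from_dict({"text": texts}).map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)

args = TrainingArguments(
    output_dir="gpt2english98",
    learning_rate=5e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=1,
    adam_beta1=0.9,     # Adam betas/epsilon match the list above
    adam_beta2=0.999,   # (these are also the Trainer defaults)
    adam_epsilon=1e-8,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    eval_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
print(trainer.evaluate())  # reports eval_loss, comparable to the table below
```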
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
6.4337 | 0.0102 | 10000 | 6.3333 |
6.163 | 0.0204 | 20000 | 6.0858 |
6.0254 | 0.0306 | 30000 | 5.9577 |
5.9641 | 0.0408 | 40000 | 5.8612 |
5.9122 | 0.0510 | 50000 | 5.7882 |
5.7888 | 0.0612 | 60000 | 5.7222 |
5.7087 | 0.0714 | 70000 | 5.6873 |
5.7093 | 0.0816 | 80000 | 5.6355 |
5.6654 | 0.0918 | 90000 | 5.5949 |
5.6532 | 0.1020 | 100000 | 5.5734 |
5.6147 | 0.1122 | 110000 | 5.5327 |
5.6729 | 0.1224 | 120000 | 5.5168 |
5.5226 | 0.1327 | 130000 | 5.4808 |
5.4966 | 0.1429 | 140000 | 5.4611 |
5.4708 | 0.1531 | 150000 | 5.4348 |
5.4733 | 0.1633 | 160000 | 5.4144 |
5.4892 | 0.1735 | 170000 | 5.3892 |
5.5325 | 0.1837 | 180000 | 5.3728 |
5.4798 | 0.1939 | 190000 | 5.3587 |
5.4556 | 0.2041 | 200000 | 5.3404 |
5.3283 | 0.2143 | 210000 | 5.3255 |
5.3728 | 0.2245 | 220000 | 5.3096 |
5.42 | 0.2347 | 230000 | 5.3045 |
5.4695 | 0.2449 | 240000 | 5.2806 |
5.3529 | 0.2551 | 250000 | 5.2689 |
5.3567 | 0.2653 | 260000 | 5.2556 |
5.2927 | 0.2755 | 270000 | 5.2452 |
5.3838 | 0.2857 | 280000 | 5.2276 |
5.3734 | 0.2959 | 290000 | 5.2212 |
5.3353 | 0.3061 | 300000 | 5.2074 |
5.392 | 0.3163 | 310000 | 5.2042 |
5.3286 | 0.3265 | 320000 | 5.2012 |
5.3774 | 0.3367 | 330000 | 5.1871 |
5.2164 | 0.3469 | 340000 | 5.1758 |
5.3587 | 0.3571 | 350000 | 5.1685 |
5.3574 | 0.3673 | 360000 | 5.1575 |
5.2087 | 0.3776 | 370000 | 5.1482 |
5.1612 | 0.3878 | 380000 | 5.1489 |
5.225 | 0.3980 | 390000 | 5.1398 |
5.241 | 0.4082 | 400000 | 5.1222 |
5.2267 | 0.4184 | 410000 | 5.1288 |
5.1924 | 0.4286 | 420000 | 5.1110 |
5.2413 | 0.4388 | 430000 | 5.1047 |
5.2015 | 0.4490 | 440000 | 5.1049 |
5.2847 | 0.4592 | 450000 | 5.0944 |
5.1406 | 0.4694 | 460000 | 5.0888 |
5.1992 | 0.4796 | 470000 | 5.0786 |
5.0754 | 0.4898 | 480000 | 5.0810 |
5.1644 | 0.5 | 490000 | 5.0697 |
5.1464 | 0.5102 | 500000 | 5.0609 |
5.1771 | 0.5204 | 510000 | 5.0560 |
5.1896 | 0.5306 | 520000 | 5.0574 |
5.1355 | 0.5408 | 530000 | 5.0498 |
5.115 | 0.5510 | 540000 | 5.0494 |
5.1575 | 0.5612 | 550000 | 5.0357 |
5.191 | 0.5714 | 560000 | 5.0305 |
5.1694 | 0.5816 | 570000 | 5.0303 |
5.1591 | 0.5918 | 580000 | 5.0267 |
5.138 | 0.6020 | 590000 | 5.0264 |
5.0825 | 0.6122 | 600000 | 5.0195 |
5.1669 | 0.6224 | 610000 | 5.0147 |
5.0309 | 0.6327 | 620000 | 5.0156 |
5.0886 | 0.6429 | 630000 | 5.0077 |
5.1049 | 0.6531 | 640000 | 5.0021 |
5.1385 | 0.6633 | 650000 | 5.0052 |
5.1294 | 0.6735 | 660000 | 4.9955 |
5.0726 | 0.6837 | 670000 | 4.9947 |
5.1084 | 0.6939 | 680000 | 4.9912 |
5.0205 | 0.7041 | 690000 | 4.9869 |
5.111 | 0.7143 | 700000 | 4.9826 |
5.0809 | 0.7245 | 710000 | 4.9773 |
5.1221 | 0.7347 | 720000 | 4.9775 |
5.1516 | 0.7449 | 730000 | 4.9721 |
5.1347 | 0.7551 | 740000 | 4.9655 |
5.0744 | 0.7653 | 750000 | 4.9664 |
5.0715 | 0.7755 | 760000 | 4.9626 |
5.1118 | 0.7857 | 770000 | 4.9592 |
5.0933 | 0.7959 | 780000 | 4.9558 |
5.0685 | 0.8061 | 790000 | 4.9543 |
5.1237 | 0.8163 | 800000 | 4.9514 |
4.9532 | 0.8265 | 810000 | 4.9493 |
5.0854 | 0.8367 | 820000 | 4.9478 |
5.0865 | 0.8469 | 830000 | 4.9417 |
5.085 | 0.8571 | 840000 | 4.9419 |
5.0835 | 0.8673 | 850000 | 4.9385 |
5.0347 | 0.8776 | 860000 | 4.9345 |
4.9784 | 0.8878 | 870000 | 4.9332 |
5.0046 | 0.8980 | 880000 | 4.9317 |
4.9069 | 0.9082 | 890000 | 4.9296 |
5.0209 | 0.9184 | 900000 | 4.9270 |
5.1551 | 0.9286 | 910000 | 4.9234 |
5.1849 | 0.9388 | 920000 | 4.9230 |
5.07 | 0.9490 | 930000 | 4.9200 |
4.9804 | 0.9592 | 940000 | 4.9195 |
5.0419 | 0.9694 | 950000 | 4.9174 |
5.0447 | 0.9796 | 960000 | 4.9165 |
5.0839 | 0.9898 | 970000 | 4.9161 |
4.9989 | 1.0 | 980000 | 4.9145 |
### Framework versions
- Transformers 4.41.2
- Pytorch 2.3.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1