# PromoGen_K562_GPT2_4096_tokens_2080Ti_x4_more_DE
This model is a fine-tuned GPT-2-style model (the base checkpoint is not specified in this card) trained on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 10.7123
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 128
- eval_batch_size: 128
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 1024
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 1000
- num_epochs: 200
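The schedule above can be sketched in plain Python. This is a minimal illustration, assuming the "cosine" scheduler behaves like `transformers`' `get_cosine_schedule_with_warmup` (linear warmup, then cosine decay to zero); the total step count of roughly 29,500 is taken from the results table below, not stated explicitly in the card.

```python
import math

def lr_at_step(step, base_lr=1e-3, warmup_steps=1000, total_steps=29500):
    """Learning rate at a given optimizer step: linear warmup to base_lr
    over warmup_steps, then cosine decay toward zero over the remainder.
    (Assumed to mirror transformers' cosine schedule with warmup.)"""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Effective optimizer batch: per-step batch x gradient accumulation,
# matching the reported total_train_batch_size of 1024.
effective_batch = 128 * 8
```

Note that 128 x 8 already accounts for the full 1024, so the per-device split across the four GPUs implied by the model name is not separately reported here.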
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:---|:---|:---|:---|
7.8346 | 1.69 | 250 | 7.3201 |
6.8483 | 3.38 | 500 | 6.6602 |
6.5149 | 5.07 | 750 | 6.3857 |
6.1941 | 6.76 | 1000 | 6.1764 |
6.0188 | 8.45 | 1250 | 6.0308 |
5.7091 | 10.14 | 1500 | 5.9934 |
5.4459 | 11.82 | 1750 | 5.9371 |
5.1559 | 13.51 | 2000 | 5.9561 |
5.053 | 15.2 | 2250 | 6.0799 |
4.8356 | 16.89 | 2500 | 5.9637 |
4.6625 | 18.58 | 2750 | 6.0990 |
4.4349 | 20.27 | 3000 | 6.4003 |
4.2347 | 21.96 | 3250 | 6.2304 |
4.0581 | 23.65 | 3500 | 6.4391 |
4.1369 | 25.34 | 3750 | 6.6182 |
3.9847 | 27.03 | 4000 | 6.8092 |
3.7499 | 28.72 | 4250 | 6.7152 |
3.6274 | 30.41 | 4500 | 6.8939 |
3.515 | 32.09 | 4750 | 7.1118 |
3.428 | 33.78 | 5000 | 6.8671 |
3.4504 | 35.47 | 5250 | 7.0974 |
3.3346 | 37.16 | 5500 | 7.4627 |
3.1574 | 38.85 | 5750 | 7.1213 |
3.0594 | 40.54 | 6000 | 7.3364 |
3.1296 | 42.23 | 6250 | 7.5662 |
3.0683 | 43.92 | 6500 | 7.3176 |
2.9588 | 45.61 | 6750 | 7.4911 |
2.8816 | 47.3 | 7000 | 7.7823 |
2.7364 | 48.99 | 7250 | 7.4892 |
2.6647 | 50.68 | 7500 | 7.7526 |
2.8314 | 52.36 | 7750 | 7.9454 |
2.7594 | 54.05 | 8000 | 8.2020 |
2.5894 | 55.74 | 8250 | 7.8243 |
2.5204 | 57.43 | 8500 | 8.0420 |
2.5072 | 59.12 | 8750 | 8.2496 |
2.4765 | 60.81 | 9000 | 7.9321 |
2.5039 | 62.5 | 9250 | 8.1675 |
2.4301 | 64.19 | 9500 | 8.4031 |
2.2991 | 65.88 | 9750 | 8.0673 |
2.2471 | 67.57 | 10000 | 8.2341 |
2.3514 | 69.26 | 10250 | 8.5114 |
2.3166 | 70.95 | 10500 | 8.1837 |
2.2225 | 72.63 | 10750 | 8.3777 |
2.17 | 74.32 | 11000 | 8.5542 |
2.0509 | 76.01 | 11250 | 8.6312 |
2.0204 | 77.7 | 11500 | 8.4680 |
2.1625 | 79.39 | 11750 | 8.7197 |
2.1102 | 81.08 | 12000 | 8.8730 |
1.9783 | 82.77 | 12250 | 8.6112 |
1.9293 | 84.46 | 12500 | 8.7798 |
1.9402 | 86.15 | 12750 | 8.9528 |
1.9095 | 87.84 | 13000 | 8.6741 |
1.9188 | 89.53 | 13250 | 8.9483 |
1.877 | 91.22 | 13500 | 9.0630 |
1.7614 | 92.9 | 13750 | 8.8313 |
1.7251 | 94.59 | 14000 | 8.9837 |
1.821 | 96.28 | 14250 | 9.1340 |
1.799 | 97.97 | 14500 | 8.9597 |
1.7024 | 99.66 | 14750 | 9.1133 |
1.6657 | 101.35 | 15000 | 9.2959 |
1.5863 | 103.04 | 15250 | 9.2909 |
1.5708 | 104.73 | 15500 | 9.1864 |
1.6538 | 106.42 | 15750 | 9.3570 |
1.6119 | 108.11 | 16000 | 9.4937 |
1.5147 | 109.8 | 16250 | 9.3262 |
1.4745 | 111.49 | 16500 | 9.4693 |
1.4869 | 113.18 | 16750 | 9.5963 |
1.4664 | 114.86 | 17000 | 9.4436 |
1.4566 | 116.55 | 17250 | 9.5559 |
1.4231 | 118.24 | 17500 | 9.6734 |
1.3421 | 119.93 | 17750 | 9.5746 |
1.3086 | 121.62 | 18000 | 9.7034 |
1.3773 | 123.31 | 18250 | 9.7930 |
1.3537 | 125.0 | 18500 | 9.6953 |
1.2834 | 126.69 | 18750 | 9.8154 |
1.2516 | 128.38 | 19000 | 9.8966 |
1.2079 | 130.07 | 19250 | 9.9296 |
1.1875 | 131.76 | 19500 | 9.9139 |
1.2227 | 133.45 | 19750 | 10.0024 |
1.1899 | 135.14 | 20000 | 10.0485 |
1.1332 | 136.82 | 20250 | 10.0329 |
1.107 | 138.51 | 20500 | 10.1044 |
1.1128 | 140.2 | 20750 | 10.1534 |
1.0945 | 141.89 | 21000 | 10.1375 |
1.0762 | 143.58 | 21250 | 10.2039 |
1.0495 | 145.27 | 21500 | 10.2378 |
1.0105 | 146.96 | 21750 | 10.2377 |
0.9875 | 148.65 | 22000 | 10.3128 |
1.0177 | 150.34 | 22250 | 10.3502 |
0.9956 | 152.03 | 22500 | 10.3521 |
0.9628 | 153.72 | 22750 | 10.3924 |
0.9403 | 155.41 | 23000 | 10.4238 |
0.9214 | 157.09 | 23250 | 10.4514 |
0.9078 | 158.78 | 23500 | 10.4660 |
0.9114 | 160.47 | 23750 | 10.4969 |
0.8935 | 162.16 | 24000 | 10.5225 |
0.8711 | 163.85 | 24250 | 10.5399 |
0.8549 | 165.54 | 24500 | 10.5598 |
0.8564 | 167.23 | 24750 | 10.5896 |
0.8441 | 168.92 | 25000 | 10.5900 |
0.8357 | 170.61 | 25250 | 10.6085 |
0.8229 | 172.3 | 25500 | 10.6296 |
0.8094 | 173.99 | 25750 | 10.6426 |
0.7992 | 175.68 | 26000 | 10.6518 |
0.8038 | 177.36 | 26250 | 10.6652 |
0.7948 | 179.05 | 26500 | 10.6742 |
0.7852 | 180.74 | 26750 | 10.6780 |
0.7775 | 182.43 | 27000 | 10.6912 |
0.7737 | 184.12 | 27250 | 10.6925 |
0.769 | 185.81 | 27500 | 10.6993 |
0.7679 | 187.5 | 27750 | 10.7020 |
0.7636 | 189.19 | 28000 | 10.7062 |
0.76 | 190.88 | 28250 | 10.7085 |
0.7595 | 192.57 | 28500 | 10.7107 |
0.7583 | 194.26 | 28750 | 10.7120 |
0.7552 | 195.95 | 29000 | 10.7126 |
0.7568 | 197.63 | 29250 | 10.7120 |
0.7556 | 199.32 | 29500 | 10.7123 |
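As a rough consistency check on the logged schedule, the table's step and epoch columns imply about 148 optimizer steps per epoch, which at an effective batch of 1024 would suggest a training set on the order of 150,000 sequences. This is an inference from the logs, not a figure stated in the card.

```python
# Derived from the final table row: ~29,500 steps over ~199.32 epochs.
steps_per_epoch = 29500 / 199.32          # ~148 optimizer steps per epoch
approx_dataset_size = steps_per_epoch * 1024  # effective batch of 1024 (assumed)
```

The steadily rising validation loss alongside the falling training loss in the table is consistent with overfitting late in the 200-epoch run.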
### Framework versions
- Transformers 4.24.0
- Pytorch 1.13.0
- Datasets 2.7.0
- Tokenizers 0.13.0.dev0