PromoGen_K562_GPT2_4096_tokens_2080Ti_x4_more_DE

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 10.7123
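
Since the model name points to a GPT-2-style causal language model (apparently for promoter sequence generation in K562 cells), a minimal loading-and-sampling sketch is given below. This is an assumption rather than an official example: the repo id, prompt handling, and sampling settings are illustrative and may need adjusting to the actual tokenizer vocabulary.

```python
# Hedged usage sketch (assumed repo id and settings, not from the original card).
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "PromoGen_K562_GPT2_4096_tokens_2080Ti_x4_more_DE"  # assumed Hub path
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# Start generation from the BOS token; assumes the tokenizer defines one
# (true for standard GPT-2 tokenizers).
input_ids = tokenizer(tokenizer.bos_token, return_tensors="pt").input_ids
output = model.generate(
    input_ids,
    max_new_tokens=64,  # generated length; arbitrary illustrative choice
    do_sample=True,     # sample rather than decode greedily
    top_k=50,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```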

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 128
  • eval_batch_size: 128
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 1024
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 200
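
For reference, these values map onto transformers.TrainingArguments roughly as sketched below. The output_dir is an assumption, and the reported total train batch size of 1024 is consistent with 128 per device times 8 gradient-accumulation steps.

```python
# Hedged sketch of the equivalent TrainingArguments (Transformers 4.24-era API);
# output_dir is an assumption, the rest mirrors the hyperparameter list above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="PromoGen_K562_GPT2_4096_tokens_2080Ti_x4_more_DE",  # assumed
    learning_rate=1e-3,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    seed=42,
    gradient_accumulation_steps=8,  # 128 * 8 = 1024 effective train batch
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=1000,
    num_train_epochs=200,
)
```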

Training results

Validation loss reaches its minimum of 5.9371 around epoch 11.8 (step 1750) and climbs steadily afterward while training loss keeps falling, so the later epochs overfit; the final evaluation loss of 10.7123 reflects the last checkpoint rather than the best one (one way to retain the best checkpoint is sketched after the table).

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 7.8346 | 1.69 | 250 | 7.3201 |
| 6.8483 | 3.38 | 500 | 6.6602 |
| 6.5149 | 5.07 | 750 | 6.3857 |
| 6.1941 | 6.76 | 1000 | 6.1764 |
| 6.0188 | 8.45 | 1250 | 6.0308 |
| 5.7091 | 10.14 | 1500 | 5.9934 |
| 5.4459 | 11.82 | 1750 | 5.9371 |
| 5.1559 | 13.51 | 2000 | 5.9561 |
| 5.053 | 15.2 | 2250 | 6.0799 |
| 4.8356 | 16.89 | 2500 | 5.9637 |
| 4.6625 | 18.58 | 2750 | 6.0990 |
| 4.4349 | 20.27 | 3000 | 6.4003 |
| 4.2347 | 21.96 | 3250 | 6.2304 |
| 4.0581 | 23.65 | 3500 | 6.4391 |
| 4.1369 | 25.34 | 3750 | 6.6182 |
| 3.9847 | 27.03 | 4000 | 6.8092 |
| 3.7499 | 28.72 | 4250 | 6.7152 |
| 3.6274 | 30.41 | 4500 | 6.8939 |
| 3.515 | 32.09 | 4750 | 7.1118 |
| 3.428 | 33.78 | 5000 | 6.8671 |
| 3.4504 | 35.47 | 5250 | 7.0974 |
| 3.3346 | 37.16 | 5500 | 7.4627 |
| 3.1574 | 38.85 | 5750 | 7.1213 |
| 3.0594 | 40.54 | 6000 | 7.3364 |
| 3.1296 | 42.23 | 6250 | 7.5662 |
| 3.0683 | 43.92 | 6500 | 7.3176 |
| 2.9588 | 45.61 | 6750 | 7.4911 |
| 2.8816 | 47.3 | 7000 | 7.7823 |
| 2.7364 | 48.99 | 7250 | 7.4892 |
| 2.6647 | 50.68 | 7500 | 7.7526 |
| 2.8314 | 52.36 | 7750 | 7.9454 |
| 2.7594 | 54.05 | 8000 | 8.2020 |
| 2.5894 | 55.74 | 8250 | 7.8243 |
| 2.5204 | 57.43 | 8500 | 8.0420 |
| 2.5072 | 59.12 | 8750 | 8.2496 |
| 2.4765 | 60.81 | 9000 | 7.9321 |
| 2.5039 | 62.5 | 9250 | 8.1675 |
| 2.4301 | 64.19 | 9500 | 8.4031 |
| 2.2991 | 65.88 | 9750 | 8.0673 |
| 2.2471 | 67.57 | 10000 | 8.2341 |
| 2.3514 | 69.26 | 10250 | 8.5114 |
| 2.3166 | 70.95 | 10500 | 8.1837 |
| 2.2225 | 72.63 | 10750 | 8.3777 |
| 2.17 | 74.32 | 11000 | 8.5542 |
| 2.0509 | 76.01 | 11250 | 8.6312 |
| 2.0204 | 77.7 | 11500 | 8.4680 |
| 2.1625 | 79.39 | 11750 | 8.7197 |
| 2.1102 | 81.08 | 12000 | 8.8730 |
| 1.9783 | 82.77 | 12250 | 8.6112 |
| 1.9293 | 84.46 | 12500 | 8.7798 |
| 1.9402 | 86.15 | 12750 | 8.9528 |
| 1.9095 | 87.84 | 13000 | 8.6741 |
| 1.9188 | 89.53 | 13250 | 8.9483 |
| 1.877 | 91.22 | 13500 | 9.0630 |
| 1.7614 | 92.9 | 13750 | 8.8313 |
| 1.7251 | 94.59 | 14000 | 8.9837 |
| 1.821 | 96.28 | 14250 | 9.1340 |
| 1.799 | 97.97 | 14500 | 8.9597 |
| 1.7024 | 99.66 | 14750 | 9.1133 |
| 1.6657 | 101.35 | 15000 | 9.2959 |
| 1.5863 | 103.04 | 15250 | 9.2909 |
| 1.5708 | 104.73 | 15500 | 9.1864 |
| 1.6538 | 106.42 | 15750 | 9.3570 |
| 1.6119 | 108.11 | 16000 | 9.4937 |
| 1.5147 | 109.8 | 16250 | 9.3262 |
| 1.4745 | 111.49 | 16500 | 9.4693 |
| 1.4869 | 113.18 | 16750 | 9.5963 |
| 1.4664 | 114.86 | 17000 | 9.4436 |
| 1.4566 | 116.55 | 17250 | 9.5559 |
| 1.4231 | 118.24 | 17500 | 9.6734 |
| 1.3421 | 119.93 | 17750 | 9.5746 |
| 1.3086 | 121.62 | 18000 | 9.7034 |
| 1.3773 | 123.31 | 18250 | 9.7930 |
| 1.3537 | 125.0 | 18500 | 9.6953 |
| 1.2834 | 126.69 | 18750 | 9.8154 |
| 1.2516 | 128.38 | 19000 | 9.8966 |
| 1.2079 | 130.07 | 19250 | 9.9296 |
| 1.1875 | 131.76 | 19500 | 9.9139 |
| 1.2227 | 133.45 | 19750 | 10.0024 |
| 1.1899 | 135.14 | 20000 | 10.0485 |
| 1.1332 | 136.82 | 20250 | 10.0329 |
| 1.107 | 138.51 | 20500 | 10.1044 |
| 1.1128 | 140.2 | 20750 | 10.1534 |
| 1.0945 | 141.89 | 21000 | 10.1375 |
| 1.0762 | 143.58 | 21250 | 10.2039 |
| 1.0495 | 145.27 | 21500 | 10.2378 |
| 1.0105 | 146.96 | 21750 | 10.2377 |
| 0.9875 | 148.65 | 22000 | 10.3128 |
| 1.0177 | 150.34 | 22250 | 10.3502 |
| 0.9956 | 152.03 | 22500 | 10.3521 |
| 0.9628 | 153.72 | 22750 | 10.3924 |
| 0.9403 | 155.41 | 23000 | 10.4238 |
| 0.9214 | 157.09 | 23250 | 10.4514 |
| 0.9078 | 158.78 | 23500 | 10.4660 |
| 0.9114 | 160.47 | 23750 | 10.4969 |
| 0.8935 | 162.16 | 24000 | 10.5225 |
| 0.8711 | 163.85 | 24250 | 10.5399 |
| 0.8549 | 165.54 | 24500 | 10.5598 |
| 0.8564 | 167.23 | 24750 | 10.5896 |
| 0.8441 | 168.92 | 25000 | 10.5900 |
| 0.8357 | 170.61 | 25250 | 10.6085 |
| 0.8229 | 172.3 | 25500 | 10.6296 |
| 0.8094 | 173.99 | 25750 | 10.6426 |
| 0.7992 | 175.68 | 26000 | 10.6518 |
| 0.8038 | 177.36 | 26250 | 10.6652 |
| 0.7948 | 179.05 | 26500 | 10.6742 |
| 0.7852 | 180.74 | 26750 | 10.6780 |
| 0.7775 | 182.43 | 27000 | 10.6912 |
| 0.7737 | 184.12 | 27250 | 10.6925 |
| 0.769 | 185.81 | 27500 | 10.6993 |
| 0.7679 | 187.5 | 27750 | 10.7020 |
| 0.7636 | 189.19 | 28000 | 10.7062 |
| 0.76 | 190.88 | 28250 | 10.7085 |
| 0.7595 | 192.57 | 28500 | 10.7107 |
| 0.7583 | 194.26 | 28750 | 10.7120 |
| 0.7552 | 195.95 | 29000 | 10.7126 |
| 0.7568 | 197.63 | 29250 | 10.7120 |
| 0.7556 | 199.32 | 29500 | 10.7123 |
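
Given the divergence above, a hedged sketch (an assumption, not from the original run) of how the Trainer could be configured to restore the lowest-eval-loss checkpoint instead of the final overfit one:

```python
# Hedged sketch: evaluate and save every 250 steps (matching the cadence in the
# table) and reload the best checkpoint by eval loss at the end of training.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="PromoGen_K562_GPT2_4096_tokens_2080Ti_x4_more_DE",  # assumed
    evaluation_strategy="steps",
    eval_steps=250,                     # matches the 250-step eval cadence above
    save_strategy="steps",              # must match evaluation_strategy
    save_steps=250,
    load_best_model_at_end=True,        # restore lowest-eval-loss checkpoint
    metric_for_best_model="eval_loss",
    greater_is_better=False,            # lower eval loss is better
)
```

With these settings the run above would have ended on the epoch-11.8 weights (eval loss 5.9371) rather than the final checkpoint.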

Framework versions

  • Transformers 4.24.0
  • Pytorch 1.13.0
  • Datasets 2.7.0
  • Tokenizers 0.13.0.dev0