# PromoGen_K562_GPT2_4096_tokens_2080Ti_x4_more_DE
This model is a fine-tuned GPT-2-style model (the base checkpoint is not specified in this card) trained on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 10.7123
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 128
- eval_batch_size: 128
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 1024
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 1000
- num_epochs: 200
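The schedule above can be sketched in plain Python. This is a minimal illustration, assuming the "cosine" scheduler behaves like `transformers`' `get_cosine_schedule_with_warmup` (linear warmup, then cosine decay to zero); the total step count of roughly 29,500 is taken from the results table below, not stated explicitly in the card.

```python
import math

def lr_at_step(step, base_lr=1e-3, warmup_steps=1000, total_steps=29500):
    """Learning rate at a given optimizer step: linear warmup to base_lr
    over warmup_steps, then cosine decay toward zero over the remainder.
    (Assumed to mirror transformers' cosine schedule with warmup.)"""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Effective optimizer batch: per-step batch x gradient accumulation,
# matching the reported total_train_batch_size of 1024.
effective_batch = 128 * 8
```

Note that 128 x 8 already accounts for the full 1024, so the per-device split across the four GPUs implied by the model name is not separately reported here.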
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:---|:---|:---|:---|
7.8346 | 1.69 | 250 | 7.3201 |
6.8483 | 3.38 | 500 | 6.6602 |
6.5149 | 5.07 | 750 | 6.3857 |
6.1941 | 6.76 | 1000 | 6.1764 |
6.0188 | 8.45 | 1250 | 6.0308 |
5.7091 | 10.14 | 1500 | 5.9934 |
5.4459 | 11.82 | 1750 | 5.9371 |
5.1559 | 13.51 | 2000 | 5.9561 |
5.053 | 15.2 | 2250 | 6.0799 |
4.8356 | 16.89 | 2500 | 5.9637 |
4.6625 | 18.58 | 2750 | 6.0990 |
4.4349 | 20.27 | 3000 | 6.4003 |
4.2347 | 21.96 | 3250 | 6.2304 |
4.0581 | 23.65 | 3500 | 6.4391 |
4.1369 | 25.34 | 3750 | 6.6182 |
3.9847 | 27.03 | 4000 | 6.8092 |
3.7499 | 28.72 | 4250 | 6.7152 |
3.6274 | 30.41 | 4500 | 6.8939 |
3.515 | 32.09 | 4750 | 7.1118 |
3.428 | 33.78 | 5000 | 6.8671 |
3.4504 | 35.47 | 5250 | 7.0974 |
3.3346 | 37.16 | 5500 | 7.4627 |
3.1574 | 38.85 | 5750 | 7.1213 |
3.0594 | 40.54 | 6000 | 7.3364 |
3.1296 | 42.23 | 6250 | 7.5662 |
3.0683 | 43.92 | 6500 | 7.3176 |
2.9588 | 45.61 | 6750 | 7.4911 |
2.8816 | 47.3 | 7000 | 7.7823 |
2.7364 | 48.99 | 7250 | 7.4892 |
2.6647 | 50.68 | 7500 | 7.7526 |
2.8314 | 52.36 | 7750 | 7.9454 |
2.7594 | 54.05 | 8000 | 8.2020 |
2.5894 | 55.74 | 8250 | 7.8243 |
2.5204 | 57.43 | 8500 | 8.0420 |
2.5072 | 59.12 | 8750 | 8.2496 |
2.4765 | 60.81 | 9000 | 7.9321 |
2.5039 | 62.5 | 9250 | 8.1675 |
2.4301 | 64.19 | 9500 | 8.4031 |
2.2991 | 65.88 | 9750 | 8.0673 |
2.2471 | 67.57 | 10000 | 8.2341 |
2.3514 | 69.26 | 10250 | 8.5114 |
2.3166 | 70.95 | 10500 | 8.1837 |
2.2225 | 72.63 | 10750 | 8.3777 |
2.17 | 74.32 | 11000 | 8.5542 |
2.0509 | 76.01 | 11250 | 8.6312 |
2.0204 | 77.7 | 11500 | 8.4680 |
2.1625 | 79.39 | 11750 | 8.7197 |
2.1102 | 81.08 | 12000 | 8.8730 |
1.9783 | 82.77 | 12250 | 8.6112 |
1.9293 | 84.46 | 12500 | 8.7798 |
1.9402 | 86.15 | 12750 | 8.9528 |
1.9095 | 87.84 | 13000 | 8.6741 |
1.9188 | 89.53 | 13250 | 8.9483 |
1.877 | 91.22 | 13500 | 9.0630 |
1.7614 | 92.9 | 13750 | 8.8313 |
1.7251 | 94.59 | 14000 | 8.9837 |
1.821 | 96.28 | 14250 | 9.1340 |
1.799 | 97.97 | 14500 | 8.9597 |
1.7024 | 99.66 | 14750 | 9.1133 |
1.6657 | 101.35 | 15000 | 9.2959 |
1.5863 | 103.04 | 15250 | 9.2909 |
1.5708 | 104.73 | 15500 | 9.1864 |
1.6538 | 106.42 | 15750 | 9.3570 |
1.6119 | 108.11 | 16000 | 9.4937 |
1.5147 | 109.8 | 16250 | 9.3262 |
1.4745 | 111.49 | 16500 | 9.4693 |
1.4869 | 113.18 | 16750 | 9.5963 |
1.4664 | 114.86 | 17000 | 9.4436 |
1.4566 | 116.55 | 17250 | 9.5559 |
1.4231 | 118.24 | 17500 | 9.6734 |
1.3421 | 119.93 | 17750 | 9.5746 |
1.3086 | 121.62 | 18000 | 9.7034 |
1.3773 | 123.31 | 18250 | 9.7930 |
1.3537 | 125.0 | 18500 | 9.6953 |
1.2834 | 126.69 | 18750 | 9.8154 |
1.2516 | 128.38 | 19000 | 9.8966 |
1.2079 | 130.07 | 19250 | 9.9296 |
1.1875 | 131.76 | 19500 | 9.9139 |
1.2227 | 133.45 | 19750 | 10.0024 |
1.1899 | 135.14 | 20000 | 10.0485 |
1.1332 | 136.82 | 20250 | 10.0329 |
1.107 | 138.51 | 20500 | 10.1044 |
1.1128 | 140.2 | 20750 | 10.1534 |
1.0945 | 141.89 | 21000 | 10.1375 |
1.0762 | 143.58 | 21250 | 10.2039 |
1.0495 | 145.27 | 21500 | 10.2378 |
1.0105 | 146.96 | 21750 | 10.2377 |
0.9875 | 148.65 | 22000 | 10.3128 |
1.0177 | 150.34 | 22250 | 10.3502 |
0.9956 | 152.03 | 22500 | 10.3521 |
0.9628 | 153.72 | 22750 | 10.3924 |
0.9403 | 155.41 | 23000 | 10.4238 |
0.9214 | 157.09 | 23250 | 10.4514 |
0.9078 | 158.78 | 23500 | 10.4660 |
0.9114 | 160.47 | 23750 | 10.4969 |
0.8935 | 162.16 | 24000 | 10.5225 |
0.8711 | 163.85 | 24250 | 10.5399 |
0.8549 | 165.54 | 24500 | 10.5598 |
0.8564 | 167.23 | 24750 | 10.5896 |
0.8441 | 168.92 | 25000 | 10.5900 |
0.8357 | 170.61 | 25250 | 10.6085 |
0.8229 | 172.3 | 25500 | 10.6296 |
0.8094 | 173.99 | 25750 | 10.6426 |
0.7992 | 175.68 | 26000 | 10.6518 |
0.8038 | 177.36 | 26250 | 10.6652 |
0.7948 | 179.05 | 26500 | 10.6742 |
0.7852 | 180.74 | 26750 | 10.6780 |
0.7775 | 182.43 | 27000 | 10.6912 |
0.7737 | 184.12 | 27250 | 10.6925 |
0.769 | 185.81 | 27500 | 10.6993 |
0.7679 | 187.5 | 27750 | 10.7020 |
0.7636 | 189.19 | 28000 | 10.7062 |
0.76 | 190.88 | 28250 | 10.7085 |
0.7595 | 192.57 | 28500 | 10.7107 |
0.7583 | 194.26 | 28750 | 10.7120 |
0.7552 | 195.95 | 29000 | 10.7126 |
0.7568 | 197.63 | 29250 | 10.7120 |
0.7556 | 199.32 | 29500 | 10.7123 |
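As a rough consistency check on the logged schedule, the table's step and epoch columns imply about 148 optimizer steps per epoch, which at an effective batch of 1024 would suggest a training set on the order of 150,000 sequences. This is an inference from the logs, not a figure stated in the card.

```python
# Derived from the final table row: ~29,500 steps over ~199.32 epochs.
steps_per_epoch = 29500 / 199.32          # ~148 optimizer steps per epoch
approx_dataset_size = steps_per_epoch * 1024  # effective batch of 1024 (assumed)
```

The steadily rising validation loss alongside the falling training loss in the table is consistent with overfitting late in the 200-epoch run.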
### Framework versions
- Transformers 4.24.0
- Pytorch 1.13.0
- Datasets 2.7.0
- Tokenizers 0.13.0.dev0