
ViT-Bert_Mimic

This model is a fine-tuned version of an unspecified base checkpoint on an unknown dataset. It achieves the following results on the evaluation set (a usage sketch follows the metric list):

  • Loss: 0.1305
  • Rouge1: 34.725
  • Rouge2: 21.4916
  • Rougel: 33.3614
  • Rougelsum: 34.1142
  • Gen Len: 20.706
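Based on the name, this is presumably a VisionEncoderDecoderModel pairing a ViT image encoder with a BERT text decoder, fine-tuned for image-to-text generation (MIMIC suggests chest X-ray report generation). Below is a minimal inference sketch under that assumption; the repo id and image path are placeholders, not values from this card:

```python
import torch
from PIL import Image
from transformers import VisionEncoderDecoderModel, ViTImageProcessor, AutoTokenizer

model_id = "ViT-Bert_Mimic"  # placeholder; substitute the actual repo id or local path

model = VisionEncoderDecoderModel.from_pretrained(model_id)
processor = ViTImageProcessor.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

image = Image.open("example.png").convert("RGB")  # placeholder image
pixel_values = processor(images=image, return_tensors="pt").pixel_values

with torch.no_grad():
    # Gen Len above averages ~21 tokens, so a modest max_length suffices
    output_ids = model.generate(pixel_values=pixel_values, max_length=64, num_beams=4)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```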

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
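For reference, these values map onto transformers' Seq2SeqTrainingArguments roughly as follows. This is a sketch, not the recovered training script: output_dir, the evaluation cadence, and predict_with_generate are assumptions, while the Adam settings are the Trainer defaults that the card lists explicitly.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="vit-bert-mimic",      # assumed; not stated on the card
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,                   # Trainer defaults, as listed above
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=20,
    evaluation_strategy="epoch",      # assumed from the per-epoch results below
    predict_with_generate=True,       # assumed; needed to compute ROUGE at eval time
)
```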

Training results

Training Loss  Epoch  Step    Validation Loss  Rouge1   Rouge2   Rougel   Rougelsum  Gen Len
0.0684         1.0    7500    0.0752           34.4312  25.586   34.2067  34.2816    14.065
0.0626         2.0    15000   0.0694           38.0498  26.9882  37.2064  37.6682    19.492
0.0599         3.0    22500   0.0676           37.9403  26.7796  37.0514  37.571     21.805
0.054          4.0    30000   0.0661           38.1215  26.8065  37.3608  37.7763    18.883
0.0484         5.0    37500   0.0658           39.0689  27.489   38.0601  38.8175    20.556
0.043          6.0    45000   0.0679           38.5537  26.6503  37.4722  38.1314    20.994
0.0378         7.0    52500   0.0701           37.8821  26.1994  36.7872  37.4123    19.978
0.0324         8.0    60000   0.0741           38.5791  26.2187  37.3411  38.0767    21.761
0.0269         9.0    67500   0.0787           36.2698  24.3513  35.1553  35.7864    20.512
0.0199         10.0   75000   0.0848           34.8266  22.0111  33.591   34.3348    19.67
0.0158         11.0   82500   0.0921           34.5083  21.5876  33.273   34.0396    20.663
0.0114         12.0   90000   0.0990           33.6601  20.3509  32.3799  33.1785    21.574
0.0078         13.0   97500   0.1057           33.5222  20.262   32.3084  33.0449    20.7
0.0057         14.0   105000  0.1122           32.9482  19.0875  31.6809  32.4176    21.562
0.0037         15.0   112500  0.1172           33.2572  19.0712  31.8675  32.7193    21.432
0.0027         16.0   120000  0.1215           34.0583  20.5815  32.5961  33.4699    21.379
0.0019         17.0   127500  0.1257           34.3046  21.1929  33.0026  33.6992    20.687
0.0013         18.0   135000  0.1280           34.9621  21.8578  33.6017  34.3908    21.249
0.001          19.0   142500  0.1298           35.1328  21.8242  33.7634  34.5288    20.567
0.0007         20.0   150000  0.1305           34.725   21.4916  33.3614  34.1142    20.706
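The ROUGE columns are reported on a 0–100 scale. A minimal sketch of computing comparable scores with the Hugging Face evaluate library follows; using evaluate is an assumption, since the card does not include the actual evaluation script:

```python
import evaluate

rouge = evaluate.load("rouge")

# Toy pair; replace with generated reports and references from the eval set.
predictions = ["no acute cardiopulmonary abnormality is seen"]
references = ["no acute cardiopulmonary abnormality"]

scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
# evaluate returns F-measures in [0, 1]; the table scales them to [0, 100].
print({k: round(v * 100, 4) for k, v in scores.items()})
```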

Framework versions

  • Transformers 4.37.1
  • Pytorch 1.13.1+cu117
  • Datasets 2.15.0
  • Tokenizers 0.15.1

Model size: 225M params (Safetensors, F32)