# gpt2english98
This model is a fine-tuned version of an unrecorded base model on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 4.9145
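Since causal-LM cross-entropy loss is reported in nats, it converts to perplexity as exp(loss); a quick sanity check on the final evaluation loss (plain Python, no assumptions beyond the number above):

```python
import math

eval_loss = 4.9145  # final validation loss from the table below
print(f"perplexity = {math.exp(eval_loss):.1f}")  # ≈ 136.2
```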
## Model description
More information needed
## Intended uses & limitations
More information needed
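No usage guidance was recorded. As a minimal sketch, assuming this checkpoint is published under the repo id `gpt2english98` and carries a standard GPT-2 causal-LM head, text generation would look like:

```python
from transformers import pipeline

# Assumption: "gpt2english98" is the repo id of this checkpoint on the Hub.
generator = pipeline("text-generation", model="gpt2english98")
print(generator("Once upon a time", max_new_tokens=30)[0]["generated_text"])
```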
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a reproduction sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 1
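Below is a minimal sketch of how these hyperparameters map onto a 🤗 `Trainer` setup. The base checkpoint (`gpt2`) and the tiny in-memory corpus are stand-in assumptions; the card does not record the actual base model or training data.

```python
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_model = "gpt2"  # assumption: the actual base checkpoint is not recorded
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 ships without a pad token
model = AutoModelForCausalLM.from_pretrained(base_model)

# Stand-in corpus; the real training/evaluation data is unknown.
texts = ["The quick brown fox jumps over the lazy dog."] * 8
dataset = Dataset.from_dict({"text": texts}).map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)

args = TrainingArguments(
    output_dir="gpt2english98",
    learning_rate=5e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=1,
    adam_beta1=0.9,     # Adam betas/epsilon match the list above
    adam_beta2=0.999,   # (these are also the Trainer defaults)
    adam_epsilon=1e-8,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    eval_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
print(trainer.evaluate())  # reports eval_loss, comparable to the table below
```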
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
6.4337 | 0.0102 | 10000 | 6.3333 |
6.163 | 0.0204 | 20000 | 6.0858 |
6.0254 | 0.0306 | 30000 | 5.9577 |
5.9641 | 0.0408 | 40000 | 5.8612 |
5.9122 | 0.0510 | 50000 | 5.7882 |
5.7888 | 0.0612 | 60000 | 5.7222 |
5.7087 | 0.0714 | 70000 | 5.6873 |
5.7093 | 0.0816 | 80000 | 5.6355 |
5.6654 | 0.0918 | 90000 | 5.5949 |
5.6532 | 0.1020 | 100000 | 5.5734 |
5.6147 | 0.1122 | 110000 | 5.5327 |
5.6729 | 0.1224 | 120000 | 5.5168 |
5.5226 | 0.1327 | 130000 | 5.4808 |
5.4966 | 0.1429 | 140000 | 5.4611 |
5.4708 | 0.1531 | 150000 | 5.4348 |
5.4733 | 0.1633 | 160000 | 5.4144 |
5.4892 | 0.1735 | 170000 | 5.3892 |
5.5325 | 0.1837 | 180000 | 5.3728 |
5.4798 | 0.1939 | 190000 | 5.3587 |
5.4556 | 0.2041 | 200000 | 5.3404 |
5.3283 | 0.2143 | 210000 | 5.3255 |
5.3728 | 0.2245 | 220000 | 5.3096 |
5.42 | 0.2347 | 230000 | 5.3045 |
5.4695 | 0.2449 | 240000 | 5.2806 |
5.3529 | 0.2551 | 250000 | 5.2689 |
5.3567 | 0.2653 | 260000 | 5.2556 |
5.2927 | 0.2755 | 270000 | 5.2452 |
5.3838 | 0.2857 | 280000 | 5.2276 |
5.3734 | 0.2959 | 290000 | 5.2212 |
5.3353 | 0.3061 | 300000 | 5.2074 |
5.392 | 0.3163 | 310000 | 5.2042 |
5.3286 | 0.3265 | 320000 | 5.2012 |
5.3774 | 0.3367 | 330000 | 5.1871 |
5.2164 | 0.3469 | 340000 | 5.1758 |
5.3587 | 0.3571 | 350000 | 5.1685 |
5.3574 | 0.3673 | 360000 | 5.1575 |
5.2087 | 0.3776 | 370000 | 5.1482 |
5.1612 | 0.3878 | 380000 | 5.1489 |
5.225 | 0.3980 | 390000 | 5.1398 |
5.241 | 0.4082 | 400000 | 5.1222 |
5.2267 | 0.4184 | 410000 | 5.1288 |
5.1924 | 0.4286 | 420000 | 5.1110 |
5.2413 | 0.4388 | 430000 | 5.1047 |
5.2015 | 0.4490 | 440000 | 5.1049 |
5.2847 | 0.4592 | 450000 | 5.0944 |
5.1406 | 0.4694 | 460000 | 5.0888 |
5.1992 | 0.4796 | 470000 | 5.0786 |
5.0754 | 0.4898 | 480000 | 5.0810 |
5.1644 | 0.5 | 490000 | 5.0697 |
5.1464 | 0.5102 | 500000 | 5.0609 |
5.1771 | 0.5204 | 510000 | 5.0560 |
5.1896 | 0.5306 | 520000 | 5.0574 |
5.1355 | 0.5408 | 530000 | 5.0498 |
5.115 | 0.5510 | 540000 | 5.0494 |
5.1575 | 0.5612 | 550000 | 5.0357 |
5.191 | 0.5714 | 560000 | 5.0305 |
5.1694 | 0.5816 | 570000 | 5.0303 |
5.1591 | 0.5918 | 580000 | 5.0267 |
5.138 | 0.6020 | 590000 | 5.0264 |
5.0825 | 0.6122 | 600000 | 5.0195 |
5.1669 | 0.6224 | 610000 | 5.0147 |
5.0309 | 0.6327 | 620000 | 5.0156 |
5.0886 | 0.6429 | 630000 | 5.0077 |
5.1049 | 0.6531 | 640000 | 5.0021 |
5.1385 | 0.6633 | 650000 | 5.0052 |
5.1294 | 0.6735 | 660000 | 4.9955 |
5.0726 | 0.6837 | 670000 | 4.9947 |
5.1084 | 0.6939 | 680000 | 4.9912 |
5.0205 | 0.7041 | 690000 | 4.9869 |
5.111 | 0.7143 | 700000 | 4.9826 |
5.0809 | 0.7245 | 710000 | 4.9773 |
5.1221 | 0.7347 | 720000 | 4.9775 |
5.1516 | 0.7449 | 730000 | 4.9721 |
5.1347 | 0.7551 | 740000 | 4.9655 |
5.0744 | 0.7653 | 750000 | 4.9664 |
5.0715 | 0.7755 | 760000 | 4.9626 |
5.1118 | 0.7857 | 770000 | 4.9592 |
5.0933 | 0.7959 | 780000 | 4.9558 |
5.0685 | 0.8061 | 790000 | 4.9543 |
5.1237 | 0.8163 | 800000 | 4.9514 |
4.9532 | 0.8265 | 810000 | 4.9493 |
5.0854 | 0.8367 | 820000 | 4.9478 |
5.0865 | 0.8469 | 830000 | 4.9417 |
5.085 | 0.8571 | 840000 | 4.9419 |
5.0835 | 0.8673 | 850000 | 4.9385 |
5.0347 | 0.8776 | 860000 | 4.9345 |
4.9784 | 0.8878 | 870000 | 4.9332 |
5.0046 | 0.8980 | 880000 | 4.9317 |
4.9069 | 0.9082 | 890000 | 4.9296 |
5.0209 | 0.9184 | 900000 | 4.9270 |
5.1551 | 0.9286 | 910000 | 4.9234 |
5.1849 | 0.9388 | 920000 | 4.9230 |
5.07 | 0.9490 | 930000 | 4.9200 |
4.9804 | 0.9592 | 940000 | 4.9195 |
5.0419 | 0.9694 | 950000 | 4.9174 |
5.0447 | 0.9796 | 960000 | 4.9165 |
5.0839 | 0.9898 | 970000 | 4.9161 |
4.9989 | 1.0 | 980000 | 4.9145 |
### Framework versions
- Transformers 4.41.2
- Pytorch 2.3.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1