
t5_small_en-pt

This model is a fine-tuned version of t5-small for English-to-Portuguese translation, trained on the opus_books dataset. It achieves the following results on the evaluation set:

  • Loss: 1.5323
  • Bleu: 5.9538
  • Gen Len: 18.1281

Model description

No detailed description has been provided; per the model name and base checkpoint, this is t5-small fine-tuned for English-to-Portuguese (en-pt) translation.

Intended uses & limitations

Given the base model and training data, the model is presumably intended for English-to-Portuguese translation of short, literary-style sentences; its limitations have not been documented.
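A minimal inference sketch (an assumption, following the standard T5 translation recipe; the "translate English to Portuguese:" task prefix mirrors the Hugging Face translation examples and may differ from the prefix actually used in training):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "rdsmaia/t5_small_en-pt"  # repo id taken from this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# T5 is a text-to-text model, so the task is signalled with a prefix
# (assumed prefix; adjust if the training setup used a different one).
text = "translate English to Portuguese: The book is on the table."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```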

Training and evaluation data

No further details have been provided; the card indicates training and evaluation on the opus_books dataset, presumably its en-pt (English-Portuguese) configuration. The train/validation split is not documented.

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0002
  • train_batch_size: 48
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
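
A sketch of how the listed hyperparameters map onto Seq2SeqTrainingArguments (assumed setup; output_dir and the evaluation cadence were not reported, and the batch size is assumed to be per device):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5_small_en-pt",       # assumed name
    learning_rate=2e-4,
    per_device_train_batch_size=48,
    per_device_eval_batch_size=16,
    seed=42,
    num_train_epochs=100,
    lr_scheduler_type="linear",
    predict_with_generate=True,        # needed to compute BLEU during evaluation
    evaluation_strategy="epoch",       # assumed: the metrics below are logged per epoch
)
# The listed Adam settings (betas=(0.9, 0.999), epsilon=1e-08) are the
# Trainer defaults, so no explicit optimizer configuration is required.
```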

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu   | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:-------:|
| No log        | 1.0   | 24   | 2.9907          | 1.063  | 18.0961 |
| No log        | 2.0   | 48   | 2.7055          | 1.2952 | 18.1957 |
| No log        | 3.0   | 72   | 2.5163          | 1.2143 | 18.2527 |
| No log        | 4.0   | 96   | 2.3778          | 1.2343 | 18.2527 |
| No log        | 5.0   | 120  | 2.2646          | 1.4193 | 18.2847 |
| No log        | 6.0   | 144  | 2.1778          | 1.8966 | 18.1815 |
| No log        | 7.0   | 168  | 2.0940          | 2.0599 | 18.2598 |
| No log        | 8.0   | 192  | 2.0270          | 2.4341 | 18.2206 |
| No log        | 9.0   | 216  | 1.9653          | 2.5973 | 18.1601 |
| No log        | 10.0  | 240  | 1.9196          | 2.6454 | 18.2278 |
| No log        | 11.0  | 264  | 1.8693          | 2.8137 | 18.1993 |
| No log        | 12.0  | 288  | 1.8318          | 3.1498 | 18.1708 |
| No log        | 13.0  | 312  | 1.7931          | 3.2767 | 18.1886 |
| No log        | 14.0  | 336  | 1.7658          | 3.3551 | 18.1851 |
| No log        | 15.0  | 360  | 1.7376          | 3.515  | 18.1708 |
| No log        | 16.0  | 384  | 1.7149          | 3.7102 | 18.1851 |
| No log        | 17.0  | 408  | 1.6890          | 3.5598 | 18.1637 |
| No log        | 18.0  | 432  | 1.6707          | 3.7704 | 18.1744 |
| No log        | 19.0  | 456  | 1.6535          | 3.8118 | 18.1459 |
| No log        | 20.0  | 480  | 1.6374          | 3.9867 | 18.1922 |
| 2.1485        | 21.0  | 504  | 1.6210          | 4.1981 | 18.153  |
| 2.1485        | 22.0  | 528  | 1.6034          | 4.0626 | 18.1673 |
| 2.1485        | 23.0  | 552  | 1.5946          | 4.3269 | 18.1388 |
| 2.1485        | 24.0  | 576  | 1.5804          | 4.315  | 18.1673 |
| 2.1485        | 25.0  | 600  | 1.5721          | 4.759  | 18.1423 |
| 2.1485        | 26.0  | 624  | 1.5592          | 4.6125 | 18.1779 |
| 2.1485        | 27.0  | 648  | 1.5567          | 4.5445 | 18.1673 |
| 2.1485        | 28.0  | 672  | 1.5534          | 4.515  | 18.1352 |
| 2.1485        | 29.0  | 696  | 1.5414          | 4.4546 | 18.1815 |
| 2.1485        | 30.0  | 720  | 1.5364          | 4.6764 | 18.1886 |
| 2.1485        | 31.0  | 744  | 1.5335          | 4.8682 | 18.1601 |
| 2.1485        | 32.0  | 768  | 1.5230          | 4.9534 | 18.1388 |
| 2.1485        | 33.0  | 792  | 1.5241          | 4.8888 | 18.1139 |
| 2.1485        | 34.0  | 816  | 1.5147          | 5.0157 | 18.1459 |
| 2.1485        | 35.0  | 840  | 1.5125          | 5.1578 | 18.1388 |
| 2.1485        | 36.0  | 864  | 1.5114          | 5.0941 | 18.1459 |
| 2.1485        | 37.0  | 888  | 1.5146          | 5.194  | 18.121  |
| 2.1485        | 38.0  | 912  | 1.5081          | 5.254  | 18.1708 |
| 2.1485        | 39.0  | 936  | 1.5063          | 5.2011 | 18.1246 |
| 2.1485        | 40.0  | 960  | 1.5098          | 5.357  | 18.1139 |
| 2.1485        | 41.0  | 984  | 1.5026          | 5.318  | 18.1815 |
| 1.1831        | 42.0  | 1008 | 1.5079          | 5.4682 | 18.0996 |
| 1.1831        | 43.0  | 1032 | 1.5017          | 5.3502 | 18.1317 |
| 1.1831        | 44.0  | 1056 | 1.4985          | 5.5156 | 18.1139 |
| 1.1831        | 45.0  | 1080 | 1.4985          | 5.4698 | 18.1601 |
| 1.1831        | 46.0  | 1104 | 1.4965          | 5.2786 | 18.1246 |
| 1.1831        | 47.0  | 1128 | 1.4998          | 5.5736 | 18.1317 |
| 1.1831        | 48.0  | 1152 | 1.5045          | 5.5743 | 18.1673 |
| 1.1831        | 49.0  | 1176 | 1.4939          | 5.7078 | 18.1352 |
| 1.1831        | 50.0  | 1200 | 1.5055          | 5.5246 | 18.1566 |
| 1.1831        | 51.0  | 1224 | 1.5003          | 5.6179 | 18.153  |
| 1.1831        | 52.0  | 1248 | 1.4959          | 5.4944 | 18.1246 |
| 1.1831        | 53.0  | 1272 | 1.4996          | 5.4446 | 18.1139 |
| 1.1831        | 54.0  | 1296 | 1.5046          | 5.7323 | 18.1388 |
| 1.1831        | 55.0  | 1320 | 1.5004          | 5.6993 | 18.1352 |
| 1.1831        | 56.0  | 1344 | 1.4989          | 5.9024 | 18.1779 |
| 1.1831        | 57.0  | 1368 | 1.5073          | 5.7465 | 18.1673 |
| 1.1831        | 58.0  | 1392 | 1.5133          | 5.9312 | 18.1566 |
| 1.1831        | 59.0  | 1416 | 1.5051          | 5.7776 | 18.1673 |
| 1.1831        | 60.0  | 1440 | 1.5041          | 5.6764 | 18.1708 |
| 1.1831        | 61.0  | 1464 | 1.5158          | 5.7478 | 18.153  |
| 1.1831        | 62.0  | 1488 | 1.5069          | 5.7837 | 18.1352 |
| 0.8554        | 63.0  | 1512 | 1.5132          | 5.7428 | 18.1637 |
| 0.8554        | 64.0  | 1536 | 1.5153          | 5.9128 | 18.1673 |
| 0.8554        | 65.0  | 1560 | 1.5136          | 5.806  | 18.153  |
| 0.8554        | 66.0  | 1584 | 1.5076          | 5.8113 | 18.153  |
| 0.8554        | 67.0  | 1608 | 1.5087          | 5.8558 | 18.153  |
| 0.8554        | 68.0  | 1632 | 1.5160          | 5.783  | 18.1566 |
| 0.8554        | 69.0  | 1656 | 1.5131          | 5.8085 | 18.1708 |
| 0.8554        | 70.0  | 1680 | 1.5193          | 5.8694 | 18.1495 |
| 0.8554        | 71.0  | 1704 | 1.5165          | 5.8492 | 18.1352 |
| 0.8554        | 72.0  | 1728 | 1.5124          | 5.8414 | 18.1317 |
| 0.8554        | 73.0  | 1752 | 1.5231          | 5.9423 | 18.1281 |
| 0.8554        | 74.0  | 1776 | 1.5177          | 6.025  | 18.1352 |
| 0.8554        | 75.0  | 1800 | 1.5176          | 5.8698 | 18.1388 |
| 0.8554        | 76.0  | 1824 | 1.5201          | 5.818  | 18.121  |
| 0.8554        | 77.0  | 1848 | 1.5210          | 5.8352 | 18.1459 |
| 0.8554        | 78.0  | 1872 | 1.5199          | 5.9083 | 18.1495 |
| 0.8554        | 79.0  | 1896 | 1.5272          | 5.917  | 18.1317 |
| 0.8554        | 80.0  | 1920 | 1.5280          | 5.9053 | 18.1673 |
| 0.8554        | 81.0  | 1944 | 1.5241          | 6.0074 | 18.1566 |
| 0.8554        | 82.0  | 1968 | 1.5250          | 5.9686 | 18.1423 |
| 0.8554        | 83.0  | 1992 | 1.5237          | 6.0087 | 18.1388 |
| 0.6987        | 84.0  | 2016 | 1.5208          | 5.9024 | 18.1708 |
| 0.6987        | 85.0  | 2040 | 1.5255          | 5.8955 | 18.1708 |
| 0.6987        | 86.0  | 2064 | 1.5302          | 5.8841 | 18.1637 |
| 0.6987        | 87.0  | 2088 | 1.5306          | 5.9001 | 18.1459 |
| 0.6987        | 88.0  | 2112 | 1.5299          | 5.8831 | 18.1886 |
| 0.6987        | 89.0  | 2136 | 1.5269          | 5.8349 | 18.1886 |
| 0.6987        | 90.0  | 2160 | 1.5284          | 5.9442 | 18.1708 |
| 0.6987        | 91.0  | 2184 | 1.5301          | 5.9169 | 18.1637 |
| 0.6987        | 92.0  | 2208 | 1.5303          | 5.9544 | 18.1459 |
| 0.6987        | 93.0  | 2232 | 1.5293          | 5.8792 | 18.1566 |
| 0.6987        | 94.0  | 2256 | 1.5296          | 5.9409 | 18.1601 |
| 0.6987        | 95.0  | 2280 | 1.5294          | 5.9639 | 18.1495 |
| 0.6987        | 96.0  | 2304 | 1.5309          | 5.9787 | 18.1388 |
| 0.6987        | 97.0  | 2328 | 1.5322          | 5.9919 | 18.1246 |
| 0.6987        | 98.0  | 2352 | 1.5323          | 5.9572 | 18.1281 |
| 0.6987        | 99.0  | 2376 | 1.5324          | 5.9538 | 18.1281 |
| 0.6987        | 100.0 | 2400 | 1.5323          | 5.9538 | 18.1281 |
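
The Bleu and Gen Len columns are the corpus BLEU score and mean generated length on the evaluation set. A sketch of the kind of compute_metrics hook that produces them, based on the standard Hugging Face translation example (an assumption, not the author's published code):

```python
import numpy as np
import evaluate

metric = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    # Expects `tokenizer` to be in scope (the same one used for training).
    preds, labels = eval_preds
    if isinstance(preds, tuple):
        preds = preds[0]
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    # -100 marks ignored label positions; swap them for pad tokens before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = metric.compute(predictions=decoded_preds,
                            references=[[label] for label in decoded_labels])
    # Gen Len: average number of non-pad tokens in the generated sequences.
    gen_len = np.mean([np.count_nonzero(pred != tokenizer.pad_token_id)
                       for pred in preds])
    return {"bleu": result["score"], "gen_len": gen_len}
```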

Framework versions

  • Transformers 4.31.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.13.1
  • Tokenizers 0.13.3