
t5_small_en-pt

This model is a fine-tuned version of t5-small for English-to-Portuguese translation, trained on the opus_books dataset. It achieves the following results on the evaluation set:

  • Loss: 1.5323
  • Bleu: 5.9538
  • Gen Len: 18.1281

Model description

No detailed description has been provided; per the model name and base checkpoint, this is t5-small fine-tuned for English-to-Portuguese (en-pt) translation.

Intended uses & limitations

Given the base model and training data, the model is presumably intended for English-to-Portuguese translation of short, literary-style sentences; its limitations have not been documented.
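A minimal inference sketch (an assumption, following the standard T5 translation recipe; the "translate English to Portuguese:" task prefix mirrors the Hugging Face translation examples and may differ from the prefix actually used in training):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "rdsmaia/t5_small_en-pt"  # repo id taken from this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# T5 is a text-to-text model, so the task is signalled with a prefix
# (assumed prefix; adjust if the training setup used a different one).
text = "translate English to Portuguese: The book is on the table."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```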

Training and evaluation data

No further details have been provided; the card indicates training and evaluation on the opus_books dataset, presumably its en-pt (English-Portuguese) configuration. The train/validation split is not documented.

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0002
  • train_batch_size: 48
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
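
A sketch of how the listed hyperparameters map onto Seq2SeqTrainingArguments (assumed setup; output_dir and the evaluation cadence were not reported, and the batch size is assumed to be per device):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5_small_en-pt",       # assumed name
    learning_rate=2e-4,
    per_device_train_batch_size=48,
    per_device_eval_batch_size=16,
    seed=42,
    num_train_epochs=100,
    lr_scheduler_type="linear",
    predict_with_generate=True,        # needed to compute BLEU during evaluation
    evaluation_strategy="epoch",       # assumed: the metrics below are logged per epoch
)
# The listed Adam settings (betas=(0.9, 0.999), epsilon=1e-08) are the
# Trainer defaults, so no explicit optimizer configuration is required.
```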

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu   | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:-------:|
| No log        | 1.0   | 24   | 2.9907          | 1.063  | 18.0961 |
| No log        | 2.0   | 48   | 2.7055          | 1.2952 | 18.1957 |
| No log        | 3.0   | 72   | 2.5163          | 1.2143 | 18.2527 |
| No log        | 4.0   | 96   | 2.3778          | 1.2343 | 18.2527 |
| No log        | 5.0   | 120  | 2.2646          | 1.4193 | 18.2847 |
| No log        | 6.0   | 144  | 2.1778          | 1.8966 | 18.1815 |
| No log        | 7.0   | 168  | 2.0940          | 2.0599 | 18.2598 |
| No log        | 8.0   | 192  | 2.0270          | 2.4341 | 18.2206 |
| No log        | 9.0   | 216  | 1.9653          | 2.5973 | 18.1601 |
| No log        | 10.0  | 240  | 1.9196          | 2.6454 | 18.2278 |
| No log        | 11.0  | 264  | 1.8693          | 2.8137 | 18.1993 |
| No log        | 12.0  | 288  | 1.8318          | 3.1498 | 18.1708 |
| No log        | 13.0  | 312  | 1.7931          | 3.2767 | 18.1886 |
| No log        | 14.0  | 336  | 1.7658          | 3.3551 | 18.1851 |
| No log        | 15.0  | 360  | 1.7376          | 3.515  | 18.1708 |
| No log        | 16.0  | 384  | 1.7149          | 3.7102 | 18.1851 |
| No log        | 17.0  | 408  | 1.6890          | 3.5598 | 18.1637 |
| No log        | 18.0  | 432  | 1.6707          | 3.7704 | 18.1744 |
| No log        | 19.0  | 456  | 1.6535          | 3.8118 | 18.1459 |
| No log        | 20.0  | 480  | 1.6374          | 3.9867 | 18.1922 |
| 2.1485        | 21.0  | 504  | 1.6210          | 4.1981 | 18.153  |
| 2.1485        | 22.0  | 528  | 1.6034          | 4.0626 | 18.1673 |
| 2.1485        | 23.0  | 552  | 1.5946          | 4.3269 | 18.1388 |
| 2.1485        | 24.0  | 576  | 1.5804          | 4.315  | 18.1673 |
| 2.1485        | 25.0  | 600  | 1.5721          | 4.759  | 18.1423 |
| 2.1485        | 26.0  | 624  | 1.5592          | 4.6125 | 18.1779 |
| 2.1485        | 27.0  | 648  | 1.5567          | 4.5445 | 18.1673 |
| 2.1485        | 28.0  | 672  | 1.5534          | 4.515  | 18.1352 |
| 2.1485        | 29.0  | 696  | 1.5414          | 4.4546 | 18.1815 |
| 2.1485        | 30.0  | 720  | 1.5364          | 4.6764 | 18.1886 |
| 2.1485        | 31.0  | 744  | 1.5335          | 4.8682 | 18.1601 |
| 2.1485        | 32.0  | 768  | 1.5230          | 4.9534 | 18.1388 |
| 2.1485        | 33.0  | 792  | 1.5241          | 4.8888 | 18.1139 |
| 2.1485        | 34.0  | 816  | 1.5147          | 5.0157 | 18.1459 |
| 2.1485        | 35.0  | 840  | 1.5125          | 5.1578 | 18.1388 |
| 2.1485        | 36.0  | 864  | 1.5114          | 5.0941 | 18.1459 |
| 2.1485        | 37.0  | 888  | 1.5146          | 5.194  | 18.121  |
| 2.1485        | 38.0  | 912  | 1.5081          | 5.254  | 18.1708 |
| 2.1485        | 39.0  | 936  | 1.5063          | 5.2011 | 18.1246 |
| 2.1485        | 40.0  | 960  | 1.5098          | 5.357  | 18.1139 |
| 2.1485        | 41.0  | 984  | 1.5026          | 5.318  | 18.1815 |
| 1.1831        | 42.0  | 1008 | 1.5079          | 5.4682 | 18.0996 |
| 1.1831        | 43.0  | 1032 | 1.5017          | 5.3502 | 18.1317 |
| 1.1831        | 44.0  | 1056 | 1.4985          | 5.5156 | 18.1139 |
| 1.1831        | 45.0  | 1080 | 1.4985          | 5.4698 | 18.1601 |
| 1.1831        | 46.0  | 1104 | 1.4965          | 5.2786 | 18.1246 |
| 1.1831        | 47.0  | 1128 | 1.4998          | 5.5736 | 18.1317 |
| 1.1831        | 48.0  | 1152 | 1.5045          | 5.5743 | 18.1673 |
| 1.1831        | 49.0  | 1176 | 1.4939          | 5.7078 | 18.1352 |
| 1.1831        | 50.0  | 1200 | 1.5055          | 5.5246 | 18.1566 |
| 1.1831        | 51.0  | 1224 | 1.5003          | 5.6179 | 18.153  |
| 1.1831        | 52.0  | 1248 | 1.4959          | 5.4944 | 18.1246 |
| 1.1831        | 53.0  | 1272 | 1.4996          | 5.4446 | 18.1139 |
| 1.1831        | 54.0  | 1296 | 1.5046          | 5.7323 | 18.1388 |
| 1.1831        | 55.0  | 1320 | 1.5004          | 5.6993 | 18.1352 |
| 1.1831        | 56.0  | 1344 | 1.4989          | 5.9024 | 18.1779 |
| 1.1831        | 57.0  | 1368 | 1.5073          | 5.7465 | 18.1673 |
| 1.1831        | 58.0  | 1392 | 1.5133          | 5.9312 | 18.1566 |
| 1.1831        | 59.0  | 1416 | 1.5051          | 5.7776 | 18.1673 |
| 1.1831        | 60.0  | 1440 | 1.5041          | 5.6764 | 18.1708 |
| 1.1831        | 61.0  | 1464 | 1.5158          | 5.7478 | 18.153  |
| 1.1831        | 62.0  | 1488 | 1.5069          | 5.7837 | 18.1352 |
| 0.8554        | 63.0  | 1512 | 1.5132          | 5.7428 | 18.1637 |
| 0.8554        | 64.0  | 1536 | 1.5153          | 5.9128 | 18.1673 |
| 0.8554        | 65.0  | 1560 | 1.5136          | 5.806  | 18.153  |
| 0.8554        | 66.0  | 1584 | 1.5076          | 5.8113 | 18.153  |
| 0.8554        | 67.0  | 1608 | 1.5087          | 5.8558 | 18.153  |
| 0.8554        | 68.0  | 1632 | 1.5160          | 5.783  | 18.1566 |
| 0.8554        | 69.0  | 1656 | 1.5131          | 5.8085 | 18.1708 |
| 0.8554        | 70.0  | 1680 | 1.5193          | 5.8694 | 18.1495 |
| 0.8554        | 71.0  | 1704 | 1.5165          | 5.8492 | 18.1352 |
| 0.8554        | 72.0  | 1728 | 1.5124          | 5.8414 | 18.1317 |
| 0.8554        | 73.0  | 1752 | 1.5231          | 5.9423 | 18.1281 |
| 0.8554        | 74.0  | 1776 | 1.5177          | 6.025  | 18.1352 |
| 0.8554        | 75.0  | 1800 | 1.5176          | 5.8698 | 18.1388 |
| 0.8554        | 76.0  | 1824 | 1.5201          | 5.818  | 18.121  |
| 0.8554        | 77.0  | 1848 | 1.5210          | 5.8352 | 18.1459 |
| 0.8554        | 78.0  | 1872 | 1.5199          | 5.9083 | 18.1495 |
| 0.8554        | 79.0  | 1896 | 1.5272          | 5.917  | 18.1317 |
| 0.8554        | 80.0  | 1920 | 1.5280          | 5.9053 | 18.1673 |
| 0.8554        | 81.0  | 1944 | 1.5241          | 6.0074 | 18.1566 |
| 0.8554        | 82.0  | 1968 | 1.5250          | 5.9686 | 18.1423 |
| 0.8554        | 83.0  | 1992 | 1.5237          | 6.0087 | 18.1388 |
| 0.6987        | 84.0  | 2016 | 1.5208          | 5.9024 | 18.1708 |
| 0.6987        | 85.0  | 2040 | 1.5255          | 5.8955 | 18.1708 |
| 0.6987        | 86.0  | 2064 | 1.5302          | 5.8841 | 18.1637 |
| 0.6987        | 87.0  | 2088 | 1.5306          | 5.9001 | 18.1459 |
| 0.6987        | 88.0  | 2112 | 1.5299          | 5.8831 | 18.1886 |
| 0.6987        | 89.0  | 2136 | 1.5269          | 5.8349 | 18.1886 |
| 0.6987        | 90.0  | 2160 | 1.5284          | 5.9442 | 18.1708 |
| 0.6987        | 91.0  | 2184 | 1.5301          | 5.9169 | 18.1637 |
| 0.6987        | 92.0  | 2208 | 1.5303          | 5.9544 | 18.1459 |
| 0.6987        | 93.0  | 2232 | 1.5293          | 5.8792 | 18.1566 |
| 0.6987        | 94.0  | 2256 | 1.5296          | 5.9409 | 18.1601 |
| 0.6987        | 95.0  | 2280 | 1.5294          | 5.9639 | 18.1495 |
| 0.6987        | 96.0  | 2304 | 1.5309          | 5.9787 | 18.1388 |
| 0.6987        | 97.0  | 2328 | 1.5322          | 5.9919 | 18.1246 |
| 0.6987        | 98.0  | 2352 | 1.5323          | 5.9572 | 18.1281 |
| 0.6987        | 99.0  | 2376 | 1.5324          | 5.9538 | 18.1281 |
| 0.6987        | 100.0 | 2400 | 1.5323          | 5.9538 | 18.1281 |
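
The Bleu and Gen Len columns are the corpus BLEU score and mean generated length on the evaluation set. A sketch of the kind of compute_metrics hook that produces them, based on the standard Hugging Face translation example (an assumption, not the author's published code):

```python
import numpy as np
import evaluate

metric = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    # Expects `tokenizer` to be in scope (the same one used for training).
    preds, labels = eval_preds
    if isinstance(preds, tuple):
        preds = preds[0]
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    # -100 marks ignored label positions; swap them for pad tokens before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = metric.compute(predictions=decoded_preds,
                            references=[[label] for label in decoded_labels])
    # Gen Len: average number of non-pad tokens in the generated sequences.
    gen_len = np.mean([np.count_nonzero(pred != tokenizer.pad_token_id)
                       for pred in preds])
    return {"bleu": result["score"], "gen_len": gen_len}
```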

Framework versions

  • Transformers 4.31.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.13.1
  • Tokenizers 0.13.3