---
base_model: samzirbo/mT5.en-es.pretrained
tags:
- generated_from_trainer
metrics:
- bleu
model-index:
- name: baseline
  results: []
---

# baseline

This model is a fine-tuned version of [samzirbo/mT5.en-es.pretrained](https://huggingface.co/samzirbo/mT5.en-es.pretrained) on an unknown dataset.
It achieves the following results on the evaluation set:

- Loss: 1.1447
- Bleu: 44.0055
- Meteor: 0.6899
- Chrf++: 62.7408
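A minimal inference sketch with the 🤗 Transformers API is shown below. The repository id `samzirbo/baseline` is an assumption based on the model name above and may differ from the actual upload; the input sentence is a placeholder.

```python
# Minimal usage sketch for English -> Spanish translation.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "samzirbo/baseline"  # assumption: adjust to the actual repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("The weather is nice today.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```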

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0005
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 1000
- training_steps: 50000
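These settings map onto `Seq2SeqTrainingArguments` roughly as sketched below. This is a reconstruction from the list above, not the exact training script; the output directory and the evaluation settings are assumptions (the 2500-step evaluation interval is inferred from the results table below), and the Adam betas/epsilon listed above are the library defaults, so they are not set explicitly.

```python
# Hedged reconstruction of the training configuration (Transformers 4.38 API).
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="baseline",            # assumption
    learning_rate=5e-4,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    lr_scheduler_type="cosine",
    warmup_steps=1000,
    max_steps=50000,
    evaluation_strategy="steps",      # assumption, inferred from the results table
    eval_steps=2500,
    predict_with_generate=True,       # assumption: required for BLEU/METEOR/chrF++ eval
)
```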

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Meteor | Chrf++  |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:------:|:-------:|
| 4.4025        | 0.15  | 2500  | 2.0103          | 27.2729 | 0.5485 | 48.5595 |
| 2.4711        | 0.29  | 5000  | 1.7003          | 33.5151 | 0.6064 | 54.1742 |
| 2.2228        | 0.44  | 7500  | 1.5685          | 35.7225 | 0.6241 | 56.257  |
| 2.0833        | 0.59  | 10000 | 1.4797          | 37.676  | 0.6383 | 57.6397 |
| 1.9841        | 0.73  | 12500 | 1.4128          | 38.5011 | 0.6504 | 58.509  |
| 1.9135        | 0.88  | 15000 | 1.3693          | 39.7405 | 0.6569 | 59.5371 |
| 1.8531        | 1.03  | 17500 | 1.3192          | 40.6354 | 0.6638 | 59.992  |
| 1.784         | 1.17  | 20000 | 1.2890          | 41.5264 | 0.6716 | 60.8105 |
| 1.7506        | 1.32  | 22500 | 1.2587          | 42.0462 | 0.6737 | 61.1679 |
| 1.7214        | 1.47  | 25000 | 1.2359          | 42.1492 | 0.6755 | 61.379  |
| 1.698         | 1.61  | 27500 | 1.2125          | 42.5233 | 0.6794 | 61.5581 |
| 1.6715        | 1.76  | 30000 | 1.1970          | 42.7034 | 0.6805 | 61.7294 |
| 1.6526        | 1.91  | 32500 | 1.1849          | 43.0685 | 0.6834 | 62.0592 |
| 1.6257        | 2.05  | 35000 | 1.1699          | 43.2808 | 0.6855 | 62.1626 |
| 1.5914        | 2.2   | 37500 | 1.1627          | 43.3637 | 0.685  | 62.2303 |
| 1.5818        | 2.35  | 40000 | 1.1545          | 43.6077 | 0.6874 | 62.4906 |
| 1.5811        | 2.49  | 42500 | 1.1484          | 43.9335 | 0.6891 | 62.6396 |
| 1.5777        | 2.64  | 45000 | 1.1449          | 44.1036 | 0.6903 | 62.8018 |
| 1.575         | 2.79  | 47500 | 1.1450          | 43.9408 | 0.6894 | 62.6836 |
| 1.5766        | 2.93  | 50000 | 1.1447          | 44.0055 | 0.6899 | 62.7408 |
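The Bleu, Meteor, and Chrf++ columns above can be computed with the 🤗 `evaluate` library; a sketch follows. The predictions and references are placeholders, since the evaluation dataset is not reported on this card.

```python
# Sketch of the metric computation used for the columns above.
import evaluate

bleu = evaluate.load("sacrebleu")
meteor = evaluate.load("meteor")
chrf = evaluate.load("chrf")

predictions = ["El tiempo es agradable hoy."]  # placeholder model outputs
references = [["Hoy hace buen tiempo."]]       # placeholder references

print(bleu.compute(predictions=predictions, references=references)["score"])
print(meteor.compute(predictions=predictions,
                     references=[r[0] for r in references])["meteor"])
# chrF++ is chrF with word n-grams of order 2.
print(chrf.compute(predictions=predictions, references=references,
                   word_order=2)["score"])
```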

### Framework versions

- Transformers 4.38.0
- Pytorch 2.2.1+cu121
- Datasets 2.19.1
- Tokenizers 0.15.2