metadata
license: apache-2.0
base_model: google/mt5-small
tags:
  - generated_from_trainer
model-index:
  - name: mt5-small-finetuned-b8-10-local
    results: []

mt5-small-finetuned-b8-10-local

This model is a fine-tuned version of google/mt5-small; the training dataset is not specified. At the end of training (epoch 9.99, step 13560) it achieves the following results on the evaluation set:

  • Loss: 3.5178
  • Rouge-1: 22.6249
  • Rouge-2: 7.9752
  • Rouge-l: 20.3546
  • Gen Len: 17.0843
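
For readers unfamiliar with the metrics above, the following is a minimal sketch of how an n-gram ROUGE F1 score can be computed. It assumes lowercase whitespace tokenization and a 0-100 scale, which is how the scores above appear to be reported. It is not the scoring code used for this model (that is not stated in the card), and it does not cover Rouge-l, which is based on the longest common subsequence rather than n-gram overlap. The function name `rouge_n_f1` is hypothetical.

```python
from collections import Counter


def rouge_n_f1(reference: str, candidate: str, n: int = 1) -> float:
    """Simplified ROUGE-N F1 on whitespace tokens, scaled to 0-100.

    Illustration only: real ROUGE implementations apply their own
    tokenization and optional stemming, so their scores will differ.
    """
    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    ref, cand = ngrams(reference), ngrams(candidate)
    # Clipped overlap: each n-gram counts at most as often as it occurs in both texts.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 100 * 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(rouge_n_f1("a b c d", "a b x y", n=1))
    print(rouge_n_f1("a b c d", "a b x y", n=2))
```

Under this definition, Rouge-2 is stricter than Rouge-1 because consecutive word pairs must match, which is consistent with the Rouge-2 value above being well below the Rouge-1 value.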

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 16
  • optimizer: Adafactor
  • lr_scheduler_type: linear
  • num_epochs: 10
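
The relationship between these values can be checked with a little arithmetic. The sketch below assumes a single training device and a linear schedule with no warmup steps; neither is stated in the card. The step counts (13560 total, roughly 1356 per epoch) are taken from the training results reported in this card, and the example count per epoch is only a rough estimate derived from them.

```python
# Values copied from the hyperparameter list.
train_batch_size = 1
gradient_accumulation_steps = 16
learning_rate = 2e-05

# Values copied from the training results in this card.
total_steps = 13560
steps_per_epoch = 1356

# Assumption: one device (not stated in the card).
num_devices = 1

# Gradients from 16 single-example batches are accumulated before each
# optimizer step, which gives the reported total_train_batch_size of 16.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Rough estimate of training examples seen per epoch.
approx_examples_per_epoch = steps_per_epoch * total_train_batch_size


def linear_lr(step: int) -> float:
    """Learning rate under linear decay to zero, assuming no warmup."""
    remaining = max(0, total_steps - step)
    return learning_rate * remaining / total_steps


if __name__ == "__main__":
    print(total_train_batch_size)
    print(approx_examples_per_epoch)
    for step in (0, total_steps // 2, total_steps):
        print(step, linear_lr(step))
```

With a per-device batch size of 1, gradient accumulation is what makes each optimizer update reflect 16 examples, a common choice when memory is limited.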

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge-1 | Rouge-2 | Rouge-l | Gen Len |
|---|---|---|---|---|---|---|---|
| 5.111 | 1.0 | 1356 | 4.0081 | 18.105 | 6.0904 | 16.8637 | 12.3837 |
| 4.5548 | 2.0 | 2712 | 3.8803 | 19.8119 | 6.9894 | 18.2397 | 14.1547 |
| 4.3667 | 3.0 | 4069 | 3.7301 | 21.0826 | 7.3597 | 19.1813 | 16.1043 |
| 4.1807 | 4.0 | 5426 | 3.6560 | 21.6212 | 7.5877 | 19.5831 | 16.7592 |
| 4.1524 | 5.0 | 6783 | 3.5911 | 22.0587 | 7.6506 | 19.9304 | 16.7987 |
| 4.1061 | 6.0 | 8140 | 3.5603 | 22.1661 | 7.7772 | 19.9915 | 16.8959 |
| 4.0028 | 7.0 | 9497 | 3.5431 | 22.6005 | 7.9698 | 20.3279 | 16.9174 |
| 3.9558 | 8.0 | 10854 | 3.5305 | 22.6074 | 7.9613 | 20.3267 | 17.0914 |
| 3.9647 | 9.0 | 12211 | 3.5207 | 22.5858 | 7.947 | 20.2764 | 17.0981 |
| 4.0044 | 9.99 | 13560 | 3.5178 | 22.6249 | 7.9752 | 20.3546 | 17.0843 |
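
The trend in these results can be summarized programmatically. The following sketch copies the epoch, validation loss, and Rouge-1 values from the results above and reports the best epoch by each metric along with the per-epoch Rouge-1 change. The variable names are illustrative, and the analysis is a reading aid rather than part of the original training run.

```python
# (epoch, validation_loss, rouge_1), copied from the training results.
rows = [
    (1.0, 4.0081, 18.105),
    (2.0, 3.8803, 19.8119),
    (3.0, 3.7301, 21.0826),
    (4.0, 3.6560, 21.6212),
    (5.0, 3.5911, 22.0587),
    (6.0, 3.5603, 22.1661),
    (7.0, 3.5431, 22.6005),
    (8.0, 3.5305, 22.6074),
    (9.0, 3.5207, 22.5858),
    (9.99, 3.5178, 22.6249),
]

# Best epoch by lowest validation loss and by highest Rouge-1.
best_by_loss = min(rows, key=lambda r: r[1])
best_by_rouge1 = max(rows, key=lambda r: r[2])

# Rouge-1 change from each epoch to the next, to show where gains flatten.
gains = [(b[0], round(b[2] - a[2], 4)) for a, b in zip(rows, rows[1:])]

if __name__ == "__main__":
    print("best by validation loss:", best_by_loss)
    print("best by Rouge-1:", best_by_rouge1)
    for epoch, gain in gains:
        print(f"epoch {epoch}: Rouge-1 change {gain:+.4f}")
```

Both criteria select the final checkpoint, although the Rouge-1 changes after epoch 7 are all below 0.05 points, so most of the improvement occurs in the first seven epochs.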

Framework versions

  • Transformers 4.34.0
  • Pytorch 1.13.1+cu116
  • Datasets 2.14.5
  • Tokenizers 0.14.1