metadata

license: apache-2.0
base_model: google/mt5-small
tags:
  - generated_from_trainer
model-index:
  - name: mt5-small-finetuned-b8-10-local
    results: []

mt5-small-finetuned-b8-10-local

This model is a fine-tuned version of google/mt5-small. The fine-tuning dataset was not recorded. The model achieves the following results on the evaluation set:

Loss: 3.5178
Rouge-1: 22.6249
Rouge-2: 7.9752
Rouge-l: 20.3546
Gen Len: 17.0843
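
To make the Rouge-1 figure easier to interpret, here is a minimal sketch of a unigram-overlap F1 score. It is a simplified illustration, not the metric implementation used to produce the numbers in this card: it assumes whitespace tokenization and lowercasing, and it returns a value between 0 and 1, whereas the scores above appear to be reported on a 0 to 100 scale. The function name `rouge1_f1` is hypothetical.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap between two strings.

    Assumption: tokens are lowercased, whitespace-separated words.
    Real ROUGE implementations tokenize and normalize differently.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0

    # Clipped overlap: each token counts at most as often as it
    # appears in both the prediction and the reference.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(rouge1_f1("the cat", "the cat sat on the mat"))
```

Rouge-2 follows the same pattern over bigrams, and Rouge-l is based on the longest common subsequence instead of n-gram counts.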

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 1
eval_batch_size: 1
seed: 42
gradient_accumulation_steps: 16
total_train_batch_size: 16
optimizer: Adafactor
lr_scheduler_type: linear
num_epochs: 10
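
The batch-size and scheduler settings above interact, so a short sketch may help. It shows how train_batch_size and gradient_accumulation_steps combine into total_train_batch_size, and how a linear schedule would scale the learning rate over the run. Two assumptions are made here: training used a single device (consistent with 1 x 16 = 16), and there was no warmup, since no warmup setting is listed. The total of 13560 steps is taken from the last row of the training results table. The function names are hypothetical.

```python
# Values copied from the hyperparameter list in this card.
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 1
GRADIENT_ACCUMULATION_STEPS = 16
# Last optimizer step reported in the training results table.
TOTAL_STEPS = 13560
# Assumption: no warmup, because the card lists no warmup setting.
WARMUP_STEPS = 0


def effective_batch_size(per_device: int, accumulation: int, devices: int = 1) -> int:
    """Number of examples contributing to one optimizer update."""
    return per_device * accumulation * devices


def linear_lr(step: int,
              base_lr: float = LEARNING_RATE,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Linear warmup to base_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print(effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
    for step in (0, 6780, 13560):
        print(step, linear_lr(step))
```

By the same arithmetic, 1356 optimizer steps in the first epoch at 16 examples per step suggests roughly 21,700 training examples, though the card does not state the dataset size.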

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge-1	Rouge-2	Rouge-l	Gen Len
5.111	1.0	1356	4.0081	18.105	6.0904	16.8637	12.3837
4.5548	2.0	2712	3.8803	19.8119	6.9894	18.2397	14.1547
4.3667	3.0	4069	3.7301	21.0826	7.3597	19.1813	16.1043
4.1807	4.0	5426	3.6560	21.6212	7.5877	19.5831	16.7592
4.1524	5.0	6783	3.5911	22.0587	7.6506	19.9304	16.7987
4.1061	6.0	8140	3.5603	22.1661	7.7772	19.9915	16.8959
4.0028	7.0	9497	3.5431	22.6005	7.9698	20.3279	16.9174
3.9558	8.0	10854	3.5305	22.6074	7.9613	20.3267	17.0914
3.9647	9.0	12211	3.5207	22.5858	7.947	20.2764	17.0981
4.0044	9.99	13560	3.5178	22.6249	7.9752	20.3546	17.0843
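
As a rough aid for reading the table, the sketch below picks the best epoch per metric and flags epochs where Rouge-1 dropped relative to the previous one. The values are copied from the table; each tuple is ordered as (epoch, validation loss, Rouge-1, Rouge-2, Rouge-l, Gen Len). The helper names are hypothetical.

```python
# (epoch, validation_loss, rouge1, rouge2, rougeL, gen_len), copied from the table.
ROWS = [
    (1.0, 4.0081, 18.105, 6.0904, 16.8637, 12.3837),
    (2.0, 3.8803, 19.8119, 6.9894, 18.2397, 14.1547),
    (3.0, 3.7301, 21.0826, 7.3597, 19.1813, 16.1043),
    (4.0, 3.6560, 21.6212, 7.5877, 19.5831, 16.7592),
    (5.0, 3.5911, 22.0587, 7.6506, 19.9304, 16.7987),
    (6.0, 3.5603, 22.1661, 7.7772, 19.9915, 16.8959),
    (7.0, 3.5431, 22.6005, 7.9698, 20.3279, 16.9174),
    (8.0, 3.5305, 22.6074, 7.9613, 20.3267, 17.0914),
    (9.0, 3.5207, 22.5858, 7.947, 20.2764, 17.0981),
    (9.99, 3.5178, 22.6249, 7.9752, 20.3546, 17.0843),
]

# Column index for each metric inside a row tuple.
COLUMNS = {"validation_loss": 1, "rouge1": 2, "rouge2": 3, "rougeL": 4, "gen_len": 5}


def best_epoch(metric: str, higher_is_better: bool = True) -> float:
    """Return the epoch with the best value for the given metric."""
    col = COLUMNS[metric]
    pick = max if higher_is_better else min
    return pick(ROWS, key=lambda row: row[col])[0]


def regressions(metric: str) -> list:
    """Return epochs where the metric fell compared with the previous epoch."""
    col = COLUMNS[metric]
    return [cur[0] for prev, cur in zip(ROWS, ROWS[1:]) if cur[col] < prev[col]]


if __name__ == "__main__":
    print(best_epoch("validation_loss", higher_is_better=False))
    print(best_epoch("rouge1"))
    print(regressions("rouge1"))
```

On these numbers the final checkpoint is best on validation loss and on all three ROUGE scores, and the only Rouge-1 dip is at epoch 9.0.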

Framework versions

Transformers 4.34.0
Pytorch 1.13.1+cu116
Datasets 2.14.5
Tokenizers 0.14.1
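
To reproduce results it can help to confirm that the local environment matches the versions listed above. The following is a minimal sketch using only the standard library. The distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are assumed to be the usual PyPI package names, and the helper names are hypothetical.

```python
from importlib import metadata

# Versions listed in this card. Assumption: these keys are the PyPI
# distribution names under which the libraries are installed.
PINNED = {
    "transformers": "4.34.0",
    "torch": "1.13.1+cu116",
    "datasets": "2.14.5",
    "tokenizers": "0.14.1",
}


def installed_version(dist_name):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(pinned=PINNED, lookup=installed_version):
    """Return {name: (expected, found)} for every package that differs."""
    mismatches = {}
    for name, expected in pinned.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in version_mismatches().items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An exact string comparison is deliberately strict; a different CUDA build of the same Pytorch release would be reported as a mismatch.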