---
license: apache-2.0
base_model: google/mt5-small
tags:
  - generated_from_trainer
model-index:
  - name: ntu_adl_summarization_mt5_s
    results: []
datasets:
  - xjlulu/ntu_adl_summarization
language:
  - zh
metrics:
  - rouge
pipeline_tag: summarization
---

# ntu_adl_summarization_mt5_s

This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on the xjlulu/ntu_adl_summarization dataset. It achieves the following results on the evaluation set:

- Loss: 3.6583
- Rouge-1: 21.9729
- Rouge-2: 7.6735
- Rouge-l: 19.7497
- Avg. generated length: 17.3098
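
To try the checkpoint, here is a minimal usage sketch with the `transformers` `pipeline` API (the helper name and the commented example input are illustrative, not part of this card; requires `transformers` and `sentencepiece` to be installed):

```python
MODEL_ID = "xjlulu/ntu_adl_summarization_mt5_s"

def build_summarizer(model_id: str = MODEL_ID):
    """Build a summarization pipeline for this checkpoint.

    transformers is imported lazily so this module loads even
    without the library installed; the first call downloads the model.
    """
    from transformers import pipeline
    return pipeline("summarization", model=model_id)

# Illustrative usage (the model was fine-tuned on Chinese text):
# summarizer = build_summarizer()
# summarizer("<a Chinese article>", max_length=64)[0]["summary_text"]
```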

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 2e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
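
The total train batch size above is not an independent setting; it follows from the per-device batch size and gradient accumulation. A quick sanity check:

```python
# Effective optimization batch size implied by the settings above.
train_batch_size = 4             # per-device micro-batch size
gradient_accumulation_steps = 4  # micro-batches accumulated per optimizer step
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)    # 16, matching total_train_batch_size above
```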

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge-1 | Rouge-2 | Rouge-l | Avg. Gen. Length |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:----------------:|
| 5.4447        | 1.0   | 1357  | 4.1235          | 17.7916 | 5.9785  | 16.5599 | 12.7161          |
| 4.7463        | 2.0   | 2714  | 3.9569          | 19.6608 | 6.7631  | 18.0768 | 14.8245          |
| 4.5203        | 3.0   | 4071  | 3.8545          | 20.5626 | 7.0737  | 18.7628 | 16.3307          |
| 4.4285        | 4.0   | 5428  | 3.7825          | 21.0690 | 7.2030  | 19.0863 | 16.7841          |
| 4.3196        | 5.0   | 6785  | 3.7269          | 21.2881 | 7.3307  | 19.2588 | 16.9276          |
| 4.2662        | 6.0   | 8142  | 3.7027          | 21.5793 | 7.5122  | 19.4806 | 17.0333          |
| 4.2057        | 7.0   | 9499  | 3.6764          | 21.7949 | 7.5987  | 19.6082 | 17.1811          |
| 4.1646        | 8.0   | 10856 | 3.6671          | 21.8164 | 7.5705  | 19.6207 | 17.2550          |
| 4.1399        | 9.0   | 12213 | 3.6602          | 21.9381 | 7.6577  | 19.7089 | 17.3014          |
| 4.1479        | 10.0  | 13570 | 3.6583          | 21.9729 | 7.6735  | 19.7497 | 17.3098          |
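
For orientation, the Rouge-l column is based on the longest common subsequence (LCS) between candidate and reference tokens. Below is a small illustrative implementation; it is not the exact scorer used for the numbers above, which may differ in tokenization and preprocessing:

```python
def lcs_len(a, b):
    """Length of the longest common subsequence, via dynamic programming."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l_f1(candidate, reference):
    """ROUGE-L F1 over token lists (beta = 1 for simplicity)."""
    lcs = lcs_len(candidate, reference)
    if lcs == 0:
        return 0.0
    precision = lcs / len(candidate)
    recall = lcs / len(reference)
    return 2 * precision * recall / (precision + recall)
```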

### Framework versions

- Transformers 4.34.1
- Pytorch 2.1.0+cu118
- Datasets 2.14.5
- Tokenizers 0.14.1