license: apache-2.0
base_model: google-t5/t5-base
tags:
  - generated_from_trainer
model-index:
  - name: t5-abs-2309-1054-lr-0.0001-bs-10-maxep-20
    results: []

t5-abs-2309-1054-lr-0.0001-bs-10-maxep-20

This model is a fine-tuned version of google-t5/t5-base. The training dataset is not specified in the card metadata. It achieves the following results on the evaluation set at the final checkpoint (step 860):

  • Loss: 1.9472
  • Rouge/rouge1: 0.4676
  • Rouge/rouge2: 0.2222
  • Rouge/rougel: 0.4004
  • Rouge/rougelsum: 0.4024
  • Bertscore/bertscore-precision: 0.8963
  • Bertscore/bertscore-recall: 0.8971
  • Bertscore/bertscore-f1: 0.8965
  • Meteor: 0.4301
  • Gen Len: 40.8455

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 10
  • eval_batch_size: 10
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 20
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
  • mixed_precision_training: Native AMP
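The interaction between the batch size, gradient accumulation, and warmup settings above can be sketched in plain Python. This is a minimal illustration, not the trainer's actual code: it assumes a single device (so the total batch size is `train_batch_size * gradient_accumulation_steps`), takes the total of 860 optimizer steps from the last row of the training results, and assumes the warmup length is the ceiling of `total_steps * warmup_ratio`. The function name `lr_at` is hypothetical.

```python
import math

# Values taken from the hyperparameter list above.
PEAK_LR = 1e-4
TRAIN_BATCH_SIZE = 10
GRAD_ACCUM_STEPS = 2
WARMUP_RATIO = 0.1

# Assumption: 860 is the final optimizer step reported in the training results.
TOTAL_STEPS = 860

# Assumption: single device, so no multiplication by a device count.
effective_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS

# Assumption: warmup length is the ceiling of ratio * total steps.
warmup_steps = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Linear warmup from 0 to PEAK_LR, then linear decay back to 0."""
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / max(1, TOTAL_STEPS - warmup_steps)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size)
    print("warmup steps:", warmup_steps)
    for step in (0, 43, 86, 473, 860):
        print(f"step {step:>3}: lr = {lr_at(step):.2e}")
```

Under these assumptions the learning rate peaks at step 86, roughly the end of the second epoch, and decays linearly to zero by step 860.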

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge/rouge1 | Rouge/rouge2 | Rouge/rougel | Rouge/rougelsum | Bertscore/bertscore-precision | Bertscore/bertscore-recall | Bertscore/bertscore-f1 | Meteor | Gen Len |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2.79 | 0.9885 | 43 | 2.0712 | 0.4222 | 0.1746 | 0.3527 | 0.3532 | 0.8934 | 0.886 | 0.8895 | 0.3645 | 37.0818 |
| 1.9564 | 2.0 | 87 | 1.7989 | 0.445 | 0.202 | 0.3739 | 0.3753 | 0.8971 | 0.8893 | 0.893 | 0.3978 | 36.1545 |
| 1.7565 | 2.9885 | 130 | 1.7408 | 0.4648 | 0.2233 | 0.3974 | 0.3992 | 0.8987 | 0.8919 | 0.8952 | 0.416 | 36.3818 |
| 1.5584 | 4.0 | 174 | 1.7210 | 0.4529 | 0.2075 | 0.3814 | 0.3835 | 0.8969 | 0.8919 | 0.8942 | 0.4083 | 37.7545 |
| 1.4732 | 4.9885 | 217 | 1.7176 | 0.4639 | 0.2163 | 0.3949 | 0.3968 | 0.8978 | 0.8942 | 0.8958 | 0.4189 | 38.0727 |
| 1.3447 | 6.0 | 261 | 1.7308 | 0.4541 | 0.208 | 0.3856 | 0.3877 | 0.8962 | 0.8934 | 0.8947 | 0.4111 | 38.8636 |
| 1.2905 | 6.9885 | 304 | 1.7456 | 0.4584 | 0.2067 | 0.3851 | 0.3868 | 0.8975 | 0.8934 | 0.8953 | 0.4114 | 37.0909 |
| 1.1838 | 8.0 | 348 | 1.7674 | 0.4636 | 0.2164 | 0.3945 | 0.3961 | 0.8951 | 0.8969 | 0.8958 | 0.4287 | 41.3545 |
| 1.1479 | 8.9885 | 391 | 1.7794 | 0.4721 | 0.2233 | 0.4014 | 0.4037 | 0.8957 | 0.8964 | 0.8959 | 0.4378 | 41.2545 |
| 1.067 | 10.0 | 435 | 1.8149 | 0.4567 | 0.2127 | 0.3968 | 0.3984 | 0.895 | 0.8935 | 0.894 | 0.4188 | 38.9182 |
| 1.0456 | 10.9885 | 478 | 1.8434 | 0.4585 | 0.208 | 0.3894 | 0.3913 | 0.8964 | 0.8936 | 0.8948 | 0.4136 | 37.8364 |
| 0.9792 | 12.0 | 522 | 1.8381 | 0.466 | 0.2163 | 0.3976 | 0.3996 | 0.8962 | 0.8962 | 0.896 | 0.4272 | 40.3455 |
| 0.9618 | 12.9885 | 565 | 1.8834 | 0.4702 | 0.2214 | 0.3996 | 0.4023 | 0.8949 | 0.8978 | 0.8962 | 0.441 | 42.6 |
| 0.9077 | 14.0 | 609 | 1.8886 | 0.4664 | 0.2221 | 0.4001 | 0.4014 | 0.8958 | 0.8969 | 0.8962 | 0.433 | 41.7364 |
| 0.9053 | 14.9885 | 652 | 1.9082 | 0.4687 | 0.2231 | 0.4016 | 0.4043 | 0.8967 | 0.898 | 0.8972 | 0.4341 | 41.7182 |
| 0.8627 | 16.0 | 696 | 1.9271 | 0.4564 | 0.2097 | 0.3858 | 0.3869 | 0.8939 | 0.8957 | 0.8946 | 0.4231 | 41.7545 |
| 0.866 | 16.9885 | 739 | 1.9276 | 0.4615 | 0.2129 | 0.3936 | 0.3955 | 0.8945 | 0.8971 | 0.8957 | 0.4273 | 42.0091 |
| 0.8359 | 18.0 | 783 | 1.9376 | 0.4644 | 0.2186 | 0.3995 | 0.4012 | 0.8947 | 0.8966 | 0.8955 | 0.4309 | 41.7091 |
| 0.8412 | 18.9885 | 826 | 1.9467 | 0.4691 | 0.2222 | 0.4018 | 0.4035 | 0.8959 | 0.898 | 0.8968 | 0.4327 | 41.6909 |
| 0.8218 | 19.7701 | 860 | 1.9472 | 0.4676 | 0.2222 | 0.4004 | 0.4024 | 0.8963 | 0.8971 | 0.8965 | 0.4301 | 40.8455 |
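The reported evaluation numbers come from the last row (step 860), which is not necessarily the best checkpoint by every metric. As a rough sketch, the snippet below copies the step, validation loss, and Rouge/rouge1 columns from the results above and finds the rows with the lowest validation loss and the highest rouge1. The variable names are illustrative, and nothing here indicates which checkpoint the author actually kept.

```python
# (step, validation_loss, rouge1) copied from the training results above.
rows = [
    (43, 2.0712, 0.4222),
    (87, 1.7989, 0.4450),
    (130, 1.7408, 0.4648),
    (174, 1.7210, 0.4529),
    (217, 1.7176, 0.4639),
    (261, 1.7308, 0.4541),
    (304, 1.7456, 0.4584),
    (348, 1.7674, 0.4636),
    (391, 1.7794, 0.4721),
    (435, 1.8149, 0.4567),
    (478, 1.8434, 0.4585),
    (522, 1.8381, 0.4660),
    (565, 1.8834, 0.4702),
    (609, 1.8886, 0.4664),
    (652, 1.9082, 0.4687),
    (696, 1.9271, 0.4564),
    (739, 1.9276, 0.4615),
    (783, 1.9376, 0.4644),
    (826, 1.9467, 0.4691),
    (860, 1.9472, 0.4676),
]

# Row with the lowest validation loss.
best_loss = min(rows, key=lambda r: r[1])
# Row with the highest Rouge/rouge1.
best_rouge1 = max(rows, key=lambda r: r[2])
# Final row, which matches the headline evaluation results.
final = rows[-1]

if __name__ == "__main__":
    print("lowest validation loss:", best_loss)
    print("highest rouge1:", best_rouge1)
    print("final checkpoint:", final)
```

Validation loss bottoms out at step 217 (1.7176) and rises afterwards while training loss keeps falling, which is consistent with overfitting in later epochs. Rouge/rouge1 peaks at step 391 (0.4721), slightly above the final value of 0.4676.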

Framework versions

  • Transformers 4.44.0
  • Pytorch 2.4.0
  • Datasets 2.21.0
  • Tokenizers 0.19.1