---
license: apache-2.0
library_name: peft
tags:
- generated_from_trainer
metrics:
- rouge
base_model: Stancld/longt5-tglobal-large-16384-pubmed-3k_steps
model-index:
- name: t5_long_27-03-2024_14-47-48
  results: []
---

# t5_long_27-03-2024_14-47-48

This model is a fine-tuned version of [Stancld/longt5-tglobal-large-16384-pubmed-3k_steps](https://huggingface.co/Stancld/longt5-tglobal-large-16384-pubmed-3k_steps) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 0.3435
- Rouge1: 18.0761
- Rouge2: 6.2848
- Rougel: 15.9805
- Rougelsum: 16.9344
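
The card provides no usage example; below is a minimal inference sketch, assuming the adapter weights in this repository load on top of the base checkpoint via `peft`. The repo id `NitzanBar/t5_long_27-03-2024_14-47-48` is inferred from the model name above and may differ.

```python
# Minimal inference sketch (assumed usage, not from the original card).
import torch
from peft import PeftModel
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

base_id = "Stancld/longt5-tglobal-large-16384-pubmed-3k_steps"
adapter_id = "NitzanBar/t5_long_27-03-2024_14-47-48"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForSeq2SeqLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the fine-tuned adapter
model.eval()

text = "..."  # a long input document; this LongT5 variant accepts up to 16384 tokens
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=16384)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=256, num_beams=2)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```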

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0005
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
- mixed_precision_training: Native AMP
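
As a sketch only (the original training script is not part of this card), these values map onto `transformers.Seq2SeqTrainingArguments` roughly as follows; the Adam betas and epsilon listed above are the library defaults, and `fp16=True` corresponds to native AMP:

```python
from transformers import Seq2SeqTrainingArguments

# Assumed reconstruction of the training configuration from the list above.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5_long_27-03-2024_14-47-48",
    learning_rate=5e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    fp16=True,  # Native AMP mixed-precision training
)
```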

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:------:|:-------:|:---------:|
| 18.5106       | 0.01  | 1    | 2.2998          | 9.0413  | 1.7951 | 7.3243  | 8.0357    |
| 8.5122        | 0.01  | 2    | 0.5275          | 0.0     | 0.0    | 0.0     | 0.0       |
| 10.8448       | 0.02  | 3    | 0.6630          | 0.0     | 0.0    | 0.0     | 0.0       |
| 9.4742        | 0.03  | 4    | 0.6472          | 0.0     | 0.0    | 0.0     | 0.0       |
| 8.8401        | 0.03  | 5    | 0.5541          | 0.0     | 0.0    | 0.0     | 0.0       |
| 5.2721        | 0.04  | 6    | 0.5268          | 0.0     | 0.0    | 0.0     | 0.0       |
| 7.4134        | 0.05  | 7    | 0.5197          | 0.0     | 0.0    | 0.0     | 0.0       |
| 8.1236        | 0.05  | 8    | 0.5084          | 0.0     | 0.0    | 0.0     | 0.0       |
| 4.9603        | 0.06  | 9    | 0.4955          | 0.0     | 0.0    | 0.0     | 0.0       |
| 7.9812        | 0.07  | 10   | 0.4838          | 0.0     | 0.0    | 0.0     | 0.0       |
| 10.1557       | 0.07  | 11   | 0.4729          | 0.0     | 0.0    | 0.0     | 0.0       |
| 9.943         | 0.08  | 12   | 0.4623          | 0.0     | 0.0    | 0.0     | 0.0       |
| 8.4195        | 0.09  | 13   | 0.4515          | 0.0     | 0.0    | 0.0     | 0.0       |
| 9.0108        | 0.09  | 14   | 0.4419          | 0.0     | 0.0    | 0.0     | 0.0       |
| 7.8627        | 0.1   | 15   | 0.4339          | 0.0     | 0.0    | 0.0     | 0.0       |
| 8.1388        | 0.11  | 16   | 0.4271          | 0.0     | 0.0    | 0.0     | 0.0       |
| 7.8132        | 0.11  | 17   | 0.4223          | 0.0     | 0.0    | 0.0     | 0.0       |
| 6.083         | 0.12  | 18   | 0.4186          | 0.0     | 0.0    | 0.0     | 0.0       |
| 11.2734       | 0.13  | 19   | 0.4137          | 0.0     | 0.0    | 0.0     | 0.0       |
| 6.004         | 0.13  | 20   | 0.4082          | 0.0     | 0.0    | 0.0     | 0.0       |
| 8.7849        | 0.14  | 21   | 0.4019          | 0.0     | 0.0    | 0.0     | 0.0       |
| 5.829         | 0.15  | 22   | 0.3976          | 0.0     | 0.0    | 0.0     | 0.0       |
| 7.0927        | 0.15  | 23   | 0.3929          | 0.0     | 0.0    | 0.0     | 0.0       |
| 7.5678        | 0.16  | 24   | 0.3887          | 0.0     | 0.0    | 0.0     | 0.0       |
| 7.1876        | 0.17  | 25   | 0.3848          | 0.0     | 0.0    | 0.0     | 0.0       |
| 8.5662        | 0.17  | 26   | 0.3824          | 0.0     | 0.0    | 0.0     | 0.0       |
| 9.3966        | 0.18  | 27   | 0.3804          | 0.0     | 0.0    | 0.0     | 0.0       |
| 10.1809       | 0.19  | 28   | 0.3780          | 0.0     | 0.0    | 0.0     | 0.0       |
| 8.0879        | 0.19  | 29   | 0.3765          | 0.0     | 0.0    | 0.0     | 0.0       |
| 7.5633        | 0.2   | 30   | 0.3749          | 0.0     | 0.0    | 0.0     | 0.0       |
| 7.277         | 0.21  | 31   | 0.3738          | 0.0     | 0.0    | 0.0     | 0.0       |
| 6.6679        | 0.21  | 32   | 0.3710          | 0.0     | 0.0    | 0.0     | 0.0       |
| 11.7409       | 0.22  | 33   | 0.3680          | 0.0     | 0.0    | 0.0     | 0.0       |
| 7.5637        | 0.23  | 34   | 0.3658          | 0.0     | 0.0    | 0.0     | 0.0       |
| 9.0556        | 0.23  | 35   | 0.3623          | 0.0     | 0.0    | 0.0     | 0.0       |
| 9.5907        | 0.24  | 36   | 0.3615          | 0.0     | 0.0    | 0.0     | 0.0       |
| 9.0023        | 0.25  | 37   | 0.3604          | 0.0     | 0.0    | 0.0     | 0.0       |
| 7.8242        | 0.25  | 38   | 0.3599          | 0.0     | 0.0    | 0.0     | 0.0       |
| 9.2029        | 0.26  | 39   | 0.3591          | 0.0     | 0.0    | 0.0     | 0.0       |
| 7.6971        | 0.27  | 40   | 0.3566          | 0.0     | 0.0    | 0.0     | 0.0       |
| 8.4237        | 0.27  | 41   | 0.3542          | 0.0     | 0.0    | 0.0     | 0.0       |
| 6.9863        | 0.28  | 42   | 0.3521          | 0.0     | 0.0    | 0.0     | 0.0       |
| 6.7574        | 0.29  | 43   | 0.3503          | 0.0     | 0.0    | 0.0     | 0.0       |
| 5.1329        | 0.29  | 44   | 0.3490          | 0.0     | 0.0    | 0.0     | 0.0       |
| 9.0936        | 0.3   | 45   | 0.3483          | 1.3799  | 0.4852 | 1.0924  | 1.2108    |
| 9.8391        | 0.31  | 46   | 0.3475          | 13.0878 | 4.5581 | 11.0376 | 11.8991   |
| 6.4842        | 0.31  | 47   | 0.3463          | 17.4226 | 6.4078 | 15.3093 | 16.2477   |
| 6.4921        | 0.32  | 48   | 0.3452          | 18.5772 | 6.8275 | 16.3128 | 17.2935   |
| 10.4488       | 0.33  | 49   | 0.3443          | 18.2004 | 6.5001 | 16.0497 | 17.0731   |
| 4.3364        | 0.33  | 50   | 0.3435          | 18.0761 | 6.2848 | 15.9805 | 16.9344   |
| 9.4075        | 0.34  | 51   | 0.3427          | 18.3684 | 6.4989 | 16.2721 | 17.2338   |
| 13.213        | 0.35  | 52   | 0.3423          | 18.1592 | 6.1019 | 16.0076 | 17.061    |
| 8.5205        | 0.35  | 53   | 0.3420          | 17.6529 | 5.8026 | 15.4966 | 16.4479   |
| 8.6332        | 0.36  | 54   | 0.3411          | 18.1603 | 6.1679 | 15.9369 | 16.9204   |
| 8.288         | 0.37  | 55   | 0.3404          | 18.3122 | 6.1727 | 15.9244 | 16.9652   |
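
For reference, the ROUGE columns above follow the convention of the Hugging Face `evaluate` library, reported as percentages. A short sketch (assumed, not from the card) of how such scores are computed:

```python
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["summary generated by the model"],
    references=["reference summary written by a human"],
)
# `scores` contains rouge1, rouge2, rougeL and rougeLsum F-measures in [0, 1];
# the table reports them scaled by 100.
print({k: round(v * 100, 4) for k, v in scores.items()})
```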

### Framework versions

- PEFT 0.10.0
- Transformers 4.37.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2