metadata
base_model: silmi224/finetune-led-35000
tags:
  - summarization
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: exp2-led-risalah_data_v7-fix
    results: []

exp2-led-risalah_data_v7-fix

This model is a fine-tuned version of silmi224/finetune-led-35000 for summarization; the training dataset is not recorded in this card. After the final epoch (epoch 30, step 300), it achieves the following results on the evaluation set:

  • Loss: 1.6801
  • Rouge1: 20.0364
  • Rouge2: 9.57
  • Rougel: 13.9743
  • Rougelsum: 14.0563

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 300
  • num_epochs: 30
  • mixed_precision_training: Native AMP
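
The way these settings interact can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the training script: it assumes a single device, a run of 300 optimizer steps (the last step listed under Training results), and the usual linear warmup-then-decay shape for a `linear` scheduler. The function name `linear_schedule_lr` and the variable `num_devices` are hypothetical.

```python
def linear_schedule_lr(step, base_lr=2e-05, warmup_steps=300, total_steps=300):
    """Learning rate at a given optimizer step: linear warmup, then linear decay."""
    if step < warmup_steps:
        # Warmup phase: ramp from 0 towards base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: ramp from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Effective batch size per optimizer step.
train_batch_size = 1
gradient_accumulation_steps = 8
num_devices = 1  # assumption: the card does not state the device count
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print("total_train_batch_size:", total_train_batch_size)

# Learning rate at the start, middle, and last optimizer step of the run.
for step in (0, 150, 299):
    print(step, linear_schedule_lr(step))
```

Under these assumptions the 300 warmup steps span the entire run, so the learning rate rises throughout training and only approaches 2e-05 at the very end; the decay phase is never reached.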

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|---|---|---|---|---|---|---|---|
| 3.8706 | 1.0 | 10 | 3.3282 | 9.2634 | 1.825 | 6.2857 | 6.6749 |
| 3.5173 | 2.0 | 20 | 2.8713 | 9.381 | 1.5365 | 6.5965 | 6.6722 |
| 3.0587 | 3.0 | 30 | 2.5101 | 12.3761 | 3.5034 | 8.6155 | 8.7913 |
| 2.7254 | 4.0 | 40 | 2.2919 | 14.8916 | 4.9071 | 10.0 | 9.9487 |
| 2.504 | 5.0 | 50 | 2.1490 | 14.5316 | 4.9407 | 9.6973 | 9.5973 |
| 2.3306 | 6.0 | 60 | 2.0516 | 15.6234 | 5.419 | 10.6929 | 10.671 |
| 2.1991 | 7.0 | 70 | 1.9705 | 16.9222 | 6.1531 | 10.3785 | 10.4171 |
| 2.0922 | 8.0 | 80 | 1.9114 | 15.9531 | 6.007 | 10.2455 | 10.2734 |
| 2.0108 | 9.0 | 90 | 1.8601 | 16.3146 | 6.2786 | 10.632 | 10.6027 |
| 1.9243 | 10.0 | 100 | 1.8352 | 18.1771 | 6.6919 | 11.1811 | 11.2366 |
| 1.8675 | 11.0 | 110 | 1.7865 | 17.2554 | 7.4135 | 10.5322 | 10.5689 |
| 1.8066 | 12.0 | 120 | 1.7520 | 15.8483 | 7.1825 | 10.7059 | 10.7344 |
| 1.7476 | 13.0 | 130 | 1.7341 | 16.0049 | 6.6876 | 10.9744 | 10.9918 |
| 1.6911 | 14.0 | 140 | 1.7126 | 17.6921 | 8.9076 | 12.8474 | 12.8966 |
| 1.6388 | 15.0 | 150 | 1.6960 | 19.7192 | 9.1168 | 13.3649 | 13.3949 |
| 1.5902 | 16.0 | 160 | 1.6783 | 20.7583 | 9.7459 | 14.1533 | 14.1794 |
| 1.5433 | 17.0 | 170 | 1.6476 | 19.4203 | 9.4624 | 13.3403 | 13.401 |
| 1.4992 | 18.0 | 180 | 1.6450 | 18.74 | 8.8791 | 13.3925 | 13.3709 |
| 1.4614 | 19.0 | 190 | 1.6335 | 19.476 | 9.0282 | 13.5223 | 13.4966 |
| 1.4216 | 20.0 | 200 | 1.6246 | 17.6435 | 7.9777 | 13.1255 | 13.1599 |
| 1.3842 | 21.0 | 210 | 1.6102 | 18.6282 | 8.511 | 12.8825 | 12.7954 |
| 1.3479 | 22.0 | 220 | 1.6200 | 18.066 | 8.4414 | 12.467 | 12.4232 |
| 1.3087 | 23.0 | 230 | 1.6350 | 17.8312 | 8.6603 | 12.522 | 12.511 |
| 1.2752 | 24.0 | 240 | 1.6186 | 18.5374 | 9.7206 | 13.0955 | 13.0266 |
| 1.2434 | 25.0 | 250 | 1.6219 | 18.232 | 7.9904 | 12.7029 | 12.6916 |
| 1.2046 | 26.0 | 260 | 1.6393 | 17.4585 | 7.2075 | 12.5202 | 12.4766 |
| 1.1716 | 27.0 | 270 | 1.6139 | 19.6477 | 9.9919 | 14.3408 | 14.346 |
| 1.1388 | 28.0 | 280 | 1.6416 | 19.7279 | 8.8207 | 13.6708 | 13.7072 |
| 1.1083 | 29.0 | 290 | 1.6485 | 19.1252 | 9.2133 | 13.6003 | 13.6412 |
| 1.0745 | 30.0 | 300 | 1.6801 | 20.0364 | 9.57 | 13.9743 | 14.0563 |
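
The headline metrics on this card come from the last epoch, which is not the best epoch in the log on every measure. As a minimal sketch, the snippet below copies the Validation Loss and Rouge1 columns from the results above and looks up the best epoch for each. The helper `best_epoch` is hypothetical and only illustrates the lookup.

```python
# Validation loss and Rouge1 per epoch, copied from the training results
# (list index 0 corresponds to epoch 1).
VAL_LOSS = [
    3.3282, 2.8713, 2.5101, 2.2919, 2.1490, 2.0516, 1.9705, 1.9114, 1.8601, 1.8352,
    1.7865, 1.7520, 1.7341, 1.7126, 1.6960, 1.6783, 1.6476, 1.6450, 1.6335, 1.6246,
    1.6102, 1.6200, 1.6350, 1.6186, 1.6219, 1.6393, 1.6139, 1.6416, 1.6485, 1.6801,
]
ROUGE1 = [
    9.2634, 9.381, 12.3761, 14.8916, 14.5316, 15.6234, 16.9222, 15.9531, 16.3146, 18.1771,
    17.2554, 15.8483, 16.0049, 17.6921, 19.7192, 20.7583, 19.4203, 18.74, 19.476, 17.6435,
    18.6282, 18.066, 17.8312, 18.5374, 18.232, 17.4585, 19.6477, 19.7279, 19.1252, 20.0364,
]


def best_epoch(values, higher_is_better):
    """Return (1-based epoch, value) of the best entry in a per-epoch metric list."""
    pick = max if higher_is_better else min
    index = pick(range(len(values)), key=values.__getitem__)
    return index + 1, values[index]


print(best_epoch(VAL_LOSS, higher_is_better=False))  # (21, 1.6102)
print(best_epoch(ROUGE1, higher_is_better=True))     # (16, 20.7583)
```

In the logged numbers, validation loss bottoms out at epoch 21 and drifts upward afterwards while training loss keeps falling, a pattern that usually suggests overfitting in the later epochs. The card does not say whether an earlier checkpoint was retained.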

Framework versions

  • Transformers 4.42.3
  • PyTorch 2.1.2
  • Datasets 2.20.0
  • Tokenizers 0.19.1
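
To reproduce results it can help to confirm that the local environment matches these versions. The sketch below uses only the standard library; `PINNED` and `version_mismatches` are hypothetical names, and the PyPI distribution names (`torch` for PyTorch, lowercase for the others) are the usual ones rather than something stated in this card.

```python
from importlib import metadata

# Versions listed in this card, keyed by PyPI distribution name.
PINNED = {
    "transformers": "4.42.3",
    "torch": "2.1.2",
    "datasets": "2.20.0",
    "tokenizers": "0.19.1",
}


def version_mismatches(pinned, get_version=metadata.version):
    """Return {package: (expected, installed)} for packages that differ or are missing."""
    mismatches = {}
    for package, expected in pinned.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        # Ignore local build suffixes such as "+cu121" when comparing.
        if installed is None or installed.split("+")[0] != expected:
            mismatches[package] = (expected, installed)
    return mismatches


for package, (expected, installed) in version_mismatches(PINNED).items():
    print(f"{package}: card lists {expected}, found {installed}")
```

An empty result means the installed packages match the card; newer versions may still work, but this card gives no compatibility information beyond the versions listed.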