metadata
license: apache-2.0
base_model: google/flan-t5-small
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: flan-t5-small-query-expansion-merged
    results: []

flan-t5-small-query-expansion-merged

This model is a fine-tuned version of google/flan-t5-small. The training dataset is not recorded (the auto-generated card lists it as None). After the final epoch (16), it achieves the following results on the evaluation set:

  • Loss: 0.0729
  • Rouge1: 88.0902
  • Rouge2: 86.3492
  • RougeL: 87.7337
  • RougeLsum: 87.9824
  • Gen Len: 18.3077
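
As a rough illustration of what the Rouge1 figure measures, the sketch below computes a simplified unigram-overlap F1 score on a 0-100 scale. The example strings are hypothetical and do not come from the evaluation set, and the whitespace tokenization is an assumption; standard ROUGE implementations apply their own tokenization and optional stemming, so their scores can differ from this simplified version.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F1 (0-100) using lowercase whitespace tokens."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped overlap: each token counts at most as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    # F1 = 2PR / (P + R), which simplifies to 2 * overlap / (|pred| + |ref|).
    return 100.0 * 2 * overlap / (len(pred_tokens) + len(ref_tokens))


# Hypothetical query-expansion output and reference, for illustration only.
score = rouge1_f1("best hiking trails in denver", "best hiking trails near denver")
print(score)
```

Four of the five tokens match on each side, so precision and recall are both 0.8 in this toy case.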

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 16
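
The learning-rate settings above can be sketched in plain Python. This is a minimal approximation, not the exact training code: it assumes linear warmup followed by a half-cosine decay to zero, assumes the warmup length is the warmup ratio times the total step count rounded up, and takes the total of 54,032 steps from the last row of the training results. The names `BASE_LR`, `WARMUP_STEPS`, and `lr_at` are introduced here for illustration.

```python
import math

BASE_LR = 1e-3        # learning_rate from the card
WARMUP_RATIO = 0.05   # lr_scheduler_warmup_ratio from the card
TOTAL_STEPS = 54032   # final step reported in the training results (16 epochs)

# Assumption: warmup steps = ceil(total steps * warmup ratio).
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Approximate learning rate at a given optimizer step."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 up to BASE_LR.
        return BASE_LR * step / max(1, WARMUP_STEPS)
    # Half-cosine decay from BASE_LR down to 0 over the remaining steps.
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(WARMUP_STEPS)
for step in (0, WARMUP_STEPS, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, lr_at(step))
```

Under these assumptions the rate peaks at 0.001 once warmup ends and falls to zero by the last step, which is consistent with the small training-loss changes over the final epochs.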

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 0.6406 | 1.0 | 3377 | 0.6768 | 63.9968 | 45.7612 | 57.8086 | 61.5311 | 18.3873 |
| 0.7793 | 2.0 | 6754 | 0.5605 | 67.0163 | 49.7255 | 61.2364 | 64.6925 | 18.3231 |
| 1.0244 | 3.0 | 10131 | 0.4842 | 67.8219 | 51.5119 | 62.3029 | 65.6804 | 18.2080 |
| 0.5659 | 4.0 | 13508 | 0.4397 | 69.1529 | 53.8002 | 64.4153 | 67.2391 | 18.3712 |
| 0.7296 | 5.0 | 16885 | 0.3969 | 70.5914 | 56.0644 | 66.0627 | 68.576 | 18.1605 |
| 0.7259 | 6.0 | 20262 | 0.3626 | 70.8523 | 56.4451 | 66.252 | 69.1099 | 18.3231 |
| 0.6528 | 7.0 | 23639 | 0.3237 | 73.073 | 59.6605 | 68.7564 | 71.3906 | 18.2966 |
| 0.5374 | 8.0 | 27016 | 0.2677 | 74.5797 | 62.7906 | 70.8802 | 73.0946 | 18.2812 |
| 0.3949 | 9.0 | 30393 | 0.2195 | 77.0612 | 66.8027 | 73.9263 | 75.8907 | 18.2763 |
| 0.3018 | 10.0 | 33770 | 0.1636 | 79.9678 | 71.998 | 77.5129 | 78.9566 | 18.2394 |
| 0.2242 | 11.0 | 37147 | 0.1276 | 82.9401 | 77.1969 | 81.2458 | 82.3421 | 18.2924 |
| 0.1141 | 12.0 | 40524 | 0.0940 | 85.6963 | 81.8712 | 84.6628 | 85.3014 | 18.3105 |
| 0.087 | 13.0 | 43901 | 0.0816 | 86.9817 | 84.3464 | 86.2565 | 86.7104 | 18.3070 |
| 0.0375 | 14.0 | 47278 | 0.0739 | 87.9019 | 85.9691 | 87.4218 | 87.7412 | 18.3022 |
| 0.0356 | 15.0 | 50655 | 0.0726 | 88.0522 | 86.2944 | 87.6779 | 87.9371 | 18.3015 |
| 0.0302 | 16.0 | 54032 | 0.0729 | 88.0902 | 86.3492 | 87.7337 | 87.9824 | 18.3077 |
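
A few facts can be read off the results above with a short script. The sketch below copies the epoch, step, and validation-loss columns, then derives the steps per epoch, a rough upper bound on the training-set size, and the epoch with the lowest validation loss. The size estimate assumes a single device and no gradient accumulation, neither of which the card states, and the last batch of each epoch may be partial.

```python
# (epoch, step, validation_loss) copied from the training results.
rows = [
    (1, 3377, 0.6768), (2, 6754, 0.5605), (3, 10131, 0.4842),
    (4, 13508, 0.4397), (5, 16885, 0.3969), (6, 20262, 0.3626),
    (7, 23639, 0.3237), (8, 27016, 0.2677), (9, 30393, 0.2195),
    (10, 33770, 0.1636), (11, 37147, 0.1276), (12, 40524, 0.0940),
    (13, 43901, 0.0816), (14, 47278, 0.0739), (15, 50655, 0.0726),
    (16, 54032, 0.0729),
]
train_batch_size = 8  # from the hyperparameters

# Each epoch adds the same number of optimizer steps.
steps_per_epoch = rows[0][1]
assert all(step == steps_per_epoch * epoch for epoch, step, _ in rows)

# Assumption: one device, no gradient accumulation, so one step = one batch of 8.
approx_examples = steps_per_epoch * train_batch_size

# Epoch with the lowest validation loss.
best_epoch, best_step, best_loss = min(rows, key=lambda r: r[2])

print(steps_per_epoch, approx_examples)
print(best_epoch, best_step, best_loss)
```

Validation loss is lowest at epoch 15 (0.0726) and ticks up slightly at epoch 16 (0.0729), while the ROUGE scores still improve marginally in the final epoch; the headline results in this card are the epoch 16 values.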

Framework versions

  • Transformers 4.38.2
  • PyTorch 2.2.0+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2