metadata
license: apache-2.0
base_model: google/mt5-small
tags:
  - generated_from_keras_callback
model-index:
  - name: pakawadeep/mt5-small-finetuned-ctfl-augmented
    results: []

pakawadeep/mt5-small-finetuned-ctfl-augmented

This model is a fine-tuned version of google/mt5-small on an unknown dataset. After the final training epoch (epoch 29), it reports the following training and validation results:

  • Train Loss: 0.9381
  • Validation Loss: 0.9416
  • Train Rouge1: 7.9915
  • Train Rouge2: 1.3861
  • Train Rougel: 7.9562
  • Train Rougelsum: 7.9915
  • Train Gen Len: 11.9653
  • Epoch: 29
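The card does not document the task or the expected input format, so the following is only a minimal sketch of loading the checkpoint for generation. It assumes the repository contains TensorFlow weights (the model was trained through a Keras callback) and that `sentencepiece` is installed for the mT5 tokenizer. The helper name `generate` and the `max_new_tokens=32` default are illustrative assumptions, not settings taken from the card.

```python
MODEL_ID = "pakawadeep/mt5-small-finetuned-ctfl-augmented"


def generate(text: str, max_new_tokens: int = 32) -> str:
    """Generate output text for one input string (hypothetical helper)."""
    # Imported lazily so that defining this helper does not require TensorFlow.
    from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = TFAutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Tokenize a single input and return TensorFlow tensors.
    inputs = tokenizer(text, return_tensors="tf", truncation=True)

    # max_new_tokens is an assumed value; the logged generation length is about 12 tokens.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("...")` downloads the checkpoint on first use. Because the training data and task are not described, the output should be checked against known examples before relying on it.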

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
  • training_precision: float32
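The optimizer settings listed above could be recreated roughly as follows. This is a sketch that assumes the `AdamWeightDecay` class shipped with the TensorFlow side of Transformers. The legacy `decay: 0.0` entry is left out because it is a no-op at that value, and the names `OPTIMIZER_CONFIG` and `build_optimizer` are illustrative.

```python
# Values copied from the hyperparameter list; 'name' and the legacy 'decay' key are omitted.
OPTIMIZER_CONFIG = {
    "learning_rate": 2e-05,
    "beta_1": 0.9,
    "beta_2": 0.999,
    "epsilon": 1e-07,
    "amsgrad": False,
    "weight_decay_rate": 0.01,
}


def build_optimizer():
    """Build an AdamWeightDecay optimizer from the logged settings (hypothetical helper)."""
    # Imported lazily so that reading the config does not require TensorFlow.
    from transformers import AdamWeightDecay

    return AdamWeightDecay(**OPTIMIZER_CONFIG)
```

The returned optimizer can be passed to `model.compile(optimizer=build_optimizer())` on a Keras model. The card lists a constant learning rate, so no schedule is assumed here.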

Training results

| Train Loss | Validation Loss | Train Rouge1 | Train Rouge2 | Train Rougel | Train Rougelsum | Train Gen Len | Epoch |
|------------|-----------------|--------------|--------------|--------------|-----------------|---------------|-------|
| 10.7577 | 3.2303 | 0.7164 | 0.0 | 0.7206 | 0.7052 | 16.1683 | 0 |
| 5.8846 | 1.9848 | 1.5545 | 0.0 | 1.5846 | 1.5614 | 16.7376 | 1 |
| 4.2789 | 1.7719 | 5.0684 | 0.7426 | 5.1391 | 5.1155 | 11.6386 | 2 |
| 3.4851 | 1.7527 | 5.2310 | 0.9406 | 5.2310 | 5.2074 | 11.1040 | 3 |
| 2.9892 | 1.7038 | 6.2235 | 0.8251 | 6.3225 | 6.2777 | 11.3812 | 4 |
| 2.6463 | 1.6585 | 8.2037 | 2.1782 | 8.2037 | 8.2037 | 11.4901 | 5 |
| 2.3897 | 1.5964 | 8.6987 | 2.1782 | 8.6987 | 8.4866 | 11.7030 | 6 |
| 2.1794 | 1.5112 | 8.6987 | 2.1782 | 8.6987 | 8.4866 | 11.8317 | 7 |
| 1.9896 | 1.4461 | 8.2037 | 2.1782 | 8.2037 | 8.2037 | 11.9059 | 8 |
| 1.8347 | 1.3770 | 8.2037 | 2.1782 | 8.2037 | 8.2037 | 11.9703 | 9 |
| 1.7101 | 1.3155 | 8.2037 | 2.1782 | 8.2037 | 8.2037 | 11.9505 | 10 |
| 1.6003 | 1.2598 | 8.6987 | 2.1782 | 8.6987 | 8.6987 | 11.9109 | 11 |
| 1.5041 | 1.2367 | 8.6987 | 2.1782 | 8.6987 | 8.6987 | 11.9505 | 12 |
| 1.4309 | 1.2286 | 8.6987 | 2.1782 | 8.6987 | 8.6987 | 11.9554 | 13 |
| 1.3618 | 1.1795 | 8.9109 | 2.3762 | 9.0877 | 8.9816 | 11.9554 | 14 |
| 1.3090 | 1.1625 | 8.9109 | 2.3762 | 9.0877 | 8.9816 | 11.9455 | 15 |
| 1.2669 | 1.1210 | 8.9109 | 2.3762 | 9.0877 | 8.9816 | 11.9554 | 16 |
| 1.2262 | 1.0769 | 8.6987 | 1.7822 | 8.7694 | 8.7341 | 11.9752 | 17 |
| 1.1915 | 1.0724 | 8.4512 | 1.3861 | 8.4866 | 8.4512 | 11.9752 | 18 |
| 1.1562 | 1.0444 | 8.4512 | 1.3861 | 8.4866 | 8.4512 | 11.9703 | 19 |
| 1.1291 | 1.0318 | 8.4512 | 1.3861 | 8.4866 | 8.4512 | 11.9653 | 20 |
| 1.1063 | 1.0321 | 8.4512 | 1.3861 | 8.4866 | 8.4512 | 11.9554 | 21 |
| 1.0786 | 1.0124 | 8.4512 | 1.3861 | 8.4866 | 8.4512 | 11.9604 | 22 |
| 1.0540 | 1.0062 | 8.4512 | 1.3861 | 8.4866 | 8.4512 | 11.9604 | 23 |
| 1.0241 | 0.9787 | 8.4512 | 1.3861 | 8.4866 | 8.4512 | 11.9653 | 24 |
| 1.0086 | 0.9683 | 8.4512 | 1.3861 | 8.4866 | 8.4512 | 11.9703 | 25 |
| 0.9883 | 0.9665 | 8.4512 | 1.3861 | 8.4866 | 8.4512 | 11.9653 | 26 |
| 0.9661 | 0.9521 | 8.4512 | 1.3861 | 8.4866 | 8.4512 | 11.9703 | 27 |
| 0.9543 | 0.9693 | 8.4512 | 1.3861 | 8.4866 | 8.4512 | 11.9703 | 28 |
| 0.9381 | 0.9416 | 7.9915 | 1.3861 | 7.9562 | 7.9915 | 11.9653 | 29 |
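The per-epoch results above can be summarized with a few lines of plain Python, which helps when deciding which checkpoint to keep. The sketch below copies only the Validation Loss and Train Rouge1 columns, indexed by epoch; the variable names are illustrative.

```python
# Validation Loss per epoch (index = epoch), copied from the training results.
val_loss = [
    3.2303, 1.9848, 1.7719, 1.7527, 1.7038, 1.6585, 1.5964, 1.5112, 1.4461, 1.3770,
    1.3155, 1.2598, 1.2367, 1.2286, 1.1795, 1.1625, 1.1210, 1.0769, 1.0724, 1.0444,
    1.0318, 1.0321, 1.0124, 1.0062, 0.9787, 0.9683, 0.9665, 0.9521, 0.9693, 0.9416,
]

# Train Rouge1 per epoch (index = epoch), copied from the training results.
rouge1 = [
    0.7164, 1.5545, 5.0684, 5.2310, 6.2235, 8.2037, 8.6987, 8.6987, 8.2037, 8.2037,
    8.2037, 8.6987, 8.6987, 8.6987, 8.9109, 8.9109, 8.9109, 8.6987, 8.4512, 8.4512,
    8.4512, 8.4512, 8.4512, 8.4512, 8.4512, 8.4512, 8.4512, 8.4512, 8.4512, 7.9915,
]

# Epoch with the lowest validation loss.
best_val_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)

# All epochs that reach the highest ROUGE-1 score.
peak_rouge1 = max(rouge1)
peak_rouge1_epochs = [epoch for epoch, score in enumerate(rouge1) if score == peak_rouge1]

# Epochs where validation loss went up relative to the previous epoch.
val_increase_epochs = [
    epoch for epoch in range(1, len(val_loss)) if val_loss[epoch] > val_loss[epoch - 1]
]

print("lowest validation loss at epoch", best_val_epoch)
print("ROUGE-1 peak", peak_rouge1, "at epochs", peak_rouge1_epochs)
print("validation loss increased at epochs", val_increase_epochs)
```

On the logged values, validation loss is lowest at the final epoch (29), while ROUGE-1 peaks at 8.9109 during epochs 14 to 16 and is lower by the end of training. Validation loss rises only at epochs 21 and 28.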

Framework versions

  • Transformers 4.38.2
  • TensorFlow 2.15.0
  • Datasets 2.18.0
  • Tokenizers 0.15.2