---
tags:
- generated_from_trainer
model-index:
- name: results_mt5_xl-sum
  results: []
---

# results_mt5_xl-sum

This model was trained from scratch on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.8431
- Rouge1 Fmeasure: 0.6139
- Rouge2 Fmeasure: 0.1189
- Rougel Fmeasure: 0.1997
- Meteor: 0.3315
- Bertscore F1: 0.8418

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 250
- num_epochs: 10

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Rouge1 Fmeasure | Rouge2 Fmeasure | Rougel Fmeasure | Meteor | Bertscore F1 |
|:-------------:|:------:|:----:|:---------------:|:---------------:|:---------------:|:---------------:|:------:|:------------:|
| 2.6516        | 0.8529 | 500  | 0.9710          | 0.2668          | 0.0484          | 0.1537          | 0.2745 | 0.8284       |
| 1.0475        | 1.7058 | 1000 | 0.8792          | 0.4289          | 0.0884          | 0.1737          | 0.2949 | 0.8278       |
| 0.9413        | 2.5586 | 1500 | 0.8457          | 0.4960          | 0.0865          | 0.1898          | 0.3141 | 0.8339       |
| 0.8711        | 3.4115 | 2000 | 0.8398          | 0.5400          | 0.1121          | 0.1941          | 0.3110 | 0.8397       |
| 0.8235        | 4.2644 | 2500 | 0.8345          | 0.5587          | 0.1022          | 0.2041          | 0.3160 | 0.8388       |
| 0.7797        | 5.1173 | 3000 | 0.8368          | 0.5735          | 0.1036          | 0.2044          | 0.3157 | 0.8344       |
| 0.7401        | 5.9701 | 3500 | 0.8217          | 0.5507          | 0.1133          | 0.1936          | 0.3186 | 0.8366       |
| 0.7022        | 6.8230 | 4000 | 0.8361          | 0.5808          | 0.1118          | 0.2008          | 0.3227 | 0.8406       |
| 0.6796        | 7.6759 | 4500 | 0.8344          | 0.6173          | 0.1277          | 0.1986          | 0.3260 | 0.8407       |
| 0.6523        | 8.5288 | 5000 | 0.8436          | 0.6232          | 0.1186          | 0.2024          | 0.3317 | 0.8398       |
| 0.6385        | 9.3817 | 5500 | 0.8431          | 0.6139          | 0.1189          | 0.1997          | 0.3315 | 0.8418       |

### Framework versions

- Transformers 4.40.0
- Pytorch 2.3.1+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1
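
### Hyperparameters as training arguments (sketch)

The hyperparameter list above follows what `Seq2SeqTrainer` logs automatically. Below is a minimal sketch of how those settings could be expressed as `Seq2SeqTrainingArguments` with Transformers 4.40.0; the `output_dir`, the step-based evaluation schedule, and `predict_with_generate` are assumptions not stated in this card, and the original training script is not included here.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="results_mt5_xl-sum",   # illustrative output directory (assumption)
    learning_rate=5e-4,                # learning_rate: 0.0005
    per_device_train_batch_size=4,     # train_batch_size: 4
    per_device_eval_batch_size=4,      # eval_batch_size: 4
    gradient_accumulation_steps=16,    # 4 x 16 = total_train_batch_size 64 on one device
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=250,                  # lr_scheduler_warmup_steps: 250
    num_train_epochs=10,
    adam_beta1=0.9,                    # optimizer: Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,                 # and epsilon=1e-08
    evaluation_strategy="steps",       # assumption: the results table reports every 500 steps
    eval_steps=500,
    predict_with_generate=True,        # assumption: needed to score generated summaries
)
```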
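
### Metric computation (sketch)

The card does not include the evaluation code behind the ROUGE, METEOR, and Bertscore columns. Assuming the standard `evaluate` implementations of these metrics, the reported values could be computed roughly as below; the `lang` argument and the simple averaging of per-example BERTScore F1 are assumptions.

```python
import evaluate

rouge = evaluate.load("rouge")
meteor = evaluate.load("meteor")
bertscore = evaluate.load("bertscore")

def score_summaries(predictions, references, lang="en"):
    """Compute the metric names reported in the tables above. `lang` is an assumption."""
    rouge_scores = rouge.compute(predictions=predictions, references=references)
    meteor_scores = meteor.compute(predictions=predictions, references=references)
    bs = bertscore.compute(predictions=predictions, references=references, lang=lang)
    return {
        "rouge1_fmeasure": rouge_scores["rouge1"],
        "rouge2_fmeasure": rouge_scores["rouge2"],
        "rougeL_fmeasure": rouge_scores["rougeL"],
        "meteor": meteor_scores["meteor"],
        "bertscore_f1": sum(bs["f1"]) / len(bs["f1"]),  # average per-example F1
    }
```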
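
### Inference (sketch)

Because the "Intended uses & limitations" section is still a placeholder, the following is only a minimal inference sketch. It assumes the checkpoint is saved or published under the name given in the model-index (`results_mt5_xl-sum`) and loads as a standard seq2seq model; the input length and generation settings are illustrative, not taken from this card.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "results_mt5_xl-sum"  # assumption: local path or Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

article = "..."  # document to summarize
inputs = tokenizer(article, return_tensors="pt", truncation=True, max_length=512)
summary_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```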