ZhiguangHan/mt5-small-task1-dataset1

Text2Text Generation

Generated from Trainer


ZhiguangHan committed on Dec 8, 2023

Commit 3d59789 • 1 Parent(s): 91f98ac

End of training

Files changed (1)

README.md +19 -14


@@ -17,11 +17,11 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.6987
-- Rouge1: 0.5051
-- Rouge2: 0.1584
-- Rougel: 0.46
-- Rougelsum: 0.4594
+- Loss: 1.7125
+- Rouge1: 0.5005
+- Rouge2: 0.1542
+- Rougel: 0.4577
+- Rougelsum: 0.4587
 ## Model description
@@ -40,23 +40,28 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 5.6e-05
-- train_batch_size: 8
-- eval_batch_size: 8
+- learning_rate: 5.5e-05
+- train_batch_size: 16
+- eval_batch_size: 16
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 5
+- num_epochs: 10
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
-| 1.7426        | 1.0   | 500  | 1.8457          | 0.4597 | 0.1261 | 0.4121 | 0.4121    |
-| 1.6948        | 2.0   | 1000 | 1.7994          | 0.4827 | 0.145  | 0.435  | 0.4347    |
-| 1.7729        | 3.0   | 1500 | 1.7391          | 0.4949 | 0.1526 | 0.4522 | 0.4524    |
-| 1.8046        | 4.0   | 2000 | 1.7093          | 0.5028 | 0.1547 | 0.4578 | 0.4576    |
-| 1.7665        | 5.0   | 2500 | 1.6987          | 0.5051 | 0.1584 | 0.46   | 0.4594    |
+| 2.7151        | 1.0   | 250  | 2.2431          | 0.3662 | 0.0891 | 0.3557 | 0.3556    |
+| 2.4198        | 2.0   | 500  | 2.0873          | 0.3997 | 0.1027 | 0.3884 | 0.3883    |
+| 2.2232        | 3.0   | 750  | 2.0082          | 0.4453 | 0.1309 | 0.4201 | 0.4203    |
+| 2.0842        | 4.0   | 1000 | 1.9100          | 0.4663 | 0.1467 | 0.4275 | 0.4274    |
+| 1.9825        | 5.0   | 1250 | 1.8493          | 0.4671 | 0.1457 | 0.4228 | 0.4228    |
+| 1.9048        | 6.0   | 1500 | 1.7759          | 0.49   | 0.1545 | 0.4503 | 0.4508    |
+| 1.8606        | 7.0   | 1750 | 1.7438          | 0.4996 | 0.1577 | 0.4575 | 0.4585    |
+| 1.8208        | 8.0   | 2000 | 1.7236          | 0.4975 | 0.1533 | 0.4555 | 0.4556    |
+| 1.788         | 9.0   | 2250 | 1.7200          | 0.4983 | 0.156  | 0.4566 | 0.4572    |
+| 1.7799        | 10.0  | 2500 | 1.7125          | 0.5005 | 0.1542 | 0.4577 | 0.4587    |
 ### Framework versions
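Reviewer note: the step counts in the two training-results tables can be cross-checked against the batch sizes, and the `linear` scheduler can be evaluated at any step. The sketch below is not part of the commit. It assumes single-device training, no gradient accumulation, and a kept partial final batch, so that steps per epoch equals ceil(examples / batch size). It also assumes zero warmup steps, since the card does not list a warmup setting. The function names `example_count_bounds` and `linear_lr` are hypothetical helpers, not library calls.

```python
def example_count_bounds(steps_per_epoch, batch_size):
    """Range of training-set sizes N for which ceil(N / batch_size) == steps_per_epoch."""
    lowest = (steps_per_epoch - 1) * batch_size + 1
    highest = steps_per_epoch * batch_size
    return lowest, highest


def linear_lr(step, base_lr, total_steps, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the model card does not state it.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Old run: batch size 8, 500 steps per epoch (2500 steps over 5 epochs).
old_bounds = example_count_bounds(500, 8)    # (3993, 4000)
# New run: batch size 16, 250 steps per epoch (2500 steps over 10 epochs).
new_bounds = example_count_bounds(250, 16)   # (3985, 4000)

# Sizes consistent with both runs, if both used the same training set.
overlap = (max(old_bounds[0], new_bounds[0]), min(old_bounds[1], new_bounds[1]))

# New run's learning rate at the start, midpoint, and end of training.
lr_start = linear_lr(0, 5.5e-05, 2500)
lr_mid = linear_lr(1250, 5.5e-05, 2500)
lr_end = linear_lr(2500, 5.5e-05, 2500)

print(old_bounds, new_bounds, overlap)
print(lr_start, lr_mid, lr_end)
```

Under these assumptions both runs are consistent with a training set of roughly 4,000 examples and take the same total number of optimizer steps (2500), so the new run sees each example twice as often while using half as many steps per epoch.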