ZhiguangHan committed on
Commit
3d59789
1 Parent(s): 91f98ac

End of training

Files changed (1)
  1. README.md +19 -14
README.md CHANGED
@@ -17,11 +17,11 @@ should probably proofread and complete it, then remove this comment. -->

  This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 1.6987
- - Rouge1: 0.5051
- - Rouge2: 0.1584
- - Rougel: 0.46
- - Rougelsum: 0.4594
+ - Loss: 1.7125
+ - Rouge1: 0.5005
+ - Rouge2: 0.1542
+ - Rougel: 0.4577
+ - Rougelsum: 0.4587

  ## Model description

@@ -40,23 +40,28 @@ More information needed
  ### Training hyperparameters

  The following hyperparameters were used during training:
- - learning_rate: 5.6e-05
- - train_batch_size: 8
- - eval_batch_size: 8
+ - learning_rate: 5.5e-05
+ - train_batch_size: 16
+ - eval_batch_size: 16
  - seed: 42
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
- - num_epochs: 5
+ - num_epochs: 10

  ### Training results

  | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
  |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
- | 1.7426 | 1.0 | 500 | 1.8457 | 0.4597 | 0.1261 | 0.4121 | 0.4121 |
- | 1.6948 | 2.0 | 1000 | 1.7994 | 0.4827 | 0.145 | 0.435 | 0.4347 |
- | 1.7729 | 3.0 | 1500 | 1.7391 | 0.4949 | 0.1526 | 0.4522 | 0.4524 |
- | 1.8046 | 4.0 | 2000 | 1.7093 | 0.5028 | 0.1547 | 0.4578 | 0.4576 |
- | 1.7665 | 5.0 | 2500 | 1.6987 | 0.5051 | 0.1584 | 0.46 | 0.4594 |
+ | 2.7151 | 1.0 | 250 | 2.2431 | 0.3662 | 0.0891 | 0.3557 | 0.3556 |
+ | 2.4198 | 2.0 | 500 | 2.0873 | 0.3997 | 0.1027 | 0.3884 | 0.3883 |
+ | 2.2232 | 3.0 | 750 | 2.0082 | 0.4453 | 0.1309 | 0.4201 | 0.4203 |
+ | 2.0842 | 4.0 | 1000 | 1.9100 | 0.4663 | 0.1467 | 0.4275 | 0.4274 |
+ | 1.9825 | 5.0 | 1250 | 1.8493 | 0.4671 | 0.1457 | 0.4228 | 0.4228 |
+ | 1.9048 | 6.0 | 1500 | 1.7759 | 0.49 | 0.1545 | 0.4503 | 0.4508 |
+ | 1.8606 | 7.0 | 1750 | 1.7438 | 0.4996 | 0.1577 | 0.4575 | 0.4585 |
+ | 1.8208 | 8.0 | 2000 | 1.7236 | 0.4975 | 0.1533 | 0.4555 | 0.4556 |
+ | 1.788 | 9.0 | 2250 | 1.7200 | 0.4983 | 0.156 | 0.4566 | 0.4572 |
+ | 1.7799 | 10.0 | 2500 | 1.7125 | 0.5005 | 0.1542 | 0.4577 | 0.4587 |


  ### Framework versions
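
Both revisions of the card end at step 2500: the batch size doubles from 8 to 16, the steps per epoch halve from 500 to 250, and the epoch count doubles from 5 to 10. The following is a rough sketch of that arithmetic, not something taken from the training script. It assumes a single device, no gradient accumulation, and that a final partial batch is kept rather than dropped; the helper name `examples_range` is hypothetical. Under those assumptions it bounds the training-set size implied by each revision.

```python
def examples_range(steps_per_epoch: int, batch_size: int) -> range:
    """Training-set sizes n for which ceil(n / batch_size) == steps_per_epoch.

    Assumes one optimizer step per batch (single device, no gradient
    accumulation) and that the last, possibly partial, batch is kept.
    """
    low = (steps_per_epoch - 1) * batch_size + 1
    high = steps_per_epoch * batch_size
    return range(low, high + 1)


# (step count at epoch 1.0, train_batch_size, num_epochs), read from the card
runs = {
    "before": (500, 8, 5),
    "after": (250, 16, 10),
}

total_steps = {}
for name, (steps, batch, epochs) in runs.items():
    sizes = examples_range(steps, batch)
    total_steps[name] = steps * epochs
    print(f"{name}: {sizes.start}-{sizes.stop - 1} examples, "
          f"{total_steps[name]} total steps")

# Training-set sizes consistent with both revisions at once
before = examples_range(500, 8)
after = examples_range(250, 16)
common = range(max(before.start, after.start), min(before.stop, after.stop))
print(f"consistent with both: {common.start}-{common.stop - 1} examples")
```

If those assumptions hold, both runs saw roughly 4000 training examples per epoch, and the new run performs the same number of optimizer steps as the old one while processing twice as many examples in total.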