learn3r/longt5_xl_summ_screen_bp_only_30

@@ -1,24 +1,11 @@
 ---
-base_model: /exports/eddie/scratch/s1970716/models/summarization/longt5_xl_summ_screen_bp_only/checkpoint-210
 tags:
 - generated_from_trainer
-datasets:
-- learn3r/summ_screen_fd_bp
 metrics:
 - rouge
 model-index:
 - name: longt5_xl_summ_screen_bp_only_30
-  results:
-  - task:
-      name: Summarization
-      type: summarization
-    dataset:
-      name: learn3r/summ_screen_fd_bp
-      type: learn3r/summ_screen_fd_bp
-    metrics:
-    - name: Rouge1
-      type: rouge
-      value: 40.4943
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -26,14 +13,14 @@ should probably proofread and complete it, then remove this comment. -->
 # longt5_xl_summ_screen_bp_only_30
-This model is a fine-tuned version of [/exports/eddie/scratch/s1970716/models/summarization/longt5_xl_summ_screen_bp_only/checkpoint-210](https://huggingface.co//exports/eddie/scratch/s1970716/models/summarization/longt5_xl_summ_screen_bp_only/checkpoint-210) on the learn3r/summ_screen_fd_bp dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.2397
-- Rouge1: 40.4943
-- Rouge2: 16.4695
-- Rougel: 28.0964
-- Rougelsum: 38.3693
-- Gen Len: 246.3491
 ## Model description
@@ -56,30 +43,31 @@ The following hyperparameters were used during training:
 - train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
-- distributed_type: multi-GPU
-- num_devices: 4
 - gradient_accumulation_steps: 32
-- total_train_batch_size: 1024
-- total_eval_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: constant
 - num_epochs: 15.0
 ### Training results
-| Training Loss | Epoch | Step | Gen Len  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
-|:-------------:|:-----:|:----:|:--------:|:---------------:|:-------:|:-------:|:-------:|:---------:|
-| 0.324         | 0.97  | 14   | 246.7396 | 2.2376          | 40.4388 | 16.4662 | 28.0771 | 38.3405   |
-| 0.324         | 4.83  | 15   | 2.5727   | 30.0123         | 12.3701 | 21.2834 | 28.891  | 503.5651  |
-| 0.3036        | 5.95  | 19   | 2.2659   | 27.8421         | 11.1942 | 20.4713 | 26.6097 | 506.9527  |
-| 0.2941        | 6.78  | 22   | 2.2636   | 40.8304         | 17.3615 | 28.0971 | 39.0943 | 284.2308  |
-| 0.2642        | 7.9   | 26   | 2.2864   | 38.3377         | 15.8119 | 26.4838 | 36.5174 | 341.2515  |
-| 0.2604        | 8.73  | 29   | 2.4551   | 33.2021         | 13.6577 | 23.3288 | 31.8326 | 435.2633  |
-| 0.2237        | 9.84  | 33   | 2.6153   | 40.3297         | 15.3786 | 28.1208 | 38.2426 | 234.6124  |
-| 0.1904        | 10.96 | 37   | 2.6665   | 39.6006         | 14.9586 | 27.2453 | 37.6744 | 174.5740  |
-| 0.2247        | 11.79 | 40   | 2.7224   | 30.5957         | 13.3496 | 21.9712 | 29.22   | 500.5828  |
-| 0.182         | 12.9  | 44   | 3.2715   | 41.6828         | 17.0818 | 28.087  | 39.5947 | 259.6568  |
-| 0.182         | 13.18 | 45   | 2.3973   | 31.9833         | 14.0141 | 22.6823 | 30.6424 | 484.3964  |
 ### Framework versions

 ---
 tags:
 - generated_from_trainer
 metrics:
 - rouge
 model-index:
 - name: longt5_xl_summ_screen_bp_only_30
+  results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 # longt5_xl_summ_screen_bp_only_30
+This model was trained from scratch on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 3.4990
+- Rouge1: 32.5815
+- Rouge2: 14.2951
+- Rougel: 22.4501
+- Rougelsum: 31.2928
+- Gen Len: 499.3107
 ## Model description
 - train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
 - gradient_accumulation_steps: 32
+- total_train_batch_size: 256
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: constant
 - num_epochs: 15.0
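
The drop in `total_train_batch_size` from 1024 to 256 is consistent with the usual effective-batch-size arithmetic: per-device batch size times gradient accumulation steps times number of devices. The sketch below reproduces both values. It assumes the updated run used a single device, which the new card does not state explicitly (it only omits `num_devices`), and the helper name `effective_batch_size` is hypothetical.

```python
def effective_batch_size(per_device_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of examples that contribute to one optimizer step."""
    return per_device_batch_size * gradient_accumulation_steps * num_devices


# Old card: train_batch_size=8, gradient_accumulation_steps=32, num_devices=4
old_total = effective_batch_size(8, 32, num_devices=4)

# New card: same per-device settings; num_devices=1 is an assumption
new_total = effective_batch_size(8, 32, num_devices=1)

print(old_total, new_total)  # 1024 256
```

With the per-device settings unchanged, each optimizer step in the updated run would see a quarter as many examples as before.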
 ### Training results
+| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len  |
+|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:--------:|
+| 0.324         | 0.97  | 14   | 2.2376          | 40.4388 | 16.4662 | 28.0771 | 38.3405   | 246.7396 |
+| 0.2707        | 1.95  | 28   | 2.3204          | 40.2873 | 16.7641 | 27.3895 | 38.2689   | 307.3787 |
+| 0.2217        | 2.99  | 43   | 2.5281          | 31.9916 | 13.8136 | 22.1895 | 30.623    | 501.9320 |
+| 0.1776        | 3.97  | 57   | 2.7530          | 31.7535 | 13.8852 | 22.8653 | 30.3796   | 489.6183 |
+| 0.1424        | 4.94  | 71   | 2.6578          | 32.117  | 14.2141 | 22.3733 | 30.8328   | 502.1124 |
+| 0.1449        | 5.98  | 86   | 2.5508          | 35.3448 | 13.8478 | 24.9044 | 33.6108   | 357.3136 |
+| 0.1191        | 6.96  | 100  | 3.1622          | 37.2189 | 16.0076 | 25.7011 | 35.294    | 408.8669 |
+| 0.0879        | 8.0   | 115  | 2.8510          | 39.8825 | 16.8073 | 27.2428 | 37.9568   | 318.2278 |
+| 0.0899        | 8.97  | 129  | 2.9138          | 31.7139 | 13.7066 | 21.8844 | 30.5075   | 500.4053 |
+| 0.0656        | 9.95  | 143  | 3.1616          | 33.055  | 14.5841 | 22.5883 | 31.7565   | 488.1686 |
+| 0.0542        | 10.99 | 158  | 3.3630          | 43.7514 | 18.9011 | 29.9017 | 41.6887   | 198.8077 |
+| 0.0557        | 11.97 | 172  | 3.3826          | 42.3089 | 18.2735 | 29.0356 | 40.4154   | 270.9675 |
+| 0.0542        | 12.94 | 186  | 3.4408          | 40.7691 | 16.529  | 28.3999 | 38.9723   | 186.7308 |
+| 0.0596        | 13.98 | 201  | 3.5253          | 37.0037 | 15.9098 | 25.2808 | 35.3868   | 398.4704 |
+| 0.0385        | 14.61 | 210  | 3.4990          | 32.5815 | 14.2951 | 22.4501 | 31.2928   | 499.3107 |
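
The last row (step 210) is what the card reports as the evaluation result, but it is not the strongest row in the table. As a rough illustration, the sketch below scans the step, validation loss, and Rouge1 values copied from the table above and reports which step scores best on each. The names `ROWS` and `best_steps` are illustrative only and are not part of the training code.

```python
# (step, validation_loss, rouge1) copied from the training results table
ROWS = [
    (14, 2.2376, 40.4388),
    (28, 2.3204, 40.2873),
    (43, 2.5281, 31.9916),
    (57, 2.7530, 31.7535),
    (71, 2.6578, 32.1170),
    (86, 2.5508, 35.3448),
    (100, 3.1622, 37.2189),
    (115, 2.8510, 39.8825),
    (129, 2.9138, 31.7139),
    (143, 3.1616, 33.0550),
    (158, 3.3630, 43.7514),
    (172, 3.3826, 42.3089),
    (186, 3.4408, 40.7691),
    (201, 3.5253, 37.0037),
    (210, 3.4990, 32.5815),
]


def best_steps(rows):
    """Return (step with highest Rouge1, step with lowest validation loss)."""
    best_rouge_step = max(rows, key=lambda r: r[2])[0]
    best_loss_step = min(rows, key=lambda r: r[1])[0]
    return best_rouge_step, best_loss_step


rouge_step, loss_step = best_steps(ROWS)
print(rouge_step, loss_step)  # 158 14
```

Validation loss rises almost steadily after the first evaluation while Rouge1 fluctuates, so the two criteria pick very different checkpoints.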
 ### Framework versions