End of training

Files changed (4)

README.md +25 -25
model.safetensors +1 -1
runs/Jan13_20-45-16_3eda5e562e43/events.out.tfevents.1705178717.3eda5e562e43.2889.0 +2 -2
runs/Jan13_20-45-16_3eda5e562e43/events.out.tfevents.1705181178.3eda5e562e43.2889.1 +3 -0

README.md CHANGED

@@ -17,11 +17,11 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [google/flan-t5-small](https://huggingface.co/google/flan-t5-small) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.2414
-- Rouge: {'rouge1': 38.3588, 'rouge2': 17.983, 'rougeL': 20.1917, 'rougeLsum': 20.1917}
-- Bert Score: 0.8806
-- Bleurt 20: -0.7794
-- Gen Len: 13.44
 ## Model description
@@ -52,26 +52,26 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | Rouge                                                                           | Bert Score | Bleurt 20 | Gen Len |
 |:-------------:|:-----:|:----:|:---------------:|:-------------------------------------------------------------------------------:|:----------:|:---------:|:-------:|
-| 2.7663        | 1.0   | 186  | 2.4069          | {'rouge1': 43.4548, 'rouge2': 17.3297, 'rougeL': 18.9728, 'rougeLsum': 18.9728} | 0.874      | -0.8387   | 14.275  |
-| 2.4668        | 2.0   | 372  | 2.3255          | {'rouge1': 42.9892, 'rouge2': 18.518, 'rougeL': 19.7631, 'rougeLsum': 19.7631}  | 0.8763     | -0.8091   | 13.965  |
-| 2.2692        | 3.0   | 558  | 2.2633          | {'rouge1': 36.8257, 'rouge2': 16.1751, 'rougeL': 17.9916, 'rougeLsum': 17.9916} | 0.8744     | -0.8312   | 12.955  |
-| 2.2018        | 4.0   | 744  | 2.2481          | {'rouge1': 40.4112, 'rouge2': 18.1938, 'rougeL': 20.0606, 'rougeLsum': 20.0606} | 0.877      | -0.7846   | 14.04   |
-| 2.1736        | 5.0   | 930  | 2.2243          | {'rouge1': 39.2656, 'rouge2': 18.4718, 'rougeL': 19.5926, 'rougeLsum': 19.5926} | 0.8786     | -0.7865   | 13.31   |
-| 2.0189        | 6.0   | 1116 | 2.2220          | {'rouge1': 38.1992, 'rouge2': 18.0936, 'rougeL': 18.6278, 'rougeLsum': 18.6278} | 0.877      | -0.8295   | 13.3    |
-| 1.9425        | 7.0   | 1302 | 2.2103          | {'rouge1': 38.9165, 'rouge2': 18.0013, 'rougeL': 19.2571, 'rougeLsum': 19.2571} | 0.8779     | -0.7923   | 13.445  |
-| 1.9192        | 8.0   | 1488 | 2.2060          | {'rouge1': 37.6615, 'rouge2': 18.1423, 'rougeL': 19.3882, 'rougeLsum': 19.3882} | 0.8773     | -0.814    | 13.135  |
-| 1.8502        | 9.0   | 1674 | 2.1948          | {'rouge1': 37.595, 'rouge2': 17.5944, 'rougeL': 19.4897, 'rougeLsum': 19.4897}  | 0.8809     | -0.7914   | 13.15   |
-| 1.8201        | 10.0  | 1860 | 2.1995          | {'rouge1': 38.7935, 'rouge2': 19.2667, 'rougeL': 20.5059, 'rougeLsum': 20.5059} | 0.8809     | -0.7765   | 13.36   |
-| 1.7472        | 11.0  | 2046 | 2.2036          | {'rouge1': 37.4728, 'rouge2': 17.5974, 'rougeL': 19.5534, 'rougeLsum': 19.5534} | 0.8797     | -0.7943   | 13.245  |
-| 1.772         | 12.0  | 2232 | 2.2050          | {'rouge1': 37.6136, 'rouge2': 17.442, 'rougeL': 20.122, 'rougeLsum': 20.122}    | 0.881      | -0.7765   | 13.35   |
-| 1.7273        | 13.0  | 2418 | 2.2153          | {'rouge1': 37.2238, 'rouge2': 16.6237, 'rougeL': 19.4117, 'rougeLsum': 19.4117} | 0.8789     | -0.7929   | 13.325  |
-| 1.6854        | 14.0  | 2604 | 2.2243          | {'rouge1': 38.1249, 'rouge2': 18.0241, 'rougeL': 20.485, 'rougeLsum': 20.485}   | 0.8822     | -0.778    | 13.315  |
-| 1.6598        | 15.0  | 2790 | 2.2299          | {'rouge1': 37.3743, 'rouge2': 17.3192, 'rougeL': 19.9239, 'rougeLsum': 19.9239} | 0.8795     | -0.7805   | 13.275  |
-| 1.63          | 16.0  | 2976 | 2.2286          | {'rouge1': 38.6731, 'rouge2': 18.2088, 'rougeL': 20.2535, 'rougeLsum': 20.2535} | 0.8801     | -0.7882   | 13.415  |
-| 1.6654        | 17.0  | 3162 | 2.2355          | {'rouge1': 38.0295, 'rouge2': 17.6256, 'rougeL': 19.9215, 'rougeLsum': 19.9215} | 0.8799     | -0.7894   | 13.34   |
-| 1.6443        | 18.0  | 3348 | 2.2404          | {'rouge1': 38.3122, 'rouge2': 17.5836, 'rougeL': 19.8706, 'rougeLsum': 19.8706} | 0.8801     | -0.7799   | 13.45   |
-| 1.6083        | 19.0  | 3534 | 2.2399          | {'rouge1': 38.1749, 'rouge2': 17.4993, 'rougeL': 20.0054, 'rougeLsum': 20.0054} | 0.8801     | -0.7772   | 13.435  |
-| 1.5953        | 20.0  | 3720 | 2.2414          | {'rouge1': 38.3588, 'rouge2': 17.983, 'rougeL': 20.1917, 'rougeLsum': 20.1917}  | 0.8806     | -0.7794   | 13.44   |
 ### Framework versions

 This model is a fine-tuned version of [google/flan-t5-small](https://huggingface.co/google/flan-t5-small) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 2.2431
+- Rouge: {'rouge1': 39.1164, 'rouge2': 19.0784, 'rougeL': 20.2856, 'rougeLsum': 20.2856}
+- Bert Score: 0.8802
+- Bleurt 20: -0.7688
+- Gen Len: 13.545
 ## Model description
 | Training Loss | Epoch | Step | Validation Loss | Rouge                                                                           | Bert Score | Bleurt 20 | Gen Len |
 |:-------------:|:-----:|:----:|:---------------:|:-------------------------------------------------------------------------------:|:----------:|:---------:|:-------:|
+| 2.6921        | 1.0   | 172  | 2.4379          | {'rouge1': 43.7984, 'rouge2': 17.2952, 'rougeL': 18.5604, 'rougeLsum': 18.5604} | 0.869      | -0.8739   | 14.84   |
+| 2.5119        | 2.0   | 344  | 2.3282          | {'rouge1': 41.5219, 'rouge2': 17.4612, 'rougeL': 19.5103, 'rougeLsum': 19.5103} | 0.8749     | -0.8329   | 13.7    |
+| 2.3033        | 3.0   | 516  | 2.2821          | {'rouge1': 41.0636, 'rouge2': 18.2347, 'rougeL': 19.8704, 'rougeLsum': 19.8704} | 0.878      | -0.8268   | 13.75   |
+| 2.2139        | 4.0   | 688  | 2.2404          | {'rouge1': 39.9679, 'rouge2': 18.8795, 'rougeL': 19.7032, 'rougeLsum': 19.7032} | 0.8796     | -0.8035   | 13.305  |
+| 2.0835        | 5.0   | 860  | 2.2446          | {'rouge1': 41.8958, 'rouge2': 18.439, 'rougeL': 19.2982, 'rougeLsum': 19.2982}  | 0.877      | -0.7963   | 14.34   |
+| 2.0379        | 6.0   | 1032 | 2.2233          | {'rouge1': 40.9703, 'rouge2': 19.7574, 'rougeL': 19.9387, 'rougeLsum': 19.9387} | 0.8793     | -0.7805   | 13.625  |
+| 1.959         | 7.0   | 1204 | 2.2073          | {'rouge1': 39.2194, 'rouge2': 18.9553, 'rougeL': 19.7847, 'rougeLsum': 19.7847} | 0.8787     | -0.8045   | 13.365  |
+| 1.9177        | 8.0   | 1376 | 2.2146          | {'rouge1': 40.8391, 'rouge2': 19.5219, 'rougeL': 20.2602, 'rougeLsum': 20.2602} | 0.8781     | -0.7974   | 13.865  |
+| 1.8749        | 9.0   | 1548 | 2.2071          | {'rouge1': 40.9497, 'rouge2': 19.9867, 'rougeL': 20.5682, 'rougeLsum': 20.5682} | 0.8808     | -0.7812   | 13.68   |
+| 1.8112        | 10.0  | 1720 | 2.2045          | {'rouge1': 36.465, 'rouge2': 16.4287, 'rougeL': 19.1978, 'rougeLsum': 19.1978}  | 0.8772     | -0.8384   | 13.295  |
+| 1.7475        | 11.0  | 1892 | 2.2210          | {'rouge1': 39.4889, 'rouge2': 19.1309, 'rougeL': 19.879, 'rougeLsum': 19.879}   | 0.8785     | -0.8074   | 13.585  |
+| 1.7384        | 12.0  | 2064 | 2.2269          | {'rouge1': 38.2904, 'rouge2': 18.2873, 'rougeL': 19.4418, 'rougeLsum': 19.4418} | 0.8789     | -0.7984   | 13.42   |
+| 1.6849        | 13.0  | 2236 | 2.2261          | {'rouge1': 37.6283, 'rouge2': 17.6979, 'rougeL': 19.584, 'rougeLsum': 19.584}   | 0.878      | -0.7885   | 13.445  |
+| 1.6531        | 14.0  | 2408 | 2.2186          | {'rouge1': 38.7975, 'rouge2': 19.0939, 'rougeL': 20.7873, 'rougeLsum': 20.7873} | 0.8806     | -0.783    | 13.445  |
+| 1.663         | 15.0  | 2580 | 2.2245          | {'rouge1': 38.9159, 'rouge2': 19.153, 'rougeL': 20.5232, 'rougeLsum': 20.5232}  | 0.8811     | -0.7514   | 13.59   |
+| 1.6036        | 16.0  | 2752 | 2.2430          | {'rouge1': 37.6184, 'rouge2': 17.6773, 'rougeL': 19.2693, 'rougeLsum': 19.2693} | 0.8771     | -0.7992   | 13.6    |
+| 1.6333        | 17.0  | 2924 | 2.2418          | {'rouge1': 38.1301, 'rouge2': 18.4061, 'rougeL': 20.1355, 'rougeLsum': 20.1355} | 0.879      | -0.7845   | 13.49   |
+| 1.6322        | 18.0  | 3096 | 2.2421          | {'rouge1': 38.0746, 'rouge2': 18.2039, 'rougeL': 19.7404, 'rougeLsum': 19.7404} | 0.8789     | -0.7892   | 13.41   |
+| 1.5982        | 19.0  | 3268 | 2.2411          | {'rouge1': 39.1375, 'rouge2': 19.1696, 'rougeL': 20.2695, 'rougeLsum': 20.2695} | 0.8802     | -0.7713   | 13.465  |
+| 1.593         | 20.0  | 3440 | 2.2431          | {'rouge1': 39.1164, 'rouge2': 19.0784, 'rougeL': 20.2856, 'rougeLsum': 20.2856} | 0.8802     | -0.7688   | 13.545  |
 ### Framework versions
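
The updated results table can be read programmatically to compare epochs. The snippet below is a minimal sketch, not part of the commit: it parses three rows copied verbatim from the added side of the table and picks the epoch with the lowest validation loss and the one with the highest `rougeL`. The column keys in `COLUMNS` and the helper `parse_row` are hypothetical names chosen for illustration.

```python
import ast

# Three rows copied verbatim from the updated ("+") side of the results table.
ROWS = """
| 1.8112        | 10.0  | 1720 | 2.2045          | {'rouge1': 36.465, 'rouge2': 16.4287, 'rougeL': 19.1978, 'rougeLsum': 19.1978}  | 0.8772     | -0.8384   | 13.295  |
| 1.663         | 15.0  | 2580 | 2.2245          | {'rouge1': 38.9159, 'rouge2': 19.153, 'rougeL': 20.5232, 'rougeLsum': 20.5232}  | 0.8811     | -0.7514   | 13.59   |
| 1.593         | 20.0  | 3440 | 2.2431          | {'rouge1': 39.1164, 'rouge2': 19.0784, 'rougeL': 20.2856, 'rougeLsum': 20.2856} | 0.8802     | -0.7688   | 13.545  |
"""

# Hypothetical key names, one per table column, in table order.
COLUMNS = ["train_loss", "epoch", "step", "val_loss",
           "rouge", "bert_score", "bleurt_20", "gen_len"]


def parse_row(line):
    """Turn one markdown table row into a dict of typed values."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    row = dict(zip(COLUMNS, cells))
    # The Rouge cell is a Python dict literal, so literal_eval reads it safely.
    row["rouge"] = ast.literal_eval(row["rouge"])
    for key in COLUMNS:
        if key != "rouge":
            row[key] = float(row[key])
    return row


rows = [parse_row(line) for line in ROWS.strip().splitlines()]
best_by_loss = min(rows, key=lambda r: r["val_loss"])
best_by_rouge_l = max(rows, key=lambda r: r["rouge"]["rougeL"])

print("lowest validation loss:", best_by_loss["epoch"], best_by_loss["val_loss"])
print("highest rougeL:", best_by_rouge_l["epoch"], best_by_rouge_l["rouge"]["rougeL"])
```

In the full updated table, validation loss is lowest at epoch 10 (2.2045), while the summary at the top of the model card reports the final epoch 20 values (loss 2.2431).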

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:fc1afdd2e3152523e27b2a60ce576d9bad9fa3a3e5012584c797f783e2abb513
 size 307867048

 version https://git-lfs.github.com/spec/v1
+oid sha256:1e72d525aabc520c410fe9deddbfbacd532bf4bdec8a6b04750635c0995f2d48
 size 307867048
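
The `model.safetensors` change is a Git LFS pointer update: only the `oid` line differs, and the `size` is the same on both sides. As a minimal sketch (the helper `parse_lfs_pointer` is a hypothetical name, not part of any library), the two pointers above can be parsed and compared like this:

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file (key/value lines) into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Old and new pointers, copied from the diff above.
OLD = """version https://git-lfs.github.com/spec/v1
oid sha256:fc1afdd2e3152523e27b2a60ce576d9bad9fa3a3e5012584c797f783e2abb513
size 307867048
"""
NEW = """version https://git-lfs.github.com/spec/v1
oid sha256:1e72d525aabc520c410fe9deddbfbacd532bf4bdec8a6b04750635c0995f2d48
size 307867048
"""

old, new = parse_lfs_pointer(OLD), parse_lfs_pointer(NEW)
print("same size:", old["size"] == new["size"])
print("content changed:", old["digest"] != new["digest"])
```

An identical byte size with a different digest is consistent with retrained weights for the same architecture, although the pointer alone cannot confirm that.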

runs/Jan13_20-45-16_3eda5e562e43/events.out.tfevents.1705178717.3eda5e562e43.2889.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:87981cb80dcd861da5ad580e5f56a071b13f60e6deea7cf54c91232f65b884ed
-size 18882

 version https://git-lfs.github.com/spec/v1
+oid sha256:fba5ee9170cec9d8a57ccf0e711d9bff7d5c1be4a7e5764a04c26ee1f869273c
+size 21308

runs/Jan13_20-45-16_3eda5e562e43/events.out.tfevents.1705181178.3eda5e562e43.2889.1 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d4d30d6661f51a644927eb317ad0fec3a698cc35aaa2081910af2e38e50ed648
+size 517