End of training

Files changed (6)

README.md +59 -31
config.json +1 -1
model.safetensors +1 -1
runs/Apr06_19-56-08_df7953592bde/events.out.tfevents.1712433370.df7953592bde.168.7 +3 -0
tokenizer_config.json +7 -0
training_args.bin +1 -1

README.md CHANGED

@@ -1,6 +1,6 @@
 ---
 license: apache-2.0
-base_model: google/flan-t5-base
 tags:
 - generated_from_trainer
 model-index:
@@ -13,9 +13,9 @@ should probably proofread and complete it, then remove this comment. -->
 # ingredient_prune
-This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.2171
 ## Model description
@@ -40,40 +40,68 @@ The following hyperparameters were used during training:
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 5
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 38.6247       | 0.18  | 10   | 28.2426         |
-| 26.1854       | 0.36  | 20   | 20.2002         |
-| 19.6623       | 0.55  | 30   | 13.6317         |
-| 13.5288       | 0.73  | 40   | 6.1384          |
-| 7.0646        | 0.91  | 50   | 4.3907          |
-| 4.6726        | 1.09  | 60   | 4.1267          |
-| 4.2044        | 1.27  | 70   | 3.8144          |
-| 3.9212        | 1.45  | 80   | 3.4817          |
-| 3.6409        | 1.64  | 90   | 2.9574          |
-| 3.2497        | 1.82  | 100  | 2.0126          |
-| 2.8668        | 2.0   | 110  | 1.5548          |
-| 2.5591        | 2.18  | 120  | 1.3483          |
-| 2.2817        | 2.36  | 130  | 0.9596          |
-| 2.0322        | 2.55  | 140  | 0.7737          |
-| 1.7896        | 2.73  | 150  | 0.6418          |
-| 1.5978        | 2.91  | 160  | 0.5350          |
-| 1.4263        | 3.09  | 170  | 0.4166          |
-| 1.3053        | 3.27  | 180  | 0.3914          |
-| 1.1636        | 3.45  | 190  | 0.3543          |
-| 1.0639        | 3.64  | 200  | 0.3054          |
-| 1.0036        | 3.82  | 210  | 0.2860          |
-| 0.9076        | 4.0   | 220  | 0.2683          |
-| 0.8769        | 4.18  | 230  | 0.2524          |
-| 0.8282        | 4.36  | 240  | 0.2333          |
-| 0.8092        | 4.55  | 250  | 0.2233          |
-| 0.771         | 4.73  | 260  | 0.2198          |
-| 0.7718        | 4.91  | 270  | 0.2171          |
 ### Framework versions

 ---
 license: apache-2.0
+base_model: SpamAcc/ingredient_prune
 tags:
 - generated_from_trainer
 model-index:
 # ingredient_prune
+This model is a fine-tuned version of [SpamAcc/ingredient_prune](https://huggingface.co/SpamAcc/ingredient_prune) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.0432
 ## Model description
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 100
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 0.312         | 1.82  | 100  | 0.0295          |
+| 0.0533        | 3.64  | 200  | 0.0149          |
+| 0.0247        | 5.45  | 300  | 0.0136          |
+| 0.0149        | 7.27  | 400  | 0.0124          |
+| 0.0114        | 9.09  | 500  | 0.0127          |
+| 0.0086        | 10.91 | 600  | 0.0127          |
+| 0.0075        | 12.73 | 700  | 0.0145          |
+| 0.0061        | 14.55 | 800  | 0.0151          |
+| 0.0058        | 16.36 | 900  | 0.0161          |
+| 0.0044        | 18.18 | 1000 | 0.0169          |
+| 0.0039        | 20.0  | 1100 | 0.0199          |
+| 0.0044        | 21.82 | 1200 | 0.0181          |
+| 0.0035        | 23.64 | 1300 | 0.0230          |
+| 0.0039        | 25.45 | 1400 | 0.0226          |
+| 0.0028        | 27.27 | 1500 | 0.0234          |
+| 0.0026        | 29.09 | 1600 | 0.0272          |
+| 0.0023        | 30.91 | 1700 | 0.0261          |
+| 0.0028        | 32.73 | 1800 | 0.0254          |
+| 0.0018        | 34.55 | 1900 | 0.0268          |
+| 0.0022        | 36.36 | 2000 | 0.0303          |
+| 0.002         | 38.18 | 2100 | 0.0286          |
+| 0.0018        | 40.0  | 2200 | 0.0299          |
+| 0.0024        | 41.82 | 2300 | 0.0322          |
+| 0.0019        | 43.64 | 2400 | 0.0328          |
+| 0.0015        | 45.45 | 2500 | 0.0310          |
+| 0.002         | 47.27 | 2600 | 0.0352          |
+| 0.0015        | 49.09 | 2700 | 0.0361          |
+| 0.0013        | 50.91 | 2800 | 0.0358          |
+| 0.0011        | 52.73 | 2900 | 0.0368          |
+| 0.0017        | 54.55 | 3000 | 0.0387          |
+| 0.0012        | 56.36 | 3100 | 0.0384          |
+| 0.0011        | 58.18 | 3200 | 0.0402          |
+| 0.0016        | 60.0  | 3300 | 0.0394          |
+| 0.0012        | 61.82 | 3400 | 0.0403          |
+| 0.0013        | 63.64 | 3500 | 0.0392          |
+| 0.0011        | 65.45 | 3600 | 0.0413          |
+| 0.0015        | 67.27 | 3700 | 0.0400          |
+| 0.0021        | 69.09 | 3800 | 0.0412          |
+| 0.0009        | 70.91 | 3900 | 0.0410          |
+| 0.0013        | 72.73 | 4000 | 0.0419          |
+| 0.0009        | 74.55 | 4100 | 0.0415          |
+| 0.0011        | 76.36 | 4200 | 0.0418          |
+| 0.0008        | 78.18 | 4300 | 0.0422          |
+| 0.0013        | 80.0  | 4400 | 0.0434          |
+| 0.0011        | 81.82 | 4500 | 0.0436          |
+| 0.0011        | 83.64 | 4600 | 0.0434          |
+| 0.0008        | 85.45 | 4700 | 0.0434          |
+| 0.0009        | 87.27 | 4800 | 0.0436          |
+| 0.0006        | 89.09 | 4900 | 0.0442          |
+| 0.0009        | 90.91 | 5000 | 0.0436          |
+| 0.001         | 92.73 | 5100 | 0.0434          |
+| 0.0008        | 94.55 | 5200 | 0.0433          |
+| 0.0013        | 96.36 | 5300 | 0.0434          |
+| 0.001         | 98.18 | 5400 | 0.0433          |
+| 0.0008        | 100.0 | 5500 | 0.0432          |
 ### Framework versions
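
Reading the new table: validation loss is lowest early in the run and then climbs while training loss keeps falling. The reported evaluation loss (0.0432) matches the last row, not the lowest one. A minimal sketch that makes this visible, using a handful of rows copied from the table above (the `rows` list and `best_row` helper are illustrative names, not part of the repository):

```python
# Rows copied from the new "Training results" table:
# (epoch, step, validation_loss). Only a subset is listed here.
rows = [
    (1.82, 100, 0.0295),
    (3.64, 200, 0.0149),
    (5.45, 300, 0.0136),
    (7.27, 400, 0.0124),
    (9.09, 500, 0.0127),
    (20.0, 1100, 0.0199),
    (40.0, 2200, 0.0299),
    (60.0, 3300, 0.0394),
    (80.0, 4400, 0.0434),
    (100.0, 5500, 0.0432),
]


def best_row(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r[2])


best = best_row(rows)
final = rows[-1]

# Steps per epoch follows from the last row: 5500 steps over 100 epochs.
steps_per_epoch = final[1] / final[0]

print(f"lowest validation loss {best[2]} at step {best[1]} (epoch {best[0]})")
print(f"final validation loss  {final[2]} at step {final[1]} (epoch {final[0]})")
print(f"steps per epoch: {steps_per_epoch}")
```

The diff does not show whether checkpoint selection was enabled. If it was not, options such as `load_best_model_at_end` in `TrainingArguments`, or simply fewer epochs, might be worth considering.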

config.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "google/flan-t5-base",
   "architectures": [
     "T5ForConditionalGeneration"
   ],

 {
+  "_name_or_path": "SpamAcc/ingredient_prune",
   "architectures": [
     "T5ForConditionalGeneration"
   ],

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2cf5897e03e0b92c1aaed543969de4de07cc6c4393b6a29244296c58f62dc21c
 size 990345064

 version https://git-lfs.github.com/spec/v1
+oid sha256:a48c17f9338e9a40b3bf27cce0bc38c84ea418e590bfc95237ceeda033e26faa
 size 990345064
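
The weights are stored as a Git LFS pointer: the `oid` changed while `size` stayed at 990345064 bytes, which is consistent with new values in tensors of the same shapes. As a rough sketch, a downloaded file could be checked against such a pointer as follows (`parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, not part of any library):

```python
import hashlib
import io


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(stream, pointer, chunk_size=1 << 20):
    """Hash a binary stream in chunks and compare digest and byte count to the pointer."""
    hasher = hashlib.new(pointer["algo"])
    total = 0
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        hasher.update(chunk)
        total += len(chunk)
    return total == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# The new pointer for model.safetensors, copied from the diff above.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:a48c17f9338e9a40b3bf27cce0bc38c84ea418e590bfc95237ceeda033e26faa
size 990345064
"""

pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["algo"], pointer["size"])

# Toy self-check on in-memory bytes; a real check would pass
# open("model.safetensors", "rb") instead of the BytesIO object.
payload = b"example payload"
toy_pointer = {
    "algo": "sha256",
    "digest": hashlib.sha256(payload).hexdigest(),
    "size": len(payload),
}
print(matches_pointer(io.BytesIO(payload), toy_pointer))
```

Reading in chunks keeps memory use flat, which matters for a file of roughly 990 MB.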

runs/Apr06_19-56-08_df7953592bde/events.out.tfevents.1712433370.df7953592bde.168.7 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a6fc1d9fe5b5b8e898bd913e1faa6d44d7e7023633bc771fbfd64e041d0ce5a5
+size 32505

tokenizer_config.json CHANGED

@@ -930,9 +930,16 @@
   "clean_up_tokenization_spaces": true,
   "eos_token": "</s>",
   "extra_ids": 100,
   "model_max_length": 128,
   "pad_token": "<pad>",
   "sp_model_kwargs": {},
   "tokenizer_class": "T5Tokenizer",
   "unk_token": "<unk>"
 }

   "clean_up_tokenization_spaces": true,
   "eos_token": "</s>",
   "extra_ids": 100,
+  "max_length": 128,
   "model_max_length": 128,
+  "pad_to_multiple_of": null,
   "pad_token": "<pad>",
+  "pad_token_type_id": 0,
+  "padding_side": "right",
   "sp_model_kwargs": {},
+  "stride": 0,
   "tokenizer_class": "T5Tokenizer",
+  "truncation_side": "right",
+  "truncation_strategy": "longest_first",
   "unk_token": "<unk>"
 }
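
The hunk above adds seven keys and leaves the existing ones untouched, matching the `+7 -0` in the file list. A small sketch that confirms this by comparing the two visible fragments (they are only the tail of the file, starting at line 930, so they are wrapped in braces here to make them parseable; `OLD` and `NEW` are illustrative names):

```python
import json

# Tail of tokenizer_config.json before the commit (partial file).
OLD = json.loads("""{
  "clean_up_tokenization_spaces": true,
  "eos_token": "</s>",
  "extra_ids": 100,
  "model_max_length": 128,
  "pad_token": "<pad>",
  "sp_model_kwargs": {},
  "tokenizer_class": "T5Tokenizer",
  "unk_token": "<unk>"
}""")

# Tail of tokenizer_config.json after the commit (partial file).
NEW = json.loads("""{
  "clean_up_tokenization_spaces": true,
  "eos_token": "</s>",
  "extra_ids": 100,
  "max_length": 128,
  "model_max_length": 128,
  "pad_to_multiple_of": null,
  "pad_token": "<pad>",
  "pad_token_type_id": 0,
  "padding_side": "right",
  "sp_model_kwargs": {},
  "stride": 0,
  "tokenizer_class": "T5Tokenizer",
  "truncation_side": "right",
  "truncation_strategy": "longest_first",
  "unk_token": "<unk>"
}""")

added = sorted(NEW.keys() - OLD.keys())
removed = sorted(OLD.keys() - NEW.keys())
changed = sorted(k for k in OLD.keys() & NEW.keys() if OLD[k] != NEW[k])

print("added:  ", added)
print("removed:", removed)
print("changed:", changed)
```

All of the added keys describe padding and truncation behaviour; the diff itself does not say why they were serialized in this commit.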

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:63fda25d44118605083d817d628379fe09399080fbc23a7bd7096cdd43600383
 size 5048

 version https://git-lfs.github.com/spec/v1
+oid sha256:1e94952fd1a241da6a91d2143e1db7aca04a9dfb43c509bedad928a1c37d9c7f
 size 5048