End of training

Files changed (6)

README.md +15 -59
config.json +1 -1
model.safetensors +1 -1
runs/Apr06_21-02-34_df7953592bde/events.out.tfevents.1712437358.df7953592bde.168.8 +3 -0
tokenizer_config.json +0 -7
training_args.bin +1 -1

README.md CHANGED

@@ -1,6 +1,6 @@
 ---
 license: apache-2.0
-base_model: SpamAcc/ingredient_prune
 tags:
 - generated_from_trainer
 model-index:
@@ -13,9 +13,9 @@ should probably proofread and complete it, then remove this comment. -->
 # ingredient_prune
-This model is a fine-tuned version of [SpamAcc/ingredient_prune](https://huggingface.co/SpamAcc/ingredient_prune) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.0432
 ## Model description
@@ -40,68 +40,24 @@ The following hyperparameters were used during training:
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 100
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 0.312         | 1.82  | 100  | 0.0295          |
-| 0.0533        | 3.64  | 200  | 0.0149          |
-| 0.0247        | 5.45  | 300  | 0.0136          |
-| 0.0149        | 7.27  | 400  | 0.0124          |
-| 0.0114        | 9.09  | 500  | 0.0127          |
-| 0.0086        | 10.91 | 600  | 0.0127          |
-| 0.0075        | 12.73 | 700  | 0.0145          |
-| 0.0061        | 14.55 | 800  | 0.0151          |
-| 0.0058        | 16.36 | 900  | 0.0161          |
-| 0.0044        | 18.18 | 1000 | 0.0169          |
-| 0.0039        | 20.0  | 1100 | 0.0199          |
-| 0.0044        | 21.82 | 1200 | 0.0181          |
-| 0.0035        | 23.64 | 1300 | 0.0230          |
-| 0.0039        | 25.45 | 1400 | 0.0226          |
-| 0.0028        | 27.27 | 1500 | 0.0234          |
-| 0.0026        | 29.09 | 1600 | 0.0272          |
-| 0.0023        | 30.91 | 1700 | 0.0261          |
-| 0.0028        | 32.73 | 1800 | 0.0254          |
-| 0.0018        | 34.55 | 1900 | 0.0268          |
-| 0.0022        | 36.36 | 2000 | 0.0303          |
-| 0.002         | 38.18 | 2100 | 0.0286          |
-| 0.0018        | 40.0  | 2200 | 0.0299          |
-| 0.0024        | 41.82 | 2300 | 0.0322          |
-| 0.0019        | 43.64 | 2400 | 0.0328          |
-| 0.0015        | 45.45 | 2500 | 0.0310          |
-| 0.002         | 47.27 | 2600 | 0.0352          |
-| 0.0015        | 49.09 | 2700 | 0.0361          |
-| 0.0013        | 50.91 | 2800 | 0.0358          |
-| 0.0011        | 52.73 | 2900 | 0.0368          |
-| 0.0017        | 54.55 | 3000 | 0.0387          |
-| 0.0012        | 56.36 | 3100 | 0.0384          |
-| 0.0011        | 58.18 | 3200 | 0.0402          |
-| 0.0016        | 60.0  | 3300 | 0.0394          |
-| 0.0012        | 61.82 | 3400 | 0.0403          |
-| 0.0013        | 63.64 | 3500 | 0.0392          |
-| 0.0011        | 65.45 | 3600 | 0.0413          |
-| 0.0015        | 67.27 | 3700 | 0.0400          |
-| 0.0021        | 69.09 | 3800 | 0.0412          |
-| 0.0009        | 70.91 | 3900 | 0.0410          |
-| 0.0013        | 72.73 | 4000 | 0.0419          |
-| 0.0009        | 74.55 | 4100 | 0.0415          |
-| 0.0011        | 76.36 | 4200 | 0.0418          |
-| 0.0008        | 78.18 | 4300 | 0.0422          |
-| 0.0013        | 80.0  | 4400 | 0.0434          |
-| 0.0011        | 81.82 | 4500 | 0.0436          |
-| 0.0011        | 83.64 | 4600 | 0.0434          |
-| 0.0008        | 85.45 | 4700 | 0.0434          |
-| 0.0009        | 87.27 | 4800 | 0.0436          |
-| 0.0006        | 89.09 | 4900 | 0.0442          |
-| 0.0009        | 90.91 | 5000 | 0.0436          |
-| 0.001         | 92.73 | 5100 | 0.0434          |
-| 0.0008        | 94.55 | 5200 | 0.0433          |
-| 0.0013        | 96.36 | 5300 | 0.0434          |
-| 0.001         | 98.18 | 5400 | 0.0433          |
-| 0.0008        | 100.0 | 5500 | 0.0432          |
 ### Framework versions

 ---
 license: apache-2.0
+base_model: google/flan-t5-base
 tags:
 - generated_from_trainer
 model-index:
 # ingredient_prune
+This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.0153
 ## Model description
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 20
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 17.0417       | 1.82  | 100  | 4.0933          |
+| 2.66          | 3.64  | 200  | 0.3448          |
+| 0.4315        | 5.45  | 300  | 0.0862          |
+| 0.1           | 7.27  | 400  | 0.0291          |
+| 0.0454        | 9.09  | 500  | 0.0219          |
+| 0.0293        | 10.91 | 600  | 0.0176          |
+| 0.0212        | 12.73 | 700  | 0.0165          |
+| 0.0186        | 14.55 | 800  | 0.0155          |
+| 0.0155        | 16.36 | 900  | 0.0155          |
+| 0.0159        | 18.18 | 1000 | 0.0152          |
+| 0.0137        | 20.0  | 1100 | 0.0153          |
 ### Framework versions
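The Step and Epoch columns in the added table are consistent with 55 optimizer steps per epoch (1100 steps over 20 epochs). The sketch below checks that arithmetic and finds the lowest validation loss in each run. The values are copied from the two tables in this diff (only a subset of the removed table's rows is included), and the names `steps_per_epoch` and `best_step` are illustrative helpers, not part of the repository.

```python
from typing import Dict, Tuple

# Validation loss keyed by step, copied from the README diff.
# old_run holds only a subset of the removed 100-epoch table, including its minimum.
old_run: Dict[int, float] = {
    100: 0.0295, 200: 0.0149, 300: 0.0136, 400: 0.0124,
    500: 0.0127, 600: 0.0127, 5500: 0.0432,
}
# new_run holds every row of the added 20-epoch table.
new_run: Dict[int, float] = {
    100: 4.0933, 200: 0.3448, 300: 0.0862, 400: 0.0291, 500: 0.0219,
    600: 0.0176, 700: 0.0165, 800: 0.0155, 900: 0.0155, 1000: 0.0152,
    1100: 0.0153,
}


def steps_per_epoch(total_steps: int, num_epochs: int) -> float:
    """Average number of optimizer steps in one epoch."""
    return total_steps / num_epochs


def best_step(run: Dict[int, float]) -> Tuple[int, float]:
    """Return the (step, validation loss) pair with the lowest validation loss."""
    step = min(run, key=run.get)
    return step, run[step]


spe = steps_per_epoch(1100, 20)             # new run: 1100 steps over 20 epochs
first_logged_epoch = round(100 / spe, 2)    # epoch value logged at step 100

print(spe, first_logged_epoch)
print("old run best:", best_step(old_run), "final:", old_run[5500])
print("new run best:", best_step(new_run), "final:", new_run[1100])
```

In the removed run the lowest validation loss (0.0124) occurs at step 400, while the final value at step 5500 is 0.0432. In the added run the lowest value (0.0152, step 1000) and the final value (0.0153, step 1100) are nearly identical.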

config.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "SpamAcc/ingredient_prune",
   "architectures": [
     "T5ForConditionalGeneration"
   ],

 {
+  "_name_or_path": "google/flan-t5-base",
   "architectures": [
     "T5ForConditionalGeneration"
   ],

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a48c17f9338e9a40b3bf27cce0bc38c84ea418e590bfc95237ceeda033e26faa
 size 990345064

 version https://git-lfs.github.com/spec/v1
+oid sha256:9f4b8c55164dace8410732a460c81caaa18507a5b7771af315bb7f0ab3e61379
 size 990345064
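Because `model.safetensors` is tracked with Git LFS, the diff shows only the pointer file: three `key value` lines giving the spec version, the object hash, and the size in bytes. A minimal sketch for comparing the two pointers follows; `parse_lfs_pointer` is a hypothetical helper written for this illustration, not a Git LFS or Hugging Face API.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid field has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


old = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:a48c17f9338e9a40b3bf27cce0bc38c84ea418e590bfc95237ceeda033e26faa
size 990345064
""")
new = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:9f4b8c55164dace8410732a460c81caaa18507a5b7771af315bb7f0ab3e61379
size 990345064
""")

weights_changed = old["digest"] != new["digest"]
same_size = old["size"] == new["size"]
print(weights_changed, same_size)
```

A different digest with an identical size indicates that the stored bytes changed while the file length did not.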

runs/Apr06_21-02-34_df7953592bde/events.out.tfevents.1712437358.df7953592bde.168.8 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1204b7bf22fe1bdd01a3f65c43a0ce569825d7917a43e98e4b2e2254779473d4
+size 11263

tokenizer_config.json CHANGED

@@ -930,16 +930,9 @@
   "clean_up_tokenization_spaces": true,
   "eos_token": "</s>",
   "extra_ids": 100,
-  "max_length": 128,
   "model_max_length": 128,
-  "pad_to_multiple_of": null,
   "pad_token": "<pad>",
-  "pad_token_type_id": 0,
-  "padding_side": "right",
   "sp_model_kwargs": {},
-  "stride": 0,
   "tokenizer_class": "T5Tokenizer",
-  "truncation_side": "right",
-  "truncation_strategy": "longest_first",
   "unk_token": "<unk>"
 }

   "clean_up_tokenization_spaces": true,
   "eos_token": "</s>",
   "extra_ids": 100,
   "model_max_length": 128,
   "pad_token": "<pad>",
   "sp_model_kwargs": {},
   "tokenizer_class": "T5Tokenizer",
   "unk_token": "<unk>"
 }
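The `+0 -7` summary for `tokenizer_config.json` can be checked by comparing the keys on each side of the hunk. The sketch below uses only the keys visible in this hunk (the rest of the file is not shown in the diff), with JSON `null` written as Python `None`.

```python
# Keys and values visible in the old side of the hunk.
old_cfg = {
    "clean_up_tokenization_spaces": True,
    "eos_token": "</s>",
    "extra_ids": 100,
    "max_length": 128,
    "model_max_length": 128,
    "pad_to_multiple_of": None,
    "pad_token": "<pad>",
    "pad_token_type_id": 0,
    "padding_side": "right",
    "sp_model_kwargs": {},
    "stride": 0,
    "tokenizer_class": "T5Tokenizer",
    "truncation_side": "right",
    "truncation_strategy": "longest_first",
    "unk_token": "<unk>",
}
# Keys and values visible in the new side of the hunk.
new_cfg = {
    "clean_up_tokenization_spaces": True,
    "eos_token": "</s>",
    "extra_ids": 100,
    "model_max_length": 128,
    "pad_token": "<pad>",
    "sp_model_kwargs": {},
    "tokenizer_class": "T5Tokenizer",
    "unk_token": "<unk>",
}

removed = sorted(old_cfg.keys() - new_cfg.keys())
added = sorted(new_cfg.keys() - old_cfg.keys())
# Keys present on both sides whose values differ.
changed = sorted(k for k in old_cfg.keys() & new_cfg.keys() if old_cfg[k] != new_cfg[k])

print("removed:", removed)
print("added:", added)
print("changed:", changed)
```

Seven keys are removed, none are added, and every key kept in the hunk retains its value, including `model_max_length` at 128.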

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1e94952fd1a241da6a91d2143e1db7aca04a9dfb43c509bedad928a1c37d9c7f
 size 5048

 version https://git-lfs.github.com/spec/v1
+oid sha256:ead7e82110958dd17cdf6f4286954c9855174d651b7e318cfc571e901758dcd6
 size 5048