End of training


Files changed (15)

README.md +43 -43
config.json +1 -1
final_checkpoint/config.json +1 -1
final_checkpoint/generation_config.json +1 -1
final_checkpoint/model-00001-of-00004.safetensors +1 -1
final_checkpoint/model-00002-of-00004.safetensors +1 -1
final_checkpoint/model-00003-of-00004.safetensors +1 -1
final_checkpoint/model-00004-of-00004.safetensors +1 -1
generation_config.json +1 -1
model-00001-of-00004.safetensors +1 -1
model-00002-of-00004.safetensors +1 -1
model-00003-of-00004.safetensors +1 -1
model-00004-of-00004.safetensors +1 -1
tokenizer.json +1 -1
training_args.bin +1 -1

README.md CHANGED

@@ -17,7 +17,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.4527
 ## Model description
@@ -51,51 +51,51 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch   | Step | Validation Loss |
 |:-------------:|:-------:|:----:|:---------------:|
-| 2.7452        | 0.3333  | 25   | 2.7748          |
-| 2.7844        | 0.6667  | 50   | 2.7704          |
-| 2.7915        | 1.0     | 75   | 2.7596          |
-| 2.7945        | 1.3333  | 100  | 2.7379          |
-| 2.6611        | 1.6667  | 125  | 2.7124          |
-| 2.5528        | 2.0     | 150  | 2.6876          |
-| 2.6652        | 2.3333  | 175  | 2.6628          |
-| 2.6808        | 2.6667  | 200  | 2.6394          |
-| 2.668         | 3.0     | 225  | 2.6175          |
-| 2.5973        | 3.3333  | 250  | 2.5970          |
-| 2.4943        | 3.6667  | 275  | 2.5785          |
-| 2.5433        | 4.0     | 300  | 2.5620          |
-| 2.6383        | 4.3333  | 325  | 2.5468          |
-| 2.5221        | 4.6667  | 350  | 2.5333          |
-| 2.5698        | 5.0     | 375  | 2.5210          |
-| 2.5026        | 5.3333  | 400  | 2.5108          |
-| 2.5267        | 5.6667  | 425  | 2.5004          |
-| 2.4484        | 6.0     | 450  | 2.4920          |
-| 2.4735        | 6.3333  | 475  | 2.4844          |
-| 2.3763        | 6.6667  | 500  | 2.4780          |
-| 2.5461        | 7.0     | 525  | 2.4729          |
-| 2.5406        | 7.3333  | 550  | 2.4691          |
-| 2.4936        | 7.6667  | 575  | 2.4645          |
-| 2.4328        | 8.0     | 600  | 2.4615          |
-| 2.4954        | 8.3333  | 625  | 2.4590          |
-| 2.4458        | 8.6667  | 650  | 2.4564          |
-| 2.5661        | 9.0     | 675  | 2.4550          |
-| 2.4158        | 9.3333  | 700  | 2.4542          |
-| 2.4964        | 9.6667  | 725  | 2.4537          |
-| 2.5488        | 10.0    | 750  | 2.4530          |
-| 2.4364        | 10.3333 | 775  | 2.4530          |
-| 2.3929        | 10.6667 | 800  | 2.4520          |
-| 2.536         | 11.0    | 825  | 2.4528          |
-| 2.5173        | 11.3333 | 850  | 2.4526          |
-| 2.4415        | 11.6667 | 875  | 2.4524          |
-| 2.5111        | 12.0    | 900  | 2.4522          |
-| 2.4223        | 12.3333 | 925  | 2.4528          |
-| 2.4031        | 12.6667 | 950  | 2.4527          |
-| 2.4848        | 13.0    | 975  | 2.4527          |
-| 2.4349        | 13.3333 | 1000 | 2.4527          |
 ### Framework versions
-- Transformers 4.41.1
 - Pytorch 2.0.0+cu117
-- Datasets 2.19.1
 - Tokenizers 0.19.1

 This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.6055
 ## Model description
 | Training Loss | Epoch   | Step | Validation Loss |
 |:-------------:|:-------:|:----:|:---------------:|
+| 2.4485        | 0.3333  | 25   | 2.4666          |
+| 2.4645        | 0.6667  | 50   | 2.4522          |
+| 2.452         | 1.0     | 75   | 2.4164          |
+| 2.391         | 1.3333  | 100  | 2.3529          |
+| 2.2816        | 1.6667  | 125  | 2.2866          |
+| 2.175         | 2.0     | 150  | 2.2255          |
+| 2.2168        | 2.3333  | 175  | 2.1683          |
+| 2.1574        | 2.6667  | 200  | 2.1166          |
+| 2.1107        | 3.0     | 225  | 2.0679          |
+| 2.0126        | 3.3333  | 250  | 2.0229          |
+| 1.9353        | 3.6667  | 275  | 1.9810          |
+| 1.9552        | 4.0     | 300  | 1.9445          |
+| 1.9759        | 4.3333  | 325  | 1.9100          |
+| 1.8721        | 4.6667  | 350  | 1.8773          |
+| 1.8928        | 5.0     | 375  | 1.8491          |
+| 1.8331        | 5.3333  | 400  | 1.8236          |
+| 1.8221        | 5.6667  | 425  | 1.7980          |
+| 1.7615        | 6.0     | 450  | 1.7762          |
+| 1.7701        | 6.3333  | 475  | 1.7562          |
+| 1.7034        | 6.6667  | 500  | 1.7327          |
+| 1.7471        | 7.0     | 525  | 1.7064          |
+| 1.7317        | 7.3333  | 550  | 1.6831          |
+| 1.6897        | 7.6667  | 575  | 1.6645          |
+| 1.6452        | 8.0     | 600  | 1.6476          |
+| 1.6675        | 8.3333  | 625  | 1.6327          |
+| 1.569         | 8.6667  | 650  | 1.6238          |
+| 1.705         | 9.0     | 675  | 1.6163          |
+| 1.6025        | 9.3333  | 700  | 1.6121          |
+| 1.6224        | 9.6667  | 725  | 1.6083          |
+| 1.6976        | 10.0    | 750  | 1.6074          |
+| 1.6031        | 10.3333 | 775  | 1.6059          |
+| 1.5703        | 10.6667 | 800  | 1.6046          |
+| 1.6563        | 11.0    | 825  | 1.6055          |
+| 1.6464        | 11.3333 | 850  | 1.6059          |
+| 1.6075        | 11.6667 | 875  | 1.6055          |
+| 1.6453        | 12.0    | 900  | 1.6057          |
+| 1.5754        | 12.3333 | 925  | 1.6054          |
+| 1.5962        | 12.6667 | 950  | 1.6055          |
+| 1.6333        | 13.0    | 975  | 1.6055          |
+| 1.6086        | 13.3333 | 1000 | 1.6055          |
 ### Framework versions
+- Transformers 4.41.2
 - Pytorch 2.0.0+cu117
+- Datasets 2.19.2
 - Tokenizers 0.19.1
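
The two README versions report final validation losses of 2.4527 (old) and 1.6055 (new). A minimal sketch for putting these on a more interpretable scale, assuming the reported loss is the mean per-token cross-entropy in nats (the usual convention for causal-LM training, but not stated in this diff). It also recovers the step-to-epoch mapping from the table, where step 75 is logged at epoch 1.0. The helper names are illustrative only.

```python
import math

def perplexity(mean_ce_loss_nats: float) -> float:
    """Perplexity = exp(mean cross-entropy), assuming the loss is in nats."""
    return math.exp(mean_ce_loss_nats)

# Final validation losses taken from the README diff.
OLD_LOSS = 2.4527
NEW_LOSS = 1.6055

# From the results table: step 75 is logged at epoch 1.0.
STEPS_PER_EPOCH = 75

def epoch_at(step: int, steps_per_epoch: int = STEPS_PER_EPOCH) -> float:
    """Fractional epoch reached at a given training step."""
    return step / steps_per_epoch

if __name__ == "__main__":
    print(f"old perplexity: {perplexity(OLD_LOSS):.2f}")
    print(f"new perplexity: {perplexity(NEW_LOSS):.2f}")
    print(f"epoch at step 1000: {epoch_at(1000):.4f}")
```

Under that assumption, the drop in loss corresponds to roughly halving the validation perplexity, and both runs stop at step 1000, about 13.33 epochs.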

config.json CHANGED

@@ -23,7 +23,7 @@
   "rope_theta": 500000.0,
   "tie_word_embeddings": false,
   "torch_dtype": "float16",
-  "transformers_version": "4.41.1",
   "use_cache": false,
   "vocab_size": 128256
 }

   "rope_theta": 500000.0,
   "tie_word_embeddings": false,
   "torch_dtype": "float16",
+  "transformers_version": "4.41.2",
   "use_cache": false,
   "vocab_size": 128256
 }

final_checkpoint/config.json CHANGED

@@ -23,7 +23,7 @@
   "rope_theta": 500000.0,
   "tie_word_embeddings": false,
   "torch_dtype": "float16",
-  "transformers_version": "4.41.1",
   "use_cache": false,
   "vocab_size": 128256
 }

   "rope_theta": 500000.0,
   "tie_word_embeddings": false,
   "torch_dtype": "float16",
+  "transformers_version": "4.41.2",
   "use_cache": false,
   "vocab_size": 128256
 }

final_checkpoint/generation_config.json CHANGED

@@ -8,5 +8,5 @@
   "max_length": 4096,
   "temperature": 0.6,
   "top_p": 0.9,
-  "transformers_version": "4.41.1"
 }

   "max_length": 4096,
   "temperature": 0.6,
   "top_p": 0.9,
+  "transformers_version": "4.41.2"
 }
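
The visible part of this generation config sets `temperature` to 0.6 and `top_p` to 0.9. A simplified sketch of what those two parameters mean for sampling in general is given here. It is not the Transformers implementation, the function names are hypothetical, and whether sampling is enabled at all (for example via `do_sample`) lies outside the hunk shown.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities after dividing by the temperature.

    Temperatures below 1.0 sharpen the distribution; above 1.0 flatten it.
    """
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    """Return indices of the smallest set of tokens, taken in order of
    decreasing probability, whose cumulative probability reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept

if __name__ == "__main__":
    logits = [2.0, 1.0, 0.0, -1.0]  # made-up logits for illustration
    probs = softmax_with_temperature(logits, temperature=0.6)
    print(top_p_filter(probs, top_p=0.9))
```

A sampler would then renormalize the probabilities of the kept indices and draw the next token from that reduced set.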

final_checkpoint/model-00001-of-00004.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:de02372fd73778f94cb22aaa5d36a19fc59c63df0b5e265f6a967e3ef67d53ff
 size 4976698592

 version https://git-lfs.github.com/spec/v1
+oid sha256:f0aa8f9daf2a398aaffc1fe86488894d30002a2ec16e3296a468a28aa2e4f7c4
 size 4976698592
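
Each `.safetensors` entry in this diff is a Git LFS pointer: a `version` line, an `oid sha256:` digest, and a `size` in bytes. Here the digest changes while the size stays the same, which is what one would expect from weights of the same shape with different values. A minimal sketch for checking a downloaded shard against such a pointer, assuming the file is available locally; the function names are illustrative and not part of any library.

```python
import hashlib
from pathlib import Path

def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its version, oid and size fields."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    if algo != "sha256":
        raise ValueError(f"unsupported hash algorithm: {algo}")
    return {"version": fields["version"], "oid": digest, "size": int(fields["size"])}

def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's size and SHA-256 digest match the pointer."""
    if path.stat().st_size != pointer["size"]:
        return False  # cheap check first, avoids hashing multi-GB files
    h = hashlib.sha256()
    with path.open("rb") as f:
        # Read in chunks so large shards are never loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest() == pointer["oid"]

# New pointer for final_checkpoint/model-00001-of-00004.safetensors, from the diff.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:f0aa8f9daf2a398aaffc1fe86488894d30002a2ec16e3296a468a28aa2e4f7c4
size 4976698592
"""
```

Calling `matches_pointer(Path("model-00001-of-00004.safetensors"), parse_lfs_pointer(POINTER_TEXT))` would then confirm that a local copy corresponds to the new revision rather than the old one.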

final_checkpoint/model-00002-of-00004.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:466c281810a37d3739ba16110746dbf54a611c75e04dac1b908086c3d4be1873
 size 4999802616

 version https://git-lfs.github.com/spec/v1
+oid sha256:3a116f5f2d2e8b3db2185069bdccb1cd0931e72fd5ab848484df6c35198b4183
 size 4999802616

final_checkpoint/model-00003-of-00004.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:e9a7a4324ac13665afe367be2bd6c4c02698936f094124d9a3db855f71893e24
 size 4915916080

 version https://git-lfs.github.com/spec/v1
+oid sha256:1bdd653468c11abc2e5e958a30fe08553ead1585dd66d322a0b75ad5fe8906d5
 size 4915916080

final_checkpoint/model-00004-of-00004.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f26db4ce2a22a214d06f7b9c0b028bea622f6325b068e71aad03e26b4a554589
 size 1168138808

 version https://git-lfs.github.com/spec/v1
+oid sha256:56b9a072fd07f9d6759e65680be2e0b27be4aad25944640931af34d88f6123c4
 size 1168138808

generation_config.json CHANGED

@@ -8,5 +8,5 @@
   "max_length": 4096,
   "temperature": 0.6,
   "top_p": 0.9,
-  "transformers_version": "4.41.1"
 }

   "max_length": 4096,
   "temperature": 0.6,
   "top_p": 0.9,
+  "transformers_version": "4.41.2"
 }

model-00001-of-00004.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:de02372fd73778f94cb22aaa5d36a19fc59c63df0b5e265f6a967e3ef67d53ff
 size 4976698592

 version https://git-lfs.github.com/spec/v1
+oid sha256:f0aa8f9daf2a398aaffc1fe86488894d30002a2ec16e3296a468a28aa2e4f7c4
 size 4976698592

model-00002-of-00004.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:466c281810a37d3739ba16110746dbf54a611c75e04dac1b908086c3d4be1873
 size 4999802616

 version https://git-lfs.github.com/spec/v1
+oid sha256:3a116f5f2d2e8b3db2185069bdccb1cd0931e72fd5ab848484df6c35198b4183
 size 4999802616

model-00003-of-00004.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:e9a7a4324ac13665afe367be2bd6c4c02698936f094124d9a3db855f71893e24
 size 4915916080

 version https://git-lfs.github.com/spec/v1
+oid sha256:1bdd653468c11abc2e5e958a30fe08553ead1585dd66d322a0b75ad5fe8906d5
 size 4915916080

model-00004-of-00004.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f26db4ce2a22a214d06f7b9c0b028bea622f6325b068e71aad03e26b4a554589
 size 1168138808

 version https://git-lfs.github.com/spec/v1
+oid sha256:56b9a072fd07f9d6759e65680be2e0b27be4aad25944640931af34d88f6123c4
 size 1168138808

tokenizer.json CHANGED

@@ -2,7 +2,7 @@
   "version": "1.0",
   "truncation": {
     "direction": "Right",
-    "max_length": 100,
     "strategy": "LongestFirst",
     "stride": 0
   },

   "version": "1.0",
   "truncation": {
     "direction": "Right",
+    "max_length": 1024,
     "strategy": "LongestFirst",
     "stride": 0
   },
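
The only change to `tokenizer.json` is the saved truncation `max_length`, raised from 100 to 1024 with `direction` left at `"Right"`. A simplified sketch of what that setting does to a single sequence follows. It is a plain-Python illustration, not the `tokenizers` library itself: it ignores special tokens, `strategy`, and `stride`, and the `truncate_ids` helper is hypothetical.

```python
import json

def truncate_ids(ids: list, truncation: dict) -> list:
    """Apply a simplified single-sequence truncation to a list of token ids.

    Only `max_length` and `direction` are modelled; `strategy` and `stride`
    concern sequence pairs and overflow windows and are ignored here.
    """
    max_length = truncation["max_length"]
    if len(ids) <= max_length:
        return ids
    if truncation["direction"] == "Right":
        return ids[:max_length]              # drop tokens from the right end
    return ids[len(ids) - max_length:]       # "Left": drop tokens from the left end

# Truncation blocks before and after this commit, as shown in the diff.
OLD = json.loads('{"direction": "Right", "max_length": 100, "strategy": "LongestFirst", "stride": 0}')
NEW = json.loads('{"direction": "Right", "max_length": 1024, "strategy": "LongestFirst", "stride": 0}')

if __name__ == "__main__":
    ids = list(range(500))  # stand-in for a 500-token training example
    print(len(truncate_ids(ids, OLD)), len(truncate_ids(ids, NEW)))
```

With the old setting a 500-token example would be cut to its first 100 tokens, while the new setting leaves it intact, so the two runs may have trained on quite different amounts of text per example.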

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:70aae47f07b7e57e1c7985416109771df0233f6b0c7386d8ebec5a26102ed0bf
 size 4603

 version https://git-lfs.github.com/spec/v1
+oid sha256:3a4150ef7d0dff8eb90f52565dda56ec2a77189da81959a1199bbd9cec0632a8
 size 4603