End of training

Files changed (6)

README.md +29 -2
adapter_model.bin +1 -1
adapter_model.safetensors +1 -1
all_results.json +6 -6
train_results.json +6 -6
trainer_state.json +0 -0

README.md CHANGED

@@ -4,6 +4,8 @@ tags:
 - generated_from_trainer
 datasets:
 - truthful_qa
 model-index:
 - name: llama_mc_finetune
   results: []
@@ -15,6 +17,9 @@ should probably proofread and complete it, then remove this comment. -->
 # llama_mc_finetune
 This model is a fine-tuned version of [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) on the truthful_qa dataset.
 ## Model description
@@ -34,17 +39,39 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 0.0002
-- train_batch_size: 4
 - eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2
-- training_steps: 20
 - mixed_precision_training: Native AMP
 ### Training results
 ### Framework versions

 - generated_from_trainer
 datasets:
 - truthful_qa
+metrics:
+- accuracy
 model-index:
 - name: llama_mc_finetune
   results: []
 # llama_mc_finetune
 This model is a fine-tuned version of [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) on the truthful_qa dataset.
+It achieves the following results on the evaluation set:
+- Loss: 1.3667
+- Accuracy: 0.8476
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 0.0002
+- train_batch_size: 6
 - eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2
+- num_epochs: 20
 - mixed_precision_training: Native AMP
 ### Training results
+| Training Loss | Epoch | Step | Validation Loss | Accuracy |
+|:-------------:|:-----:|:----:|:---------------:|:--------:|
+| 1.6688        | 1.0   | 109  | 2.2295          | 0.2988   |
+| 1.044         | 2.0   | 218  | 2.1568          | 0.3354   |
+| 0.3249        | 3.0   | 327  | 0.7980          | 0.7195   |
+| 1.3202        | 4.0   | 436  | 2.2679          | 0.1768   |
+| 1.325         | 5.0   | 545  | 0.9487          | 0.8232   |
+| 0.0001        | 6.0   | 654  | 1.3517          | 0.8171   |
+| 1.8235        | 7.0   | 763  | 1.5762          | 0.7622   |
+| 0.0001        | 8.0   | 872  | 1.5415          | 0.8415   |
+| 0.0           | 9.0   | 981  | 1.1195          | 0.8110   |
+| 0.0           | 10.0  | 1090 | 1.2257          | 0.8232   |
+| 0.0           | 11.0  | 1199 | 1.3680          | 0.8171   |
+| 0.0           | 12.0  | 1308 | 1.3485          | 0.8171   |
+| 0.0           | 13.0  | 1417 | 1.3482          | 0.8171   |
+| 0.0           | 14.0  | 1526 | 1.3481          | 0.8171   |
+| 0.0           | 15.0  | 1635 | 1.3628          | 0.8415   |
+| 0.0           | 16.0  | 1744 | 1.3643          | 0.8476   |
+| 0.0           | 17.0  | 1853 | 1.3649          | 0.8476   |
+| 0.0           | 18.0  | 1962 | 1.3659          | 0.8476   |
+| 0.0           | 19.0  | 2071 | 1.3663          | 0.8476   |
+| 0.0           | 20.0  | 2180 | 1.3667          | 0.8476   |
 ### Framework versions
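
The added results table can be read programmatically to see which epoch the reported numbers correspond to. The following is a minimal sketch, not part of the commit: the rows are copied by hand from the table above, and the helper names (`best_by_accuracy`, `best_by_val_loss`, `steps_per_epoch`) are hypothetical.

```python
# Rows copied from the "Training results" table in the new README.md:
# (training_loss, epoch, step, validation_loss, accuracy)
ROWS = [
    (1.6688, 1, 109, 2.2295, 0.2988),
    (1.044, 2, 218, 2.1568, 0.3354),
    (0.3249, 3, 327, 0.7980, 0.7195),
    (1.3202, 4, 436, 2.2679, 0.1768),
    (1.325, 5, 545, 0.9487, 0.8232),
    (0.0001, 6, 654, 1.3517, 0.8171),
    (1.8235, 7, 763, 1.5762, 0.7622),
    (0.0001, 8, 872, 1.5415, 0.8415),
    (0.0, 9, 981, 1.1195, 0.8110),
    (0.0, 10, 1090, 1.2257, 0.8232),
    (0.0, 11, 1199, 1.3680, 0.8171),
    (0.0, 12, 1308, 1.3485, 0.8171),
    (0.0, 13, 1417, 1.3482, 0.8171),
    (0.0, 14, 1526, 1.3481, 0.8171),
    (0.0, 15, 1635, 1.3628, 0.8415),
    (0.0, 16, 1744, 1.3643, 0.8476),
    (0.0, 17, 1853, 1.3649, 0.8476),
    (0.0, 18, 1962, 1.3659, 0.8476),
    (0.0, 19, 2071, 1.3663, 0.8476),
    (0.0, 20, 2180, 1.3667, 0.8476),
]


def best_by_accuracy(rows):
    """Row with the highest accuracy; max() keeps the earliest row on ties."""
    return max(rows, key=lambda r: r[4])


def best_by_val_loss(rows):
    """Row with the lowest validation loss."""
    return min(rows, key=lambda r: r[3])


def steps_per_epoch(rows):
    """Set of step/epoch ratios; a single value means a constant epoch length."""
    return {r[2] // r[1] for r in rows if r[2] % r[1] == 0}


if __name__ == "__main__":
    print("best accuracy row:", best_by_accuracy(ROWS))
    print("lowest validation loss row:", best_by_val_loss(ROWS))
    print("steps per epoch:", steps_per_epoch(ROWS))
```

On these rows, the loss and accuracy quoted at the top of the README (1.3667, 0.8476) match the final epoch, the accuracy plateau is first reached at epoch 16, and the lowest validation loss occurs much earlier, at epoch 3.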

adapter_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:09c2b7002ff788d88d15bd99682d6628ff38f37610acdc8fc2a94ba41d5fb965
 size 160283150

 version https://git-lfs.github.com/spec/v1
+oid sha256:adf38df745a81dbd66e2264d38cbd4996b6feaacf9dcd10498954713816601cf
 size 160283150

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0b2017190d296cba81ef5d64174acf4db1f01102ab62e220c70981fc90aff577
 size 160180976

 version https://git-lfs.github.com/spec/v1
+oid sha256:9017deec726e179a130f88d40afe680f37577ce20e4c7a24f46fce9e70d5243e
 size 160180976

all_results.json CHANGED

@@ -1,8 +1,8 @@
 {
-    "epoch": 0.12,
-    "total_flos": 825752514723840.0,
-    "train_loss": 0.38747100830078124,
-    "train_runtime": 30.2879,
-    "train_samples_per_second": 2.641,
-    "train_steps_per_second": 0.66
 }

 {
+    "epoch": 20.0,
+    "total_flos": 1.2663415269359616e+17,
+    "train_loss": 0.4154004427643197,
+    "train_runtime": 4412.8344,
+    "train_samples_per_second": 2.96,
+    "train_steps_per_second": 0.494
 }
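
As a rough cross-check, the throughput fields in `all_results.json` can be multiplied by `train_runtime` to recover the step count and the effective batch size, which can then be compared with the README hyperparameters. This is only a sketch: the function name `implied_counts` is hypothetical, the rates in the JSON are rounded, and equating samples per step with `train_batch_size` assumes a single device and no gradient accumulation, which the diff does not state.

```python
def implied_counts(train_runtime, steps_per_second, samples_per_second):
    """Estimate (total steps, total samples, samples per step) from throughput.

    The inputs are rounded values taken from all_results.json, so the results
    are rounded to the nearest integer and are approximate by construction.
    """
    steps = steps_per_second * train_runtime
    samples = samples_per_second * train_runtime
    return round(steps), round(samples), round(samples / steps)


# Values from the removed (old) and added (new) sides of the diff.
OLD = dict(train_runtime=30.2879, steps_per_second=0.66, samples_per_second=2.641)
NEW = dict(train_runtime=4412.8344, steps_per_second=0.494, samples_per_second=2.96)

if __name__ == "__main__":
    print("old run:", implied_counts(**OLD))
    print("new run:", implied_counts(**NEW))
```

Under those assumptions, the old run is consistent with `training_steps: 20` at `train_batch_size: 4`, and the new run with 2180 steps (20 epochs of 109 steps) at `train_batch_size: 6`.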

train_results.json CHANGED

@@ -1,8 +1,8 @@
 {
-    "epoch": 0.12,
-    "total_flos": 825752514723840.0,
-    "train_loss": 0.38747100830078124,
-    "train_runtime": 30.2879,
-    "train_samples_per_second": 2.641,
-    "train_steps_per_second": 0.66
 }

 {
+    "epoch": 20.0,
+    "total_flos": 1.2663415269359616e+17,
+    "train_loss": 0.4154004427643197,
+    "train_runtime": 4412.8344,
+    "train_samples_per_second": 2.96,
+    "train_steps_per_second": 0.494
 }

trainer_state.json CHANGED

The diff for this file is too large to render. See raw diff