End of training
- README.md +7 -5
- adapter_config.json +5 -5
- adapter_model.safetensors +1 -1
- training_args.bin +1 -1
README.md
CHANGED
@@ -18,7 +18,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [google/gemma-2b-it](https://huggingface.co/google/gemma-2b-it) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss:
+- Loss: 1.6079
 
 ## Model description
 
@@ -45,16 +45,18 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 128
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: constant
-- num_epochs:
+- num_epochs: 5
 - mixed_precision_training: Native AMP
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 0.
-| 0.
-| 0.
+| 0.2365        | 1.0   | 1    | 2.6422          |
+| 0.1717        | 2.0   | 2    | 2.2893          |
+| 0.1298        | 3.0   | 3    | 1.9988          |
+| 0.0971        | 4.0   | 4    | 1.7610          |
+| 0.0673        | 5.0   | 5    | 1.6079          |
 
 
 ### Framework versions
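The updated card describes a 5-epoch LoRA fine-tune of google/gemma-2b-it evaluated once per epoch. As a rough illustration only, the hyperparameters above might be expressed with a `transformers.TrainingArguments` like the sketch below; the learning rate, per-device batch size and output directory are placeholders that do not appear in this diff.

```python
from transformers import TrainingArguments

# Sketch of TrainingArguments matching the card's reported hyperparameters.
# Values not shown in the diff are marked as assumptions.
training_args = TrainingArguments(
    output_dir="gemma-2b-it-lora",      # placeholder path, not from the repo
    per_device_train_batch_size=16,     # assumed; 16 * 8 accumulation = 128 total
    gradient_accumulation_steps=8,      # assumed split of the 128 total batch
    learning_rate=2e-4,                 # assumed; not shown in this hunk
    num_train_epochs=5,                 # from the card
    lr_scheduler_type="constant",       # from the card
    adam_beta1=0.9,                     # Adam betas from the card
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    fp16=True,                          # Native AMP mixed precision
    logging_steps=1,
)
```

On a single GPU, `per_device_train_batch_size * gradient_accumulation_steps` would have to equal the reported `total_train_batch_size` of 128; with multiple GPUs the split differs, which is why both values above are marked as assumptions.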
adapter_config.json
CHANGED
@@ -22,13 +22,13 @@
     "rank_pattern": {},
     "revision": null,
     "target_modules": [
-        "o_proj",
-        "v_proj",
-        "up_proj",
         "down_proj",
-        "q_proj",
         "k_proj",
-        "
+        "q_proj",
+        "gate_proj",
+        "o_proj",
+        "up_proj",
+        "v_proj"
     ],
     "task_type": "CAUSAL_LM",
     "use_dora": false,
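The reordered `target_modules` list now spells out all of Gemma's attention projections (q_proj, k_proj, v_proj, o_proj) plus the MLP projections (gate_proj, up_proj, down_proj). A minimal PEFT sketch of that configuration is shown below; the rank, alpha and dropout values are assumptions, since this hunk does not show them.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Minimal sketch of the LoRA setup implied by the updated adapter_config.json.
base = AutoModelForCausalLM.from_pretrained("google/gemma-2b-it")

lora_config = LoraConfig(
    r=8,                   # assumed rank, not visible in this hunk
    lora_alpha=16,         # assumed scaling, not visible in this hunk
    lora_dropout=0.05,     # assumed dropout, not visible in this hunk
    target_modules=[
        "down_proj", "k_proj", "q_proj", "gate_proj",
        "o_proj", "up_proj", "v_proj",
    ],
    task_type="CAUSAL_LM",
    use_dora=False,
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```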
adapter_model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:0fb2289b4970f2987ee7d02c9217dcc1efcf6e9b408f11b17ce008fb8dfd147b
 size 4272785368
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:034fb7a7a10feafe9976b983194d753fdb18014da327b2e717d5f9e840345887
 size 4920
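Both binary files are stored as Git LFS pointers, so this commit only rewrites the `oid sha256` line; the payloads themselves are fetched on checkout. To check that a downloaded file matches the pointer in this commit, a small sketch like the following works (the local path is an assumption):

```python
import hashlib

def sha256_of(path: str) -> str:
    """Stream a local file and return its hex SHA-256 digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# Compare against the oid recorded in the Git LFS pointer above.
expected = "0fb2289b4970f2987ee7d02c9217dcc1efcf6e9b408f11b17ce008fb8dfd147b"
assert sha256_of("adapter_model.safetensors") == expected
```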