imdatta0/mistral_7b_v_Magiccoder_evol_10k_reverse

PEFT · Safetensors · unsloth · Generated from Trainer


imdatta0 committed on Jun 11

Commit cf56834 • 1 Parent(s): e5f22e3

End of training


Files changed (2)

README.md +42 -42
adapter_model.safetensors +1 -1

README.md CHANGED

@@ -17,7 +17,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [mistralai/Mistral-7B-v0.3](https://huggingface.co/mistralai/Mistral-7B-v0.3) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.1146
 ## Model description
@@ -37,10 +37,10 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 0.0001
-- train_batch_size: 8
-- eval_batch_size: 8
 - seed: 42
-- gradient_accumulation_steps: 8
 - total_train_batch_size: 64
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
@@ -51,44 +51,44 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 1.1799        | 0.0262 | 4    | 1.1888          |
-| 1.1193        | 0.0523 | 8    | 1.1757          |
-| 1.1603        | 0.0785 | 12   | 1.1751          |
-| 1.0847        | 0.1047 | 16   | 1.1702          |
-| 1.1304        | 0.1308 | 20   | 1.1674          |
-| 1.042         | 0.1570 | 24   | 1.1582          |
-| 1.1863        | 0.1832 | 28   | 1.1633          |
-| 1.14          | 0.2093 | 32   | 1.1597          |
-| 1.0763        | 0.2355 | 36   | 1.1503          |
-| 1.135         | 0.2617 | 40   | 1.1458          |
-| 1.1623        | 0.2878 | 44   | 1.1393          |
-| 1.1173        | 0.3140 | 48   | 1.1423          |
-| 1.1283        | 0.3401 | 52   | 1.1482          |
-| 1.0967        | 0.3663 | 56   | 1.1356          |
-| 1.1131        | 0.3925 | 60   | 1.1338          |
-| 1.1613        | 0.4186 | 64   | 1.1419          |
-| 1.0548        | 0.4448 | 68   | 1.1454          |
-| 1.0629        | 0.4710 | 72   | 1.1320          |
-| 1.0679        | 0.4971 | 76   | 1.1355          |
-| 1.16          | 0.5233 | 80   | 1.1287          |
-| 1.0579        | 0.5495 | 84   | 1.1295          |
-| 1.1214        | 0.5756 | 88   | 1.1392          |
-| 1.1681        | 0.6018 | 92   | 1.1242          |
-| 1.1667        | 0.6280 | 96   | 1.1223          |
-| 1.0871        | 0.6541 | 100  | 1.1221          |
-| 1.1147        | 0.6803 | 104  | 1.1243          |
-| 1.1075        | 0.7065 | 108  | 1.1254          |
-| 0.9958        | 0.7326 | 112  | 1.1186          |
-| 1.0718        | 0.7588 | 116  | 1.1085          |
-| 1.0748        | 0.7850 | 120  | 1.1193          |
-| 1.1082        | 0.8111 | 124  | 1.1138          |
-| 1.0981        | 0.8373 | 128  | 1.1102          |
-| 1.1231        | 0.8635 | 132  | 1.1133          |
-| 1.0687        | 0.8896 | 136  | 1.1143          |
-| 1.1568        | 0.9158 | 140  | 1.1139          |
-| 1.0177        | 0.9419 | 144  | 1.1140          |
-| 1.0401        | 0.9681 | 148  | 1.1145          |
-| 1.1827        | 0.9943 | 152  | 1.1146          |
 ### Framework versions

 This model is a fine-tuned version of [mistralai/Mistral-7B-v0.3](https://huggingface.co/mistralai/Mistral-7B-v0.3) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.1558
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 0.0001
+- train_batch_size: 16
+- eval_batch_size: 16
 - seed: 42
+- gradient_accumulation_steps: 4
 - total_train_batch_size: 64
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
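
Both sides of this hunk report `total_train_batch_size: 64` even though `train_batch_size` and `gradient_accumulation_steps` changed. A minimal sketch of that arithmetic follows. It assumes a single training device, which the card does not state, and the helper name `effective_batch_size` is hypothetical rather than part of any library.

```python
def effective_batch_size(per_device_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Examples contributing to one optimizer step.

    num_devices=1 is an assumption; the model card does not list a device count.
    """
    return per_device_batch_size * gradient_accumulation_steps * num_devices


# Values taken from the removed (-) and added (+) lines of the diff.
old_total = effective_batch_size(per_device_batch_size=8, gradient_accumulation_steps=8)
new_total = effective_batch_size(per_device_batch_size=16, gradient_accumulation_steps=4)

print(old_total, new_total)
```

Under that assumption both configurations give the same product, so the number of examples per optimizer step is unchanged between the two runs. Only the split between per-step batch size and accumulation differs.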
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
+| 1.3236        | 0.0261 | 4    | 1.3076          |
+| 1.2244        | 0.0523 | 8    | 1.2947          |
+| 1.5369        | 0.0784 | 12   | 1.4240          |
+| 5.2765        | 0.1046 | 16   | 3.1163          |
+| 3.5831        | 0.1307 | 20   | 1.7562          |
+| 1.7895        | 0.1569 | 24   | 1.7124          |
+| 1.914         | 0.1830 | 28   | 1.7797          |
+| 2.9106        | 0.2092 | 32   | 2.3285          |
+| 1.5011        | 0.2353 | 36   | 1.4598          |
+| 1.4755        | 0.2614 | 40   | 1.4380          |
+| 1.4568        | 0.2876 | 44   | 1.3801          |
+| 1.2952        | 0.3137 | 48   | 1.3155          |
+| 1.3008        | 0.3399 | 52   | 1.2782          |
+| 1.2098        | 0.3660 | 56   | 1.2382          |
+| 1.2073        | 0.3922 | 60   | 1.2299          |
+| 1.2424        | 0.4183 | 64   | 1.2237          |
+| 1.1401        | 0.4444 | 68   | 1.2220          |
+| 1.1368        | 0.4706 | 72   | 1.2071          |
+| 1.1203        | 0.4967 | 76   | 1.2119          |
+| 1.21          | 0.5229 | 80   | 1.2026          |
+| 1.12          | 0.5490 | 84   | 1.1905          |
+| 1.199         | 0.5752 | 88   | 1.1893          |
+| 1.2302        | 0.6013 | 92   | 1.1889          |
+| 1.2382        | 0.6275 | 96   | 1.1797          |
+| 1.1521        | 0.6536 | 100  | 1.1765          |
+| 1.1563        | 0.6797 | 104  | 1.1728          |
+| 1.1676        | 0.7059 | 108  | 1.1718          |
+| 1.0429        | 0.7320 | 112  | 1.1642          |
+| 1.1303        | 0.7582 | 116  | 1.1660          |
+| 1.126         | 0.7843 | 120  | 1.1641          |
+| 1.1603        | 0.8105 | 124  | 1.1598          |
+| 1.146         | 0.8366 | 128  | 1.1587          |
+| 1.1689        | 0.8627 | 132  | 1.1547          |
+| 1.1046        | 0.8889 | 136  | 1.1533          |
+| 1.201         | 0.9150 | 140  | 1.1565          |
+| 1.0665        | 0.9412 | 144  | 1.1566          |
+| 1.0795        | 0.9673 | 148  | 1.1561          |
+| 1.2229        | 0.9935 | 152  | 1.1558          |
 ### Framework versions
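
The two loss tables differ most near step 16, where the new run's validation loss jumps before recovering. One way to compare the runs is sketched below. The rows are a small subset copied from the diff above, and `parse_rows` is a hypothetical helper written for this illustration, not a function of the Trainer or the Hub.

```python
# A few rows copied verbatim from the removed (-) and added (+) sides of the diff.
OLD_ROWS = """\
-| 1.1799        | 0.0262 | 4    | 1.1888          |
-| 1.0847        | 0.1047 | 16   | 1.1702          |
-| 1.1827        | 0.9943 | 152  | 1.1146          |
"""
NEW_ROWS = """\
+| 1.3236        | 0.0261 | 4    | 1.3076          |
+| 5.2765        | 0.1046 | 16   | 3.1163          |
+| 1.2229        | 0.9935 | 152  | 1.1558          |
"""


def parse_rows(diff_text: str) -> list[dict]:
    """Parse markdown table rows, ignoring the leading diff marker (+ or -)."""
    rows = []
    for line in diff_text.splitlines():
        body = line.strip().lstrip("+-").strip().strip("|")
        cells = [cell.strip() for cell in body.split("|")]
        if len(cells) != 4:
            continue
        try:
            train_loss, epoch, step, val_loss = (
                float(cells[0]), float(cells[1]), int(cells[2]), float(cells[3]))
        except ValueError:
            continue  # header or separator row
        rows.append({"train_loss": train_loss, "epoch": epoch,
                     "step": step, "val_loss": val_loss})
    return rows


old_run = parse_rows(OLD_ROWS)
new_run = parse_rows(NEW_ROWS)

# Step with the highest validation loss in the new run.
worst = max(new_run, key=lambda row: row["val_loss"])
# Difference in final validation loss (new minus old).
final_delta = new_run[-1]["val_loss"] - old_run[-1]["val_loss"]

print(worst["step"], worst["val_loss"], round(final_delta, 4))
```

Feeding the full tables through the same parser would allow a per-step comparison. The subset here only covers the first, the spike, and the final evaluation.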

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2b321b4d43638fea63f5586d8c5fb4dc44644eaa1ab8085543abe605a1e36ec2
 size 83945296

 version https://git-lfs.github.com/spec/v1
+oid sha256:551f15c5cbbc2c089df52733014336d1952b1bd2e19c97841bde2a2775dd4f62
 size 83945296
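
The `adapter_model.safetensors` entries above are Git LFS pointer files: only the `oid` changed, while `size` stayed at 83945296 bytes. As a rough sketch, a downloaded file could be checked against such a pointer as shown below. The helpers `parse_lfs_pointer` and `matches_pointer` are hypothetical names for this illustration, and the payload used in the demo is a stand-in, not the real adapter weights.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into its version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.strip().partition(" ")
        if key:
            fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {"version": fields["version"], "algo": algo,
            "digest": digest, "size": int(fields["size"])}


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """True if the bytes have the size and digest recorded in the pointer."""
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


# Pointer text copied from the added (+) side of the diff.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:551f15c5cbbc2c089df52733014336d1952b1bd2e19c97841bde2a2775dd4f62
size 83945296
"""
pointer = parse_lfs_pointer(NEW_POINTER)

# Stand-in payload: build a matching pointer for a few bytes to exercise the check.
payload = b"example bytes, not the real adapter"
demo_pointer = {"version": pointer["version"], "algo": "sha256",
                "digest": hashlib.sha256(payload).hexdigest(), "size": len(payload)}

print(pointer["algo"], pointer["size"], matches_pointer(payload, demo_pointer))
```

For the actual file, which is about 84 MB, reading it in chunks and updating the hash incrementally would avoid holding the whole file in memory.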