End of training

Files changed:
- README.md (+30 -1)
- adapter_model.bin (+2 -2)
README.md CHANGED

@@ -2,6 +2,7 @@
 license: apache-2.0
 library_name: peft
 tags:
+- axolotl
 - generated_from_trainer
 base_model: mistralai/Mixtral-8x7B-Instruct-v0.1
 model-index:

@@ -107,7 +108,9 @@ fsdp_config:
 
 # empower-functions-more-tools-parallel
 
-This model is a fine-tuned version of [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) on
+This model is a fine-tuned version of [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) on the None dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.0865
 
 ## Model description
 

@@ -153,6 +156,32 @@ The following hyperparameters were used during training:
 - lr_scheduler_warmup_steps: 10
 - num_epochs: 4
 
+### Training results
+
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-----:|:----:|:---------------:|
+| 2.0913        | 0.0   | 1    | 2.0864          |
+| 0.0992        | 0.2   | 178  | 0.1038          |
+| 0.0923        | 0.4   | 356  | 0.0957          |
+| 0.0847        | 0.6   | 534  | 0.0938          |
+| 0.1034        | 0.8   | 712  | 0.0925          |
+| 0.1062        | 1.0   | 890  | 0.0901          |
+| 0.1006        | 1.19  | 1068 | 0.0894          |
+| 0.084         | 1.39  | 1246 | 0.0882          |
+| 0.0798        | 1.59  | 1424 | 0.0875          |
+| 0.0752        | 1.79  | 1602 | 0.0849          |
+| 0.0772        | 1.99  | 1780 | 0.0846          |
+| 0.0824        | 2.17  | 1958 | 0.0849          |
+| 0.0792        | 2.37  | 2136 | 0.0843          |
+| 0.0627        | 2.57  | 2314 | 0.0837          |
+| 0.0777        | 2.77  | 2492 | 0.0831          |
+| 0.0636        | 2.98  | 2670 | 0.0827          |
+| 0.0624        | 3.16  | 2848 | 0.0855          |
+| 0.0612        | 3.36  | 3026 | 0.0861          |
+| 0.0649        | 3.56  | 3204 | 0.0861          |
+| 0.0641        | 3.76  | 3382 | 0.0865          |
+
+
 ### Framework versions
 
 - PEFT 0.7.0
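The updated card describes a PEFT adapter trained on top of mistralai/Mixtral-8x7B-Instruct-v0.1. Below is a minimal sketch of loading this adapter with transformers + peft; the adapter repository id is a placeholder for wherever this commit lives, and the dtype/device settings are assumptions rather than values taken from the card.

```python
# Sketch: load the base Mixtral model, then apply the PEFT adapter from this repo.
# "your-namespace/empower-functions-more-tools-parallel" is a placeholder repo id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"
adapter_id = "your-namespace/empower-functions-more-tools-parallel"  # hypothetical

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,  # assumption; Mixtral-8x7B needs substantial memory
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_id)  # applies adapter_model.bin
model.eval()
```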
adapter_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:f9691054ab35227e3b587fd4817e2c3d6f6699334538d689a4f94ad5fe6a8202
+size 109144269
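adapter_model.bin is stored as a git-lfs pointer, so the commit only records its sha256 oid and byte size. A small sketch for checking a locally downloaded copy against the pointer values above (the local file path is assumed):

```python
# Sketch: verify a downloaded adapter_model.bin against the LFS pointer in this commit.
import hashlib
from pathlib import Path

EXPECTED_SHA256 = "f9691054ab35227e3b587fd4817e2c3d6f6699334538d689a4f94ad5fe6a8202"
EXPECTED_SIZE = 109144269  # bytes, from the pointer file

path = Path("adapter_model.bin")  # assumed local path
assert path.stat().st_size == EXPECTED_SIZE, "size does not match the LFS pointer"

h = hashlib.sha256()
with path.open("rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):  # hash in 1 MiB chunks
        h.update(chunk)
assert h.hexdigest() == EXPECTED_SHA256, "sha256 does not match the LFS pointer"
print("adapter_model.bin matches the LFS pointer")
```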