---
base_model:
- mistralai/Mistral-7B-Instruct-v0.2
- mistralai/Mistral-7B-Instruct-v0.1
tags:
- mergekit
- merge
- moe
---
# Mistral Instruct MoE experimental

This is a merge of pre-trained language models created with [mergekit](https://github.com/cg123/mergekit), using the `mixtral` branch.

**This is an experimental model and has nothing to do with Mixtral. Mixtral is not a merge of models per se, but a transformer whose MoE layers are learned during training.**

This merge uses a random gate, so I don't expect great results. We'll see!
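
As a rough illustration of what a random gate means here, below is a minimal conceptual sketch in PyTorch of top-2 routing through a randomly initialized router over two experts. This is not mergekit's implementation; the module names and tensor shapes are assumptions for illustration only.

```python
# Conceptual sketch only, NOT mergekit's code: with gate_mode: random, the router
# weights are initialized randomly rather than calibrated from prompt hidden
# states, so tokens are routed to the two experts essentially arbitrarily.
import torch

hidden_size, num_experts, top_k = 4096, 2, 2           # Mistral-7B hidden size, two experts
router = torch.nn.Linear(hidden_size, num_experts, bias=False)
torch.nn.init.normal_(router.weight)                    # random gate: no prompt-based calibration

hidden_states = torch.randn(1, 8, hidden_size)          # (batch, seq_len, hidden)
routing_probs = router(hidden_states).softmax(dim=-1)   # (batch, seq_len, num_experts)
weights, selected_experts = torch.topk(routing_probs, top_k, dim=-1)
# Each token's MLP output would be the weighted sum of the selected experts' outputs.
```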

## Merge Details

### Merge Method

This model was merged using the MoE merge method.

### Models Merged

The following models were included in the merge:
* [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)
* [mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
base_model: mistralai/Mistral-7B-Instruct-v0.2
gate_mode: random
dtype: bfloat16
experts:
  - source_model: mistralai/Mistral-7B-Instruct-v0.2
    positive_prompts: [""]
  - source_model: mistralai/Mistral-7B-Instruct-v0.1
    positive_prompts: [""]
```
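
Once the merged checkpoint is on the Hub, it should load like any other Mixtral-style model with `transformers`. A minimal usage sketch, assuming a hypothetical repo id for this merge and a recent `transformers` version with Mixtral support:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

repo_id = "osanseviero/mistral-instruct-moe-experimental"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Use the instruct chat template inherited from the base Mistral Instruct models.
messages = [{"role": "user", "content": "Explain what a mixture-of-experts layer is."}]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```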