MohammadOthman committed
Commit e82f87a
Parent: 4e603d0

Update README.md

Files changed (1):
1. README.md +40 -28
README.md CHANGED
---
license: cc-by-nc-4.0
language:
- en
pipeline_tag: text-generation
tags:
- mergekit
- text-generation
- merge
---

# Mistral-Merge-7B-slerp

## Model Description
`Mistral-Merge-7B-slerp` is a merged model that uses spherical linear interpolation (SLERP) to blend the layers of two transformer-based models. The merge aims to combine the robust linguistic capabilities of `OpenPipe/mistral-ft-optimized-1218` with the nuanced understanding of `mlabonne/NeuralHermes-2.5-Mistral-7B`.
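For intuition, SLERP interpolates along the arc between two weight vectors rather than along the straight line between them. The sketch below is a simplified, hypothetical illustration of that formula applied to a pair of weight tensors; it is not mergekit's actual implementation and omits the per-tensor bookkeeping a real merge performs.

```python
import torch

def slerp(t: float, w0: torch.Tensor, w1: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors (illustrative only)."""
    v0, v1 = w0.flatten().float(), w1.flatten().float()
    # Angle between the two weight vectors, computed on normalized copies.
    dot = torch.clamp(torch.dot(v0 / (v0.norm() + eps), v1 / (v1.norm() + eps)), -1.0, 1.0)
    theta = torch.arccos(dot)
    if theta.abs() < 1e-4:
        # Nearly parallel vectors: plain linear interpolation is numerically safer.
        merged = (1 - t) * v0 + t * v1
    else:
        sin_theta = torch.sin(theta)
        merged = (torch.sin((1 - t) * theta) / sin_theta) * v0 + (torch.sin(t * theta) / sin_theta) * v1
    return merged.reshape(w0.shape).to(w0.dtype)
```

At `t = 0` this returns the first tensor unchanged and at `t = 1` the second, with intermediate values tracing the arc between them.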

## Configuration
The merge applies SLERP across all 32 transformer layers of the two source models. Below is the mergekit YAML configuration used for the merge:

```yaml
slices:
  - sources:
      - model: OpenPipe/mistral-ft-optimized-1218
        layer_range: [0, 32]
      - model: mlabonne/NeuralHermes-2.5-Mistral-7B
        layer_range: [0, 32]
merge_method: slerp
base_model: OpenPipe/mistral-ft-optimized-1218
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```

The `t` parameter sets the interpolation weight for each tensor. Here the self-attention and MLP (multi-layer perceptron) tensors follow opposite gradients across the layer stack, while all remaining tensors use a constant `t` of 0.5, so the merged model blends features from both sources throughout the network.
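
As a rough illustration, and assuming mergekit linearly interpolates a list of `t` values across the merged layer range (its gradient behaviour), the per-layer weights implied by the anchors above could be expanded as sketched below. This is not output from mergekit itself; the sketch also assumes `t = 0` keeps the base model's tensor and `t = 1` keeps the other model's.

```python
import numpy as np

# Anchor values copied from the config above.
self_attn_anchors = [0, 0.5, 0.3, 0.7, 1]
mlp_anchors = [1, 0.5, 0.7, 0.3, 0]
num_layers = 32

# Assumption: the anchors are spread evenly over the layer range and
# linearly interpolated in between.
positions = np.linspace(0, num_layers - 1, num=len(self_attn_anchors))
layers = np.arange(num_layers)
t_self_attn = np.interp(layers, positions, self_attn_anchors)
t_mlp = np.interp(layers, positions, mlp_anchors)

for layer, (ta, tm) in enumerate(zip(t_self_attn, t_mlp)):
    # Early self_attn layers stay close to the base model (t near 0),
    # late ones close to NeuralHermes (t near 1); the MLP gradient is reversed.
    print(f"layer {layer:2d}: t_self_attn={ta:.2f}  t_mlp={tm:.2f}")
```

Under this reading, the early layers keep the base model's attention but lean on NeuralHermes' MLPs, the middle layers are a more balanced mix, and the pattern reverses toward the top of the stack.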