Muhammad2003 committed
Commit 018928a · 1 Parent(s): 04f2373
Update README.md

README.md CHANGED
@@ -23,6 +23,7 @@ The following models were included in the merge:
 
 ### Configuration
 Since Slerp allows merging two models at a time, the following YAML configurations were used to produce this model:
+
 ```yaml
 slices:
   - sources:
@@ -42,3 +43,24 @@ parameters:
 dtype: bfloat16
 ```
 
+Then
+```yaml
+
+slices:
+  - sources:
+      - model: ./merge
+        layer_range: [0, 32]
+      - model: instructlab/merlinite-7b-lab
+        layer_range: [0, 32]
+merge_method: slerp
+base_model: ./merge
+parameters:
+  t:
+    - filter: self_attn
+      value: [0, 0.5, 0.3, 0.7, 1]
+    - filter: mlp
+      value: [1, 0.5, 0.7, 0.3, 0]
+    - value: 0.5
+dtype: bfloat16
+
+```
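
For readers unfamiliar with the `slerp` merge method and the list-valued `t` entries in the added configuration, the following is a minimal sketch of the underlying idea, not the merge tool's actual implementation. The function names `slerp` and `layer_t` are hypothetical. It also assumes that a list such as `[0, 0.5, 0.3, 0.7, 1]` is treated as anchor values interpolated across the 32 layers; the README does not state this.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two equal-length vectors.

    t=0 returns v0 and t=1 returns v1 (illustrative convention only).
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # Cosine of the angle between the two directions, clamped for acos.
    dot = sum(a * b for a, b in zip(v0, v1)) / max(norm0 * norm1, eps)
    dot = max(-1.0, min(1.0, dot))
    theta = math.acos(dot)
    sin_theta = math.sin(theta)
    if sin_theta < eps:
        # (Anti)parallel vectors: fall back to plain linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


def layer_t(anchors, layer, num_layers):
    """Piecewise-linear interpolation of anchor values over layer indices.

    Assumption: anchors are spread evenly from the first to the last layer.
    """
    if num_layers == 1 or len(anchors) == 1:
        return float(anchors[0])
    pos = layer / (num_layers - 1) * (len(anchors) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(anchors) - 1)
    frac = pos - lo
    return (1 - frac) * anchors[lo] + frac * anchors[hi]


if __name__ == "__main__":
    self_attn_anchors = [0, 0.5, 0.3, 0.7, 1]  # values taken from the config
    for layer in (0, 8, 16, 24, 31):
        print(layer, round(layer_t(self_attn_anchors, layer, 32), 3))
    print(slerp(0.5, [1.0, 0.0], [0.0, 1.0]))
```

Under these assumptions, the `self_attn` weights would stay close to the first model in the early layers and move toward the second model in the later layers, while the `mlp` list runs in the opposite direction and all remaining tensors use a fixed `t` of 0.5.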