Aryanne committed
Commit: 4381391
1 Parent(s): 5e54da1

Upload README.md with huggingface_hub

Files changed (1): README.md +70 -0
README.md ADDED
---
base_model:
- berkeley-nest/Starling-LM-7B-alpha
- NousResearch/Nous-Hermes-2-Mistral-7B-DPO
- senseable/WestLake-7B-v2
- openchat/openchat-3.5-0106
library_name: transformers
tags:
- mergekit
- merge
---
# merged

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged using the task_swapping merge method, with [senseable/WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2) as the base.
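
task_swapping does not appear among mainline mergekit's documented merge methods, so, as intuition only, here is a minimal sketch of a masked, weighted parameter swap. It illustrates what the `weight`, `random_mask`, and `random_mask_seed` parameters in the configuration below plausibly control; this is an assumption for illustration, not the actual task_swapping implementation, and it does not model `diagonal_offset`.

```python
# Illustration only: NOT the actual task_swapping implementation.
import torch

def masked_weighted_swap(
    base: torch.Tensor,
    donor: torch.Tensor,
    weight: float,
    random_mask: float = 0.0,
    random_mask_seed: int = 0,
) -> torch.Tensor:
    """Blend `donor` into `base` at strength `weight`, randomly leaving a
    fraction `random_mask` of elements untouched (reproducible via seed)."""
    gen = torch.Generator().manual_seed(random_mask_seed)
    # True = this element takes the donor blend; False = keep the base value.
    swap = torch.rand(base.shape, generator=gen) >= random_mask
    out = base.clone()
    out[swap] = (1.0 - weight) * base[swap] + weight * donor[swap]
    return out
```

Under this reading, the Starling donor would blend in everywhere at strength 0.72, while the openchat and Nous-Hermes donors would each skip a small random fraction of elements (0.166 and 0.125) and blend at 0.4 and 0.666 respectively.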

### Models Merged

The following models were included in the merge:
* [berkeley-nest/Starling-LM-7B-alpha](https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha)
* [NousResearch/Nous-Hermes-2-Mistral-7B-DPO](https://huggingface.co/NousResearch/Nous-Hermes-2-Mistral-7B-DPO)
* [openchat/openchat-3.5-0106](https://huggingface.co/openchat/openchat-3.5-0106)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
base_model:
  model:
    path: senseable/WestLake-7B-v2
dtype: bfloat16
merge_method: task_swapping
slices:
- sources:
  - layer_range: [0, 32]
    model:
      model:
        path: berkeley-nest/Starling-LM-7B-alpha
    parameters:
      diagonal_offset: 2.0
      weight: 0.72
  - layer_range: [0, 32]
    model:
      model:
        path: openchat/openchat-3.5-0106
    parameters:
      diagonal_offset: 4.0
      random_mask: 0.166
      random_mask_seed: 19519.0
      weight: 0.4
  - layer_range: [0, 32]
    model:
      model:
        path: NousResearch/Nous-Hermes-2-Mistral-7B-DPO
    parameters:
      diagonal_offset: 4.0
      random_mask: 0.125
      random_mask_seed: 990090.0
      weight: 0.666
  - layer_range: [0, 32]
    model:
      model:
        path: senseable/WestLake-7B-v2
```
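
A merge like this can be re-run by pointing mergekit at the configuration above, either via the `mergekit-yaml` CLI (`mergekit-yaml config.yaml ./merged`) or the Python API. A minimal sketch, assuming a mergekit installation whose build includes the task_swapping method, with `config.yaml` and `./merged` as placeholder paths:

```python
# Sketch: reproducing the merge with mergekit's Python API, then loading
# the result with transformers. Assumes a mergekit build that provides the
# task_swapping method; "config.yaml" holds the YAML above and "./merged"
# is a placeholder output directory.
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("config.yaml", encoding="utf-8") as f:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(f))

run_merge(
    merge_config,
    out_path="./merged",
    options=MergeOptions(copy_tokenizer=True),  # also copy the base tokenizer
)

# The merged checkpoint then loads like any other Mistral-7B-class model.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("./merged")
model = AutoModelForCausalLM.from_pretrained("./merged", torch_dtype="auto")
```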