grimjim committed
Commit 004bf50
1 Parent(s): 023765f

Update README.md

Files changed (1)
  1. README.md +44 -48
README.md CHANGED
@@ -1,48 +1,44 @@
- ---
- base_model:
- - grimjim/kukulemon-32K-7B
- - grimjim/rogue-enchantress-32k-7B
- library_name: transformers
- tags:
- - mergekit
- - merge
- license: cc-by-nc-4.0
- pipeline_tag: text-generation
- ---
- # kukulemon-v3-soul_mix-32k-7B
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- We explore merging at an extremely low weight as an alternative to fine-tuning. The additional model was applied at a weight of 10e-5 (i.e., 1e-4), selected so that its impact would be comparable to a few epochs of training. At such a low weight, the additional model's contribution is effectively flattened, though technically not sparsified.
-
- - [Full weights](https://huggingface.co/grimjim/kukulemon-v3-soul_mix-32k-7B)
- - [GGUF quants](https://huggingface.co/grimjim/kukulemon-v3-soul_mix-32k-7B-GGUF)
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the [task arithmetic](https://arxiv.org/abs/2212.04089) merge method, with [grimjim/kukulemon-32K-7B](https://huggingface.co/grimjim/kukulemon-32K-7B) as the base.
-
- ### Models Merged
-
- The following model was included in the merge:
- * [grimjim/rogue-enchantress-32k-7B](https://huggingface.co/grimjim/rogue-enchantress-32k-7B)
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- base_model: grimjim/kukulemon-32K-7B
- dtype: bfloat16
- merge_method: task_arithmetic
- slices:
- - sources:
-   - layer_range: [0, 32]
-     model: grimjim/kukulemon-32K-7B
-   - layer_range: [0, 32]
-     model: grimjim/rogue-enchantress-32k-7B
-     parameters:
-       weight: 10e-5
-
- ```
 
+ ---
+ base_model: grimjim/kukulemon-v3-soul_mix-32k-7B
+ library_name: transformers
+ quanted_by: grimjim
+ license: cc-by-nc-4.0
+ pipeline_tag: text-generation
+ ---
+ # kukulemon-v3-soul_mix-32k-7B
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ We explore merging at an extremely low weight as an alternative to fine-tuning. The additional model was applied at a weight of 10e-5 (i.e., 1e-4), selected so that its impact would be comparable to a few epochs of training. At such a low weight, the additional model's contribution is effectively flattened, though technically not sparsified.
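As an illustration of what task arithmetic at such a low weight amounts to, here is a minimal sketch, not the author's pipeline: per parameter tensor, `merged = base + lambda * (donor - base)` with `lambda = 10e-5`. The output path is a placeholder, and mergekit itself (configured below) handles layer slicing and sharded loading, which this sketch omits.

```python
import torch
from transformers import AutoModelForCausalLM

LAMBDA = 10e-5  # i.e., 1e-4, matching the weight in the config below

# Illustrative only: loads both full models into memory.
base = AutoModelForCausalLM.from_pretrained(
    "grimjim/kukulemon-32K-7B", torch_dtype=torch.bfloat16
)
donor = AutoModelForCausalLM.from_pretrained(
    "grimjim/rogue-enchantress-32k-7B", torch_dtype=torch.bfloat16
)

base_state = base.state_dict()
donor_state = donor.state_dict()

# Task arithmetic: add the donor's task vector (its delta from the base)
# scaled far down, nudging rather than overwriting the base weights.
merged_state = {
    name: param + LAMBDA * (donor_state[name] - param)
    for name, param in base_state.items()
}

base.load_state_dict(merged_state)
base.save_pretrained("./merged-model")  # placeholder output path
```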
+
+ - [Full weights](https://huggingface.co/grimjim/kukulemon-v3-soul_mix-32k-7B)
+ - [GGUF quants](https://huggingface.co/grimjim/kukulemon-v3-soul_mix-32k-7B-GGUF)
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged using the [task arithmetic](https://arxiv.org/abs/2212.04089) merge method, with [grimjim/kukulemon-32K-7B](https://huggingface.co/grimjim/kukulemon-32K-7B) as the base.
+
+ ### Models Merged
+
+ The following model was included in the merge:
+ * [grimjim/rogue-enchantress-32k-7B](https://huggingface.co/grimjim/rogue-enchantress-32k-7B)
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ base_model: grimjim/kukulemon-32K-7B
+ dtype: bfloat16
+ merge_method: task_arithmetic
+ slices:
+ - sources:
+   - layer_range: [0, 32]
+     model: grimjim/kukulemon-32K-7B
+   - layer_range: [0, 32]
+     model: grimjim/rogue-enchantress-32k-7B
+     parameters:
+       weight: 10e-5
+
+ ```
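For reference, a configuration like the above is normally run through mergekit's command-line entry point; the config filename and output directory below are placeholders:

```sh
pip install mergekit
mergekit-yaml config.yml ./output-model-directory
```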