grimjim committed on
Commit
023765f
1 Parent(s): c375fda

Initial release

.gitattributes CHANGED
@@ -4,6 +4,7 @@
  *.bz2 filter=lfs diff=lfs merge=lfs -text
  *.ckpt filter=lfs diff=lfs merge=lfs -text
  *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gguf filter=lfs diff=lfs merge=lfs -text
  *.gz filter=lfs diff=lfs merge=lfs -text
  *.h5 filter=lfs diff=lfs merge=lfs -text
  *.joblib filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,3 +1,48 @@
  ---
+ base_model:
+ - grimjim/kukulemon-32K-7B
+ - grimjim/rogue-enchantress-32k-7B
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
  license: cc-by-nc-4.0
+ pipeline_tag: text-generation
  ---
+ # kukulemon-v3-soul_mix-32k-7B
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ We explore merging at an extremely low weight as an alternative to fine-tuning. The additional model was applied at a weight of 10e-5 (that is, 1e-4), chosen to be comparable to a few epochs of training. At so low a weight, the additional model's contribution is effectively flattened, though technically not sparsified.
+
+ - [Full weights](https://huggingface.co/grimjim/kukulemon-v3-soul_mix-32k-7B)
+ - [GGUF quants](https://huggingface.co/grimjim/kukulemon-v3-soul_mix-32k-7B-GGUF)
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged using the [task arithmetic](https://arxiv.org/abs/2212.04089) merge method, with [grimjim/kukulemon-32K-7B](https://huggingface.co/grimjim/kukulemon-32K-7B) as the base.
+
+ ### Models Merged
+
+ The following model was included in the merge:
+ * [grimjim/rogue-enchantress-32k-7B](https://huggingface.co/grimjim/rogue-enchantress-32k-7B)
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ base_model: grimjim/kukulemon-32K-7B
+ dtype: bfloat16
+ merge_method: task_arithmetic
+ slices:
+ - sources:
+   - layer_range: [0, 32]
+     model: grimjim/kukulemon-32K-7B
+   - layer_range: [0, 32]
+     model: grimjim/rogue-enchantress-32k-7B
+     parameters:
+       weight: 10e-5
+
+ ```
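
As a rough illustration of what the README's configuration asks for, task arithmetic adds a scaled difference between the additional model and the base back onto the base. The sketch below applies that rule to toy NumPy arrays. It is not mergekit's implementation: `task_arithmetic_merge` and the toy tensors are hypothetical names chosen for illustration, and real checkpoints hold many tensors per layer. Note that `10e-5` is the same number as `1e-4`.

```python
import numpy as np


def task_arithmetic_merge(base, others, weights):
    """Return base + sum_i w_i * (other_i - base) for every named tensor.

    Hypothetical helper for illustration only; not part of mergekit.
    """
    merged = {}
    for name, base_tensor in base.items():
        # Accumulate the weighted task vectors (other - base) in float64.
        delta = np.zeros_like(base_tensor, dtype=np.float64)
        for other, weight in zip(others, weights):
            delta += weight * (other[name].astype(np.float64) - base_tensor)
        merged[name] = base_tensor + delta
    return merged


# Toy stand-ins for the base model and the additional model.
base = {"layer.weight": np.array([1.0, 2.0, 3.0])}
other = {"layer.weight": np.array([2.0, 0.0, 3.0])}

# Same weight as in the configuration: 10e-5 == 1e-4.
merged = task_arithmetic_merge(base, [other], [10e-5])
```

With a weight this small, each merged value moves only one ten-thousandth of the way from the base toward the additional model, which is why the README compares the effect to light fine-tuning.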
kukulemon-v3-soul_mix-32k-7B.Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f0e2dc06a0f84ae5fe4ffe5a524df1d94a237f9f772b22a68408990e06e90d38
+ size 4368439072
kukulemon-v3-soul_mix-32k-7B.Q5_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b6728b94bd5df709d990b6f9222fac1d92b1873e39e4c9a2e925892497e428e0
+ size 5131409184
kukulemon-v3-soul_mix-32k-7B.Q6_K.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cfe71c6c9e0d208b5c276764d66e6f4c412706afff7a4ff611664b069fe571dd
+ size 5942064928
kukulemon-v3-soul_mix-32k-7B.Q8_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:657ec991b22bc90bdae3b676f336ee9b2be66fef202b8efcffb6f397d6fa4ec0
+ size 7695857440