grimjim committed on
Commit
a22f9f6
1 Parent(s): 6d17631

Update README.md


Added model from which the GGUFs were derived to metadata

Files changed (1)
  1. README.md +56 -55
README.md CHANGED
@@ -1,55 +1,56 @@
- ---
- base_model:
- - princeton-nlp/gemma-2-9b-it-SimPO
- - HODACHI/EZO-Common-9B-gemma-2-it
- library_name: transformers
- tags:
- - mergekit
- - merge
- license: gemma
- pipeline_tag: text-generation
- ---
- # Kitsunebi-v1-Gemma2-8k-9B-GGUF
-
- This repo contains select GGUF quants of a merge of pre-trained Gemma 2 9B Instruct language models created using [mergekit](https://github.com/cg123/mergekit).
-
- None of the components of this merge were trained for roleplay nor intended for it. Despite this, the resulting model can be used effectively for that function. The virtue of this model lies in its coherence, as opposed to textual richness.
-
- This project utilizes HODACHI/EZO-Common-9B-gemma-2-it, a model based on gemma-2 and fine-tuned by Axcxept co., ltd. Its primary goal was to perform well in Japanese language tasks. Model training leveraged context-based synthesized instruction pre-training data for supervised multitask pre-training [(abstract)](https://arxiv.org/abs/2406.14491).
-
- We also used princeton-nlp/gemma-2-9b-it-SimPO, a demonstration of Simple Preference Optimization [(abstract)](https://arxiv.org/abs/2405.14734).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the SLERP merge method.
-
- ### Models Merged
-
- The following models were included in the merge:
- * [princeton-nlp/gemma-2-9b-it-SimPO](https://huggingface.co/princeton-nlp/gemma-2-9b-it-SimPO)
- * [HODACHI/EZO-Common-9B-gemma-2-it](https://huggingface.co/HODACHI/EZO-Common-9B-gemma-2-it)
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- slices:
- - sources:
-   - model: princeton-nlp/gemma-2-9b-it-SimPO
-     layer_range: [0, 42]
-   - model: HODACHI/EZO-Common-9B-gemma-2-it
-     layer_range: [0, 42]
- merge_method: slerp
- base_model: HODACHI/EZO-Common-9B-gemma-2-it
- parameters:
-   t:
-   - filter: self_attn
-     value: [0, 0.5, 0.3, 0.7, 1]
-   - filter: mlp
-     value: [1, 0.5, 0.7, 0.3, 0]
-   - value: 0.5
- dtype: bfloat16
-
- ```
 
 
+ ---
+ base_model:
+ - grimjim/Kitsunebi-v1-Gemma2-8k-9B
+ - princeton-nlp/gemma-2-9b-it-SimPO
+ - HODACHI/EZO-Common-9B-gemma-2-it
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+ license: gemma
+ pipeline_tag: text-generation
+ ---
+ # Kitsunebi-v1-Gemma2-8k-9B-GGUF
+
+ This repo contains select GGUF quants of a merge of pre-trained Gemma 2 9B Instruct language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ None of the components of this merge were trained for roleplay nor intended for it. Despite this, the resulting model can be used effectively for that function. The virtue of this model lies in its coherence, as opposed to textual richness.
+
+ This project utilizes HODACHI/EZO-Common-9B-gemma-2-it, a model based on gemma-2 and fine-tuned by Axcxept co., ltd. Its primary goal was to perform well in Japanese language tasks. Model training leveraged context-based synthesized instruction pre-training data for supervised multitask pre-training [(abstract)](https://arxiv.org/abs/2406.14491).
+
+ We also used princeton-nlp/gemma-2-9b-it-SimPO, a demonstration of Simple Preference Optimization [(abstract)](https://arxiv.org/abs/2405.14734).
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged using the SLERP merge method.
+
+ ### Models Merged
+
+ The following models were included in the merge:
+ * [princeton-nlp/gemma-2-9b-it-SimPO](https://huggingface.co/princeton-nlp/gemma-2-9b-it-SimPO)
+ * [HODACHI/EZO-Common-9B-gemma-2-it](https://huggingface.co/HODACHI/EZO-Common-9B-gemma-2-it)
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ slices:
+ - sources:
+   - model: princeton-nlp/gemma-2-9b-it-SimPO
+     layer_range: [0, 42]
+   - model: HODACHI/EZO-Common-9B-gemma-2-it
+     layer_range: [0, 42]
+ merge_method: slerp
+ base_model: HODACHI/EZO-Common-9B-gemma-2-it
+ parameters:
+   t:
+   - filter: self_attn
+     value: [0, 0.5, 0.3, 0.7, 1]
+   - filter: mlp
+     value: [1, 0.5, 0.7, 0.3, 0]
+   - value: 0.5
+ dtype: bfloat16
+
+ ```