tannedbum committed
Commit 070bbbc
Parent: 3015886

Update README.md

Files changed (1)
  1. README.md +76 -52
README.md CHANGED
@@ -1,52 +1,76 @@
- ---
- base_model:
- - princeton-nlp/gemma-2-9b-it-SimPO
- - TheDrummer/Gemmasutra-9B-v1
- library_name: transformers
- tags:
- - mergekit
- - merge
-
- ---
- # Ellaria-9B
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the SLERP merge method.
-
- ### Models Merged
-
- The following models were included in the merge:
- * [princeton-nlp/gemma-2-9b-it-SimPO](https://huggingface.co/princeton-nlp/gemma-2-9b-it-SimPO)
- * [TheDrummer/Gemmasutra-9B-v1](https://huggingface.co/TheDrummer/Gemmasutra-9B-v1)
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- slices:
- - sources:
-   - model: TheDrummer/Gemmasutra-9B-v1
-     layer_range: [0, 42]
-   - model: princeton-nlp/gemma-2-9b-it-SimPO
-     layer_range: [0, 42]
- merge_method: slerp
- base_model: TheDrummer/Gemmasutra-9B-v1
- parameters:
-   t:
-   - filter: self_attn
-     value: [0.2, 0.4, 0.6, 0.2, 0.4]
-   - filter: mlp
-     value: [0.8, 0.6, 0.4, 0.8, 0.6]
-   - value: 0.4
- dtype: bfloat16
-
-
-
-
-
- ```
+ ---
+ base_model:
+ - princeton-nlp/gemma-2-9b-it-SimPO
+ - TheDrummer/Gemmasutra-9B-v1
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+ - roleplay
+ - sillytavern
+ - gemma2
+ - not-for-all-audiences
+ license: cc-by-nc-4.0
+ language:
+ - en
+ ---
+
+ ## SillyTavern
+
+ ## Text Completion presets
+ ```
+ temp 0.9
+ top_k 30
+ top_p 0.75
+ min_p 0.2
+ rep_pen 1.1
+ smooth_factor 0.25
+ smooth_curve 1
+ ```
+ ## Advanced Formatting
+
+ [Instruct preset by Virt-io](https://huggingface.co/Virt-io/SillyTavern-Presets/tree/main/Prompts/LLAMA-3/v1.9)
+
+ Target length (tokens): 192
+ Instruct Mode: Enabled
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged using the SLERP merge method.
+
+ ### Models Merged
+
+ The following models were included in the merge:
+ * [princeton-nlp/gemma-2-9b-it-SimPO](https://huggingface.co/princeton-nlp/gemma-2-9b-it-SimPO)
+ * [TheDrummer/Gemmasutra-9B-v1](https://huggingface.co/TheDrummer/Gemmasutra-9B-v1)
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ slices:
+ - sources:
+   - model: TheDrummer/Gemmasutra-9B-v1
+     layer_range: [0, 42]
+   - model: princeton-nlp/gemma-2-9b-it-SimPO
+     layer_range: [0, 42]
+ merge_method: slerp
+ base_model: TheDrummer/Gemmasutra-9B-v1
+ parameters:
+   t:
+   - filter: self_attn
+     value: [0.2, 0.4, 0.6, 0.2, 0.4]
+   - filter: mlp
+     value: [0.8, 0.6, 0.4, 0.8, 0.6]
+   - value: 0.4
+ dtype: bfloat16
+
+
+
+
+
+ ```
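To reproduce the merge, a configuration like the one above is typically saved to a file and passed to mergekit's command-line tool, e.g. `mergekit-yaml config.yml ./Ellaria-9B` (the file name and output path here are illustrative assumptions, not part of this commit).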
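For readers unfamiliar with the method: SLERP (spherical linear interpolation) blends each pair of weight tensors along the arc between them rather than averaging linearly, which preserves the magnitude of the interpolated weights. The sketch below is a minimal, illustrative NumPy version, not mergekit's actual implementation; it assumes mergekit's usual convention that `t=0` yields the base model (TheDrummer/Gemmasutra-9B-v1) and `t=1` the other endpoint.

```python
# Minimal SLERP sketch (illustrative; not mergekit's implementation).
import numpy as np

def slerp(t: float, a: np.ndarray, b: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherically interpolate from tensor a (t=0) to tensor b (t=1)."""
    a_dir = a.ravel() / (np.linalg.norm(a) + eps)
    b_dir = b.ravel() / (np.linalg.norm(b) + eps)
    dot = np.clip(np.dot(a_dir, b_dir), -1.0, 1.0)
    omega = np.arccos(dot)           # angle between the two weight directions
    if np.sin(omega) < eps:          # nearly parallel: fall back to linear interpolation
        return (1.0 - t) * a + t * b
    return (np.sin((1.0 - t) * omega) * a + np.sin(t * omega) * b) / np.sin(omega)

# The config's fallback t of 0.4 leans the merge toward the base model;
# self_attn and mlp tensors instead follow the per-layer gradients from
# the `t:` filter lists.
base_w = np.random.randn(8, 8)    # stands in for a Gemmasutra-9B-v1 tensor
other_w = np.random.randn(8, 8)   # stands in for a gemma-2-9b-it-SimPO tensor
merged_w = slerp(0.4, base_w, other_w)
```

In the configuration, mergekit interpolates each five-element `value` list across the 42 layers, so every layer gets its own `t` for the matching `self_attn` or `mlp` tensors, with `0.4` as the default for everything else.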
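The Text Completion preset earlier in the diff maps onto standard sampler arguments outside SillyTavern as well. Below is a hedged sketch using Hugging Face transformers: the repo id `tannedbum/Ellaria-9B` is inferred from the committer and model name rather than stated in the commit, `min_p` requires a reasonably recent transformers release, and SillyTavern's `smooth_factor`/`smooth_curve` (smoothing sampler) have no direct transformers equivalent, so they are omitted.

```python
# Hedged sketch: apply the README's sampler preset with transformers.
# Assumes the merge is published as tannedbum/Ellaria-9B and that the
# installed transformers version supports min_p sampling.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tannedbum/Ellaria-9B"  # assumed repo id, not stated in the commit
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"  # needs accelerate
)

inputs = tokenizer("Hello there,", return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    do_sample=True,
    max_new_tokens=192,        # the suggested target length
    temperature=0.9,           # temp 0.9
    top_k=30,
    top_p=0.75,
    min_p=0.2,
    repetition_penalty=1.1,    # rep_pen 1.1
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```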