Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)


DolphinStar-12.5B - GGUF
- Model creator: https://huggingface.co/Noodlz/
- Original model: https://huggingface.co/Noodlz/DolphinStar-12.5B/

| Name | Quant method | Size |
| ---- | ---- | ---- |
| [DolphinStar-12.5B.Q2_K.gguf](https://huggingface.co/RichardErkhov/Noodlz_-_DolphinStar-12.5B-gguf/blob/main/DolphinStar-12.5B.Q2_K.gguf) | Q2_K | 4.33GB |
| [DolphinStar-12.5B.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/Noodlz_-_DolphinStar-12.5B-gguf/blob/main/DolphinStar-12.5B.IQ3_XS.gguf) | IQ3_XS | 4.81GB |
| [DolphinStar-12.5B.IQ3_S.gguf](https://huggingface.co/RichardErkhov/Noodlz_-_DolphinStar-12.5B-gguf/blob/main/DolphinStar-12.5B.IQ3_S.gguf) | IQ3_S | 5.07GB |
| [DolphinStar-12.5B.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/Noodlz_-_DolphinStar-12.5B-gguf/blob/main/DolphinStar-12.5B.Q3_K_S.gguf) | Q3_K_S | 5.04GB |
| [DolphinStar-12.5B.IQ3_M.gguf](https://huggingface.co/RichardErkhov/Noodlz_-_DolphinStar-12.5B-gguf/blob/main/DolphinStar-12.5B.IQ3_M.gguf) | IQ3_M | 5.24GB |
| [DolphinStar-12.5B.Q3_K.gguf](https://huggingface.co/RichardErkhov/Noodlz_-_DolphinStar-12.5B-gguf/blob/main/DolphinStar-12.5B.Q3_K.gguf) | Q3_K | 5.62GB |
| [DolphinStar-12.5B.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/Noodlz_-_DolphinStar-12.5B-gguf/blob/main/DolphinStar-12.5B.Q3_K_M.gguf) | Q3_K_M | 5.62GB |
| [DolphinStar-12.5B.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/Noodlz_-_DolphinStar-12.5B-gguf/blob/main/DolphinStar-12.5B.Q3_K_L.gguf) | Q3_K_L | 6.11GB |
| [DolphinStar-12.5B.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/Noodlz_-_DolphinStar-12.5B-gguf/blob/main/DolphinStar-12.5B.IQ4_XS.gguf) | IQ4_XS | 6.3GB |
| [DolphinStar-12.5B.Q4_0.gguf](https://huggingface.co/RichardErkhov/Noodlz_-_DolphinStar-12.5B-gguf/blob/main/DolphinStar-12.5B.Q4_0.gguf) | Q4_0 | 6.57GB |
| [DolphinStar-12.5B.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/Noodlz_-_DolphinStar-12.5B-gguf/blob/main/DolphinStar-12.5B.IQ4_NL.gguf) | IQ4_NL | 6.64GB |
| [DolphinStar-12.5B.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/Noodlz_-_DolphinStar-12.5B-gguf/blob/main/DolphinStar-12.5B.Q4_K_S.gguf) | Q4_K_S | 6.62GB |
| [DolphinStar-12.5B.Q4_K.gguf](https://huggingface.co/RichardErkhov/Noodlz_-_DolphinStar-12.5B-gguf/blob/main/DolphinStar-12.5B.Q4_K.gguf) | Q4_K | 6.99GB |
| [DolphinStar-12.5B.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/Noodlz_-_DolphinStar-12.5B-gguf/blob/main/DolphinStar-12.5B.Q4_K_M.gguf) | Q4_K_M | 6.99GB |
| [DolphinStar-12.5B.Q4_1.gguf](https://huggingface.co/RichardErkhov/Noodlz_-_DolphinStar-12.5B-gguf/blob/main/DolphinStar-12.5B.Q4_1.gguf) | Q4_1 | 7.29GB |
| [DolphinStar-12.5B.Q5_0.gguf](https://huggingface.co/RichardErkhov/Noodlz_-_DolphinStar-12.5B-gguf/blob/main/DolphinStar-12.5B.Q5_0.gguf) | Q5_0 | 8.01GB |
| [DolphinStar-12.5B.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/Noodlz_-_DolphinStar-12.5B-gguf/blob/main/DolphinStar-12.5B.Q5_K_S.gguf) | Q5_K_S | 8.01GB |
| [DolphinStar-12.5B.Q5_K.gguf](https://huggingface.co/RichardErkhov/Noodlz_-_DolphinStar-12.5B-gguf/blob/main/DolphinStar-12.5B.Q5_K.gguf) | Q5_K | 8.22GB |
| [DolphinStar-12.5B.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/Noodlz_-_DolphinStar-12.5B-gguf/blob/main/DolphinStar-12.5B.Q5_K_M.gguf) | Q5_K_M | 8.22GB |
| [DolphinStar-12.5B.Q5_1.gguf](https://huggingface.co/RichardErkhov/Noodlz_-_DolphinStar-12.5B-gguf/blob/main/DolphinStar-12.5B.Q5_1.gguf) | Q5_1 | 8.73GB |
| [DolphinStar-12.5B.Q6_K.gguf](https://huggingface.co/RichardErkhov/Noodlz_-_DolphinStar-12.5B-gguf/blob/main/DolphinStar-12.5B.Q6_K.gguf) | Q6_K | 9.53GB |
| [DolphinStar-12.5B.Q8_0.gguf](https://huggingface.co/RichardErkhov/Noodlz_-_DolphinStar-12.5B-gguf/blob/main/DolphinStar-12.5B.Q8_0.gguf) | Q8_0 | 12.35GB |

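To grab a single quant from this repo, one option is the `huggingface_hub` client. A minimal sketch; the chosen filename is just one example from the table above:

```python
# Sketch: download one GGUF quant from this repo with huggingface_hub.
# Any filename from the table works; Q4_K_M is a common speed/quality tradeoff.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="RichardErkhov/Noodlz_-_DolphinStar-12.5B-gguf",
    filename="DolphinStar-12.5B.Q4_K_M.gguf",
)
print(path)  # local cache path of the downloaded file
```
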
Original model description:
---
license: apache-2.0
---
![image/png](https://cdn-uploads.huggingface.co/production/uploads/63cf23cffbd0cc580bc65c73/QDvxvuS3M7oHv7JI5d1ke.png)

Custom model "Dolphin2Star1", merged by Noodlz.
A 12.5B linear merge using the uncensored Dolphin 2.8 (Mistral 7B v0.2) as the base, combined with the Starling-LM 7B Beta fine-tune, which is itself based on Mistral 7B v0.1.

have fun =)

[EDIT] - preset-wise, it seems to like the "ChatML" format.
[EDIT 2] - Usage notes - the model is somewhat picky about batch size and prompt preset/template (possibly because it is a merge of a ChatML model and an OpenChat model).

My current recommended settings & findings:
- Using LM Studio - use the default preset, GPU acceleration to max, prompt eval size to 1024, and context length to 32768. This yields decent, coherent results. ChatML works too, but occasionally spits out odd text after a couple of turns.
- Using Oobabooga (Windows PC) - runs well using load-in-4bit along with use_flash_attention_2. Default presets and everything work just fine.
- Using Oobabooga (Mac) - [investigating]


## Instructions Template:
```
{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{{ '<s>' }}{% for message in messages %}{{'<|im_start|>' + message['role'] + '
' + message['content'] + '<|im_end|>' + '
'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant
' }}{% endif %}
```
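
For local testing outside LM Studio/Oobabooga, here is a minimal, untested sketch using llama-cpp-python with the ChatML-style format above and the 32768 context recommended earlier. The quant file, messages, and sampling values are illustrative assumptions, not the author's tested setup:

```python
# Sketch: run a GGUF quant with llama-cpp-python using the ChatML format above.
from llama_cpp import Llama

llm = Llama(
    model_path="DolphinStar-12.5B.Q4_K_M.gguf",  # any quant from the table above
    n_ctx=32768,      # context length recommended in the notes above
    n_gpu_layers=-1,  # offload all layers to GPU if available
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Write a haiku about dolphins."},
]

# Render the prompt the same way the template above does; llama.cpp
# prepends the BOS token ('<s>') itself during tokenization.
prompt = "".join(
    f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
) + "<|im_start|>assistant\n"

out = llm(prompt, max_tokens=128, stop=["<|im_end|>"])
print(out["choices"][0]["text"])
```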

## Chat Template:
```
{%- for message in messages %}
    {%- if message['role'] == 'system' -%}
        {%- if message['content'] -%}
            {{- message['content'] + '\n\n' -}}
        {%- endif -%}
        {%- if user_bio -%}
            {{- user_bio + '\n\n' -}}
        {%- endif -%}
    {%- else -%}
        {%- if message['role'] == 'user' -%}
            {{- name1 + ': ' + message['content'] + '\n' -}}
        {%- else -%}
            {{- name2 + ': ' + message['content'] + '\n' -}}
        {%- endif -%}
    {%- endif -%}
{%- endfor -%}
```

---
base_model:
- cognitivecomputations/dolphin-2.8-mistral-7b-v02
- NexusFlow/Starling-LM-7B-beta
library_name: transformers
tags:
- mergekit
- merge

---
# output_folder

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged using the [linear](https://arxiv.org/abs/2203.05482) merge method.

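For intuition, a linear merge is just a weighted average of corresponding parameter tensors across models. A minimal illustrative sketch; this is not mergekit's actual code, which additionally handles the layer slicing, tokenizer, and dtype conversion shown below:

```python
# Sketch: linear merge = weighted average of same-shaped parameter tensors.
import torch

def linear_merge(state_dicts, weights):
    """Average tensors with matching names/shapes across checkpoints."""
    total = sum(weights)
    return {
        name: sum(w * sd[name].float() for sd, w in zip(state_dicts, weights)) / total
        for name in state_dicts[0]
    }

# Hypothetical usage with two loaded checkpoints:
# merged = linear_merge([dolphin_sd, starling_sd], weights=[1.0, 1.0])
```
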
### Models Merged

The following models were included in the merge:
* [cognitivecomputations/dolphin-2.8-mistral-7b-v02](https://huggingface.co/cognitivecomputations/dolphin-2.8-mistral-7b-v02)
* [NexusFlow/Starling-LM-7B-beta](https://huggingface.co/NexusFlow/Starling-LM-7B-beta)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
merge_method: linear
parameters:
  weight: 1.0
slices:
  - sources:
      - model: cognitivecomputations/dolphin-2.8-mistral-7b-v02
        layer_range: [0,1]
      - model: NexusFlow/Starling-LM-7B-beta
        layer_range: [0,1]
        parameters:
          weight: 0
  - sources:
      - model: cognitivecomputations/dolphin-2.8-mistral-7b-v02
        layer_range: [1,8]
  - sources:
      - model: NexusFlow/Starling-LM-7B-beta
        layer_range: [4,12]
  - sources:
      - model: cognitivecomputations/dolphin-2.8-mistral-7b-v02
        layer_range: [8,16]
  - sources:
      - model: NexusFlow/Starling-LM-7B-beta
        layer_range: [12,20]
  - sources:
      - model: cognitivecomputations/dolphin-2.8-mistral-7b-v02
        layer_range: [16,24]
  - sources:
      - model: NexusFlow/Starling-LM-7B-beta
        layer_range: [20,28]
  - sources:
      - model: cognitivecomputations/dolphin-2.8-mistral-7b-v02
        layer_range: [24,31]
  - sources:
      - model: cognitivecomputations/dolphin-2.8-mistral-7b-v02
        layer_range: [31,32]
      - model: NexusFlow/Starling-LM-7B-beta
        layer_range: [31,32]
        parameters:
          weight: 0
dtype: float16
tokenizer_source: model:cognitivecomputations/dolphin-2.8-mistral-7b-v02
```
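
As a sanity check on the "12.5B" name: the slices above stack to 56 decoder layers, versus 32 in a single Mistral 7B. A rough back-of-the-envelope estimate, using approximate Mistral-7B parameter figures (assumptions, not measured values):

```python
# Rough estimate that the sliced merge lands near 12.5B parameters.
# Per-layer and embedding sizes are approximate Mistral-7B figures (assumptions).
slices = [(0, 1), (1, 8), (4, 12), (8, 16), (12, 20),
          (16, 24), (20, 28), (24, 31), (31, 32)]
n_layers = sum(end - start for start, end in slices)  # 56 layers vs 32 in Mistral 7B

params_per_layer = (7.24e9 - 0.26e9) / 32  # ~218M per decoder layer
embed_and_head = 0.26e9                    # token embeddings + LM head
total = n_layers * params_per_layer + embed_and_head
print(f"{n_layers} layers, ~{total / 1e9:.1f}B parameters")  # ~12.5B
```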