Suparious committed on
Commit c22baee
Parent(s): ed109df

add model card

Files changed (1): README.md +227 -0
---
license: apache-2.0
base_model:
- cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser
- Locutusque/Hyperion-1.5-Mistral-7B
- ibm/merlinite-7b
library_name: transformers
tags:
- mergekit
- merge
- code
- quantized
- 4-bit
- AWQ
- transformers
- autotrain_compatible
- endpoints_compatible
- text-generation-inference
- chatml
- mistral
model-index:
- name: Magic-Dolphin-7b
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 65.78
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=InferenceIllusionist/Magic-Dolphin-7b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 85.61
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=InferenceIllusionist/Magic-Dolphin-7b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 64.64
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=InferenceIllusionist/Magic-Dolphin-7b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 58.01
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=InferenceIllusionist/Magic-Dolphin-7b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 79.64
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=InferenceIllusionist/Magic-Dolphin-7b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 51.18
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=InferenceIllusionist/Magic-Dolphin-7b
      name: Open LLM Leaderboard
language:
- en
model_creator: InferenceIllusionist
model_name: Magic-Dolphin-7b
model_type: mistral
pipeline_tag: text-generation
inference: false
prompt_template: |
  <|im_start|>system
  {system_message}<|im_end|>
  <|im_start|>user
  {prompt}<|im_end|>
  <|im_start|>assistant
quantized_by: Suparious
---
# InferenceIllusionist/Magic-Dolphin-7b AWQ

- Model creator: [InferenceIllusionist](https://huggingface.co/InferenceIllusionist)
- Original model: [Magic-Dolphin-7b](https://huggingface.co/InferenceIllusionist/Magic-Dolphin-7b)

<img src="https://huggingface.co/InferenceIllusionist/Magic-Dolphin-7b/resolve/main/magic-dolphin.jfif" width="500"/>

## Model Summary

A linear merge of:

- [cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser](https://huggingface.co/cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser)
- [Locutusque/Hyperion-1.5-Mistral-7B](https://huggingface.co/Locutusque/Hyperion-1.5-Mistral-7B)
- [ibm/merlinite-7b](https://huggingface.co/ibm/merlinite-7b)

These three models showed excellent acumen in technical topics, so I wanted to see how they would behave together in a merge. Several different ratios were tested before this release; in the end, a higher weighting for merlinite-7b helped smooth out some rough edges. This model is a test of how LAB tuning is impacted by merges with models leveraging DPO.

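As a rough illustration, a linear merge like this can be expressed as a mergekit configuration. The weights below are hypothetical placeholders (the exact ratios used for this release are not published here); they only sketch the idea of weighting merlinite-7b more heavily:

```yaml
# Hypothetical mergekit config - weights are illustrative, not the release values
models:
  - model: cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser
    parameters:
      weight: 0.3  # hypothetical ratio
  - model: Locutusque/Hyperion-1.5-Mistral-7B
    parameters:
      weight: 0.3  # hypothetical ratio
  - model: ibm/merlinite-7b
    parameters:
      weight: 0.4  # hypothetical: merlinite-7b weighted higher
merge_method: linear
dtype: float16
```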
## How to use

### Install the necessary packages

```bash
pip install --upgrade autoawq autoawq-kernels
```

### Example Python code

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer, TextStreamer

model_path = "solidrust/Magic-Dolphin-7b-AWQ"
system_message = "You are Dolphin, incarnated as a powerful AI."

# Load the quantized model and tokenizer
model = AutoAWQForCausalLM.from_quantized(model_path,
                                          fuse_layers=True)
tokenizer = AutoTokenizer.from_pretrained(model_path,
                                          trust_remote_code=True)
streamer = TextStreamer(tokenizer,
                        skip_prompt=True,
                        skip_special_tokens=True)

# Convert prompt to tokens
prompt_template = """\
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant"""

prompt = "You're standing on the surface of the Earth. " \
         "You walk one mile south, one mile west and one mile north. " \
         "You end up exactly where you started. Where are you?"

tokens = tokenizer(prompt_template.format(system_message=system_message, prompt=prompt),
                   return_tensors='pt').input_ids.cuda()

# Generate output
generation_output = model.generate(tokens,
                                   streamer=streamer,
                                   max_new_tokens=512)
```

### About AWQ

AWQ is an efficient, accurate, and fast low-bit weight quantization method, currently supporting 4-bit quantization. It offers faster Transformers-based inference than GPTQ, with quality equivalent to or better than the most commonly used GPTQ settings.

AWQ models are currently supported on Linux and Windows, with NVIDIA GPUs only. macOS users: please use GGUF models instead.

It is supported by:

- [Text Generation Webui](https://github.com/oobabooga/text-generation-webui) - using Loader: AutoAWQ
- [vLLM](https://github.com/vllm-project/vllm) - version 0.2.2 or later, with support for all model types
- [Hugging Face Text Generation Inference (TGI)](https://github.com/huggingface/text-generation-inference)
- [Transformers](https://huggingface.co/docs/transformers) version 4.35.0 and later, from any code or client that supports Transformers
- [AutoAWQ](https://github.com/casper-hansen/AutoAWQ) - for use from Python code

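To give a feel for what "4-bit weights with group-wise scales" means, here is a toy round-to-nearest sketch in plain NumPy. Note this is not the AWQ algorithm itself (AWQ additionally rescales salient channels using activation statistics); the function name and group size are illustrative only:

```python
import numpy as np

def quantize_groupwise_4bit(w, group_size=8):
    """Toy round-to-nearest 4-bit quantization with one scale per group.

    Illustrative only - AWQ's actual method is activation-aware and
    protects salient weight channels before quantizing.
    """
    groups = w.reshape(-1, group_size)
    # One scale per group, mapping the group's max magnitude to the int4 range
    scales = np.abs(groups).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(groups / scales), -8, 7).astype(np.int8)
    return q, scales

def dequantize(q, scales):
    """Reconstruct approximate float weights from int4 values and scales."""
    return (q.astype(np.float32) * scales).reshape(-1)

rng = np.random.default_rng(0)
w = rng.normal(size=64).astype(np.float32)
q, s = quantize_groupwise_4bit(w)
w_hat = dequantize(q, s)
err = np.abs(w - w_hat).max()
print(f"max reconstruction error: {err:.4f}")
```

The per-element error is bounded by half the group's scale, which is why smaller groups (real AWQ checkpoints commonly use group size 128) trade a little extra scale storage for better accuracy.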
## Prompt template: ChatML

```plaintext
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```
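The template above can also be assembled programmatically. The helper below is a hypothetical illustration using plain string formatting; in practice, the tokenizer's built-in chat template (`tokenizer.apply_chat_template`) does the same job:

```python
def build_chatml_prompt(system_message: str, prompt: str) -> str:
    """Assemble a ChatML prompt string matching the template above.

    Hypothetical helper for illustration; prefer the tokenizer's own
    chat template in real code.
    """
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

text = build_chatml_prompt("You are Dolphin, incarnated as a powerful AI.",
                           "Where are you?")
print(text)
```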