---
library_name: transformers
tags:
- code
- chemistry
- medical
- quantized
- 4-bit
- AWQ
- text-generation
- autotrain_compatible
- endpoints_compatible
- chatml
license: apache-2.0
datasets:
- Locutusque/hyperion-v2.0
language:
- en
model_creator: Locutusque
model_name: Hyperion-2.0-Mistral-7B
model_type: mistral
pipeline_tag: text-generation
inference: false
prompt_template: '<|im_start|>system

  {system_message}<|im_end|>

  <|im_start|>user

  {prompt}<|im_end|>

  <|im_start|>assistant

  '
quantized_by: Suparious
---
# Locutusque/Hyperion-2.0-Mistral-7B AWQ

**UPLOAD IN PROGRESS**

- Model creator: [Locutusque](https://huggingface.co/Locutusque)
- Original model: [Hyperion-2.0-Mistral-7B](https://huggingface.co/Locutusque/Hyperion-2.0-Mistral-7B)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6437292ecd93f4c9a34b0d47/9BU30Mh9bOkO2HRBDF8EE.png)

## Model Summary

`Locutusque/Hyperion-2.0-Mistral-7B` is a state-of-the-art language model fine-tuned on the Hyperion-v2.0 dataset for advanced reasoning across scientific domains. It is designed to handle complex inquiries and instructions, leveraging the diverse, information-rich Hyperion dataset. Primary use cases include complex question answering, conversational understanding, code generation, medical text comprehension, mathematical reasoning, and logical reasoning. This repository hosts a 4-bit AWQ quantization of the original model.
## How to use

### Install the necessary packages

```bash
pip install --upgrade autoawq autoawq-kernels
```
### Example Python code

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer, TextStreamer

model_path = "solidrust/Hyperion-2.0-Mistral-7B-AWQ"
system_message = "You are Hyperion, incarnated as a powerful AI."

# Load the AWQ-quantized model and its tokenizer
model = AutoAWQForCausalLM.from_quantized(model_path, fuse_layers=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Stream decoded text to stdout as tokens are generated
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

# ChatML prompt format used by this model
prompt_template = """\
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant"""

prompt = ("You're standing on the surface of the Earth. "
          "You walk one mile south, one mile west and one mile north. "
          "You end up exactly where you started. Where are you?")

# Convert the formatted prompt to token IDs on the GPU
tokens = tokenizer(prompt_template.format(system_message=system_message, prompt=prompt),
                   return_tensors="pt").input_ids.cuda()

# Generate up to 512 new tokens, streaming them as they arrive
generation_output = model.generate(tokens, streamer=streamer, max_new_tokens=512)
```
### About AWQ

AWQ is an efficient, accurate and blazing-fast low-bit weight quantization method, currently supporting 4-bit quantization. Compared to GPTQ, it offers faster Transformers-based inference with quality equivalent to or better than the most commonly used GPTQ settings.

AWQ models are currently supported on Linux and Windows, with NVIDIA GPUs only. macOS users should use GGUF models instead.

It is supported by:

- [Text Generation Webui](https://github.com/oobabooga/text-generation-webui) - using Loader: AutoAWQ
- [vLLM](https://github.com/vllm-project/vllm) - version 0.2.2 or later supports all model types
- [Hugging Face Text Generation Inference (TGI)](https://github.com/huggingface/text-generation-inference)
- [Transformers](https://huggingface.co/docs/transformers) - version 4.35.0 and later, from any code or client that supports Transformers (see the sketch after this list)
- [AutoAWQ](https://github.com/casper-hansen/AutoAWQ) - for use from Python code

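Because AWQ support is built into recent Transformers releases, the quantized weights can also be loaded without the AutoAWQ wrapper. A minimal sketch, assuming `autoawq` is installed and this repo ships its AWQ `quantization_config` (typical for AWQ uploads, but an assumption here):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "solidrust/Hyperion-2.0-Mistral-7B-AWQ"

# Transformers >= 4.35 reads the AWQ quantization_config from the repo
# and dispatches to the AWQ kernels automatically.
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_path)

inputs = tokenizer("<|im_start|>user\nWhat is AWQ?<|im_end|>\n<|im_start|>assistant\n",
                   return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```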
## Prompt template: ChatML

```plaintext
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```
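Rather than hand-formatting this template, the tokenizer can build it for you. A minimal sketch, assuming the tokenizer bundled with this repo defines a ChatML `chat_template` (common for ChatML models, but an assumption here; fall back to the literal template above if it does not):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("solidrust/Hyperion-2.0-Mistral-7B-AWQ")

messages = [
    {"role": "system", "content": "You are Hyperion, incarnated as a powerful AI."},
    {"role": "user", "content": "Summarize AWQ in one sentence."},
]

# Renders the messages through the tokenizer's chat template and appends
# the <|im_start|>assistant header so generation continues from there.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```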