yangapku committed on
Commit
2fbf72d
1 Parent(s): e4299d8

Add README.md

Files changed (1)
  1. README.md +96 -0
README.md ADDED
@@ -0,0 +1,96 @@
+ ---
+ license: other
+ license_name: tongyi-qianwen
+ license_link: https://huggingface.co/Qwen/Qwen2-Math-72B-Instruct/blob/main/LICENSE
+ language:
+ - en
+ pipeline_tag: text-generation
+ tags:
+ - chat
+ ---
+
+ # Qwen2-Math
+
+ ## Introduction
+
+ Over the past year, we have dedicated significant effort to researching and enhancing the reasoning capabilities of large language models, with a particular focus on their ability to solve arithmetic and mathematical problems. Today, we are delighted to introduce a series of math-specific large language models in our Qwen2 family: Qwen2-Math and Qwen2-Math-Instruct (1.5B/7B/72B). Qwen2-Math is a series of specialized math language models built upon the Qwen2 LLMs, and it significantly outperforms the mathematical capabilities of open-source models and even closed-source models (e.g., GPT-4o). We hope that Qwen2-Math can help the scientific community solve advanced mathematical problems that require complex, multi-step logical reasoning.
+
+ ## Model Details
+
+ For more details, please refer to our [blog post](https://qwenlm.github.io/blog/qwen2-math/) and [GitHub repo](https://github.com/QwenLM/Qwen2-Math).
+
+ ## Requirements
+
+ * `transformers>=4.40.0` for Qwen2-Math models. The latest version is recommended.
+
+ > [!Warning]
+ > <div align="center">
+ > <b>
+ > 🚨 This is required because `transformers` has integrated Qwen2 code since `4.37.0`.
+ > </b>
+ > </div>
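+
+ To confirm that your installed version satisfies this requirement, you can run a quick check. This is a minimal sketch, not part of the original instructions; `packaging` ships as a dependency of `transformers`:
+
+ ```python
+ import transformers
+ from packaging import version
+
+ # Fail fast if the installed transformers is older than the minimum required
+ assert version.parse(transformers.__version__) >= version.parse("4.40.0"), (
+     f"transformers {transformers.__version__} found; >=4.40.0 is required for Qwen2-Math"
+ )
+ ```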
+
+ For requirements on GPU memory and the respective throughput, see the similar results for Qwen2 [here](https://qwen.readthedocs.io/en/latest/benchmark/speed_benchmark.html).
+
+ ## Quick Start
+
+ > [!Important]
+ >
+ > **Qwen2-Math-72B-Instruct** is an instruction-tuned model for chat;
+ >
+ > **Qwen2-Math-72B** is a base model typically used for completion and few-shot inference, and it serves as a better starting point for fine-tuning.
+
+ ### 🤗 Hugging Face Transformers
+
+ Qwen2-Math can be deployed and used for inference in the same way as [Qwen2](https://github.com/QwenLM/Qwen2). Below is a code snippet showing how to use the chat model with `transformers`:
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_name = "Qwen/Qwen2-Math-72B-Instruct"
+ device = "cuda"  # the device to move the tokenized inputs onto
+
+ # Load the model with automatic dtype selection and device placement
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype="auto",
+     device_map="auto"
+ )
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+ prompt = "Carlos is planting a lemon tree. The tree will cost $90 to plant. Each year it will grow 7 lemons, which he can sell for $1.5 each. It costs $3 a year to water and feed the tree. How many years will it take before he starts earning money on the lemon tree?"
+ messages = [
+     {"role": "system", "content": "You are a helpful assistant."},
+     {"role": "user", "content": prompt}
+ ]
+ # Format the conversation with the model's chat template
+ text = tokenizer.apply_chat_template(
+     messages,
+     tokenize=False,
+     add_generation_prompt=True
+ )
+ model_inputs = tokenizer([text], return_tensors="pt").to(device)
+
+ generated_ids = model.generate(
+     **model_inputs,
+     max_new_tokens=512
+ )
+ # Strip the prompt tokens so only the newly generated answer is decoded
+ generated_ids = [
+     output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
+ ]
+
+ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+ print(response)
+ ```
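+
+ For the base model **Qwen2-Math-72B**, chat templates do not apply; you prompt it directly for completion. Below is a minimal few-shot sketch; the worked examples in the prompt are illustrative, not from our evaluation data:
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_name = "Qwen/Qwen2-Math-72B"  # base model, prompted without a chat template
+
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype="auto",
+     device_map="auto"
+ )
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+ # A plain few-shot prompt: worked examples followed by the new question
+ few_shot_prompt = (
+     "Question: What is 15% of 80?\nAnswer: 12\n\n"
+     "Question: A pen costs $2 and a notebook costs $5. How much do 3 pens and 2 notebooks cost?\nAnswer: 16\n\n"
+     "Question: A train travels 60 miles per hour for 2.5 hours. How far does it go?\nAnswer:"
+ )
+ inputs = tokenizer(few_shot_prompt, return_tensors="pt").to(model.device)
+ output_ids = model.generate(**inputs, max_new_tokens=256)
+
+ # Decode only the tokens generated after the prompt
+ response = tokenizer.decode(output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
+ print(response)
+ ```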
+
+ ### 🤖 ModelScope
+
+ We strongly advise users, especially those in mainland China, to use ModelScope. `snapshot_download` can help you resolve issues with downloading checkpoints.
+
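+ Below is a minimal sketch of downloading the checkpoint via ModelScope and then loading it with `transformers`. It assumes the `modelscope` package is installed and that the ModelScope model ID mirrors the Hugging Face repo name:
+
+ ```python
+ from modelscope import snapshot_download
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Download the checkpoint from ModelScope to a local directory.
+ # The model ID here is assumed to mirror the Hugging Face repo name.
+ model_dir = snapshot_download("Qwen/Qwen2-Math-72B-Instruct")
+
+ # Load the locally cached checkpoint with transformers as usual
+ model = AutoModelForCausalLM.from_pretrained(
+     model_dir,
+     torch_dtype="auto",
+     device_map="auto"
+ )
+ tokenizer = AutoTokenizer.from_pretrained(model_dir)
+ ```
+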
+
+ ## Citation
+
+ If you find our work helpful, feel free to cite us.
+
+ ```
+ WIP
+ ```