Commit 1b61607 (parent: 6db86f7) by RanchiZhao: Update README.md

README.md:
---
license: apache-2.0
language:
- zh
- en
pipeline_tag: text-generation
---
<div align="center">
<img src="https://github.com/OpenBMB/MiniCPM/tree/main/assets/minicpm_logo.png" width="500em" ></img>
</div>
## Introduction

MiniCPM3-4B is the third generation of the MiniCPM series. Its overall performance surpasses Phi-3.5-mini-Instruct and GPT-3.5-Turbo-0125, and it is comparable with many recent 7B-9B models.

Compared to MiniCPM1.0/MiniCPM2.0, MiniCPM3-4B has a more powerful and versatile skill set that enables more general usage. It supports function calling and a code interpreter; please refer to [Advanced Features](https://github.com/OpenBMB/MiniCPM/tree/main?tab=readme-ov-file#%E8%BF%9B%E9%98%B6%E5%8A%9F%E8%83%BD) for usage guidelines.
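As a rough illustration, function calling can be sketched through the generic Hugging Face chat-template tools API. This is a hypothetical sketch, not the official recipe: whether MiniCPM3's chat template accepts a `tools` argument this way is an assumption, `get_weather` is an invented example tool, and the officially supported format is the one documented in the Advanced Features guide. It reuses a `model` and `tokenizer` loaded as in the Transformers example below.

```python
# Hypothetical sketch of function calling via the generic HF chat-template API.
# Whether MiniCPM3's template supports `tools` this way is an assumption; see
# the Advanced Features guide for the officially supported format.
def get_weather(city: str) -> str:
    """Get the current weather for a city.

    Args:
        city: Name of the city.
    """
    return f"Sunny in {city}"  # placeholder implementation

messages = [{"role": "user", "content": "What's the weather in Beijing?"}]
# transformers converts the Python function into a JSON tool schema for the template.
inputs = tokenizer.apply_chat_template(
    messages, tools=[get_weather], add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=256)
# The model is expected to emit a structured tool call; the caller parses it,
# runs get_weather, and appends the result as a "tool" message for a second pass.
print(tokenizer.decode(out[0][inputs.shape[1]:], skip_special_tokens=True))
```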
MiniCPM3-4B has a 32k context window. Equipped with LLMxMapReduce, it can in principle handle unlimited context without requiring a huge amount of memory.
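The divide-and-conquer idea behind this kind of long-context processing can be sketched as follows. This is a schematic illustration only, not the actual LLMxMapReduce algorithm; `ask_model` is a hypothetical stand-in for any short-context call into the model (for example, the inference snippets below), and real implementations would chunk by tokens rather than characters.

```python
# Schematic map-reduce over a long document (not the real LLMxMapReduce code).
# ask_model(prompt) is a hypothetical helper that runs one short-context query.
def map_reduce_answer(document: str, question: str, ask_model, chunk_size: int = 24000):
    # Map: split the document into chunks that fit the 32k window and extract
    # a partial answer from each chunk independently.
    chunks = [document[i:i + chunk_size] for i in range(0, len(document), chunk_size)]
    partials = [
        ask_model(f"Context:\n{chunk}\n\nQuestion: {question}\nAnswer using only this context:")
        for chunk in chunks
    ]
    # Reduce: merge the partial answers into one final answer. Only one chunk
    # (plus the short partials) is in memory per call, so the total context
    # length is unbounded in principle.
    merged = "\n\n".join(partials)
    return ask_model(f"Partial answers:\n{merged}\n\nQuestion: {question}\nCombine them into one final answer:")
```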
### Inference with Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

path = "openbmb/MiniCPM3-4B-GPTQ-Int4"
device = "cuda"

# trust_remote_code is required because MiniCPM3 ships custom modeling code.
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(path, torch_dtype=torch.bfloat16, device_map=device, trust_remote_code=True)

messages = [
    {"role": "user", "content": "Recommend 5 attractions in Beijing."},
]
model_inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to(device)

model_outputs = model.generate(
    model_inputs,
    max_new_tokens=1024,
    do_sample=True,  # enable sampling so top_p/temperature take effect
    top_p=0.7,
    temperature=0.7,
    repetition_penalty=1.02
)

# Keep only the newly generated tokens, dropping the echoed prompt.
output_token_ids = [
    model_outputs[i][len(model_inputs[i]):] for i in range(len(model_inputs))
]
responses = tokenizer.batch_decode(output_token_ids, skip_special_tokens=True)[0]
print(responses)
```
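For interactive use, the same setup can stream tokens as they are produced. A minimal sketch using transformers' built-in `TextStreamer`, reusing `model`, `tokenizer`, and `model_inputs` from the snippet above:

```python
from transformers import TextStreamer

# Print tokens to stdout as they are generated, skipping the echoed prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(
    model_inputs,
    max_new_tokens=1024,
    do_sample=True,
    top_p=0.7,
    temperature=0.7,
    repetition_penalty=1.02,
    streamer=streamer,
)
```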
### Inference with vLLM
```python
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

model_name = "openbmb/MiniCPM3-4B-GPTQ-Int4"
prompt = [{"role": "user", "content": "Recommend 5 attractions in Beijing."}]

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
input_text = tokenizer.apply_chat_template(prompt, tokenize=False, add_generation_prompt=True)

llm = LLM(
    model=model_name,
    trust_remote_code=True,
    tensor_parallel_size=1,
    quantization='gptq',  # load the GPTQ int4 weights
)
sampling_params = SamplingParams(top_p=0.7, temperature=0.7, max_tokens=1024, repetition_penalty=1.02)

outputs = llm.generate(prompts=input_text, sampling_params=sampling_params)
print(outputs[0].outputs[0].text)
```
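Because vLLM batches requests efficiently, the same objects can serve many prompts at once. A small sketch, reusing `llm`, `tokenizer`, and `sampling_params` from above (the example questions are illustrative):

```python
questions = ["How long is the Great Wall?", "What is the population of Beijing?"]
prompts = [
    tokenizer.apply_chat_template(
        [{"role": "user", "content": q}], tokenize=False, add_generation_prompt=True
    )
    for q in questions
]
# vLLM schedules all prompts in one batch and returns results in order.
for output in llm.generate(prompts=prompts, sampling_params=sampling_params):
    print(output.outputs[0].text)
```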
## Statement

* As a language model, MiniCPM3-4B generates content by learning from a vast amount of text.
* However, it does not possess the ability to comprehend or express personal opinions or value judgments.