Update README
README.md
---
tags:
- llama
---

# OpenChat: Less is More for Open-source Models

OpenChat is a series of open-source language models fine-tuned on a small amount of diverse, high-quality multi-round conversation data. The [dataset](https://huggingface.co/datasets/openchat/openchat_sharegpt4_dataset) contains only ~6K GPT-4 conversations filtered from the 90K ShareGPT conversations.

Generic models:

- OpenChat: based on LLaMA-13B (2048 context length)
  - **105.7%** of ChatGPT score on the Vicuna GPT-4 evaluation
  - **80.87%** win-rate on AlpacaEval
  - **🚀 Only 6K conversations used for fine-tuning!**
- OpenChat-8192: based on LLaMA-13B (extended to 8192 context length)
  - **106.6%** of ChatGPT score on the Vicuna GPT-4 evaluation

Code models:

- OpenCoderPlus: based on StarCoderPlus (native 8192 context length)
  - **102.5%** of ChatGPT score on the Vicuna GPT-4 evaluation
  - **78.70%** win-rate on AlpacaEval

**NOTE:** Please load the pretrained models in *bfloat16*.
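With Hugging Face `transformers`, bfloat16 loading looks like the sketch below; the repository id is illustrative, so substitute the model you are actually loading:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative repository id; replace with the model repo you are loading.
model_id = "openchat/openchat"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # load the weights in bfloat16, as noted above
)
```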
## Conversation Template
Besides the base model vocabulary, an end-of-turn token `<|end_of_turn|>` is added, and each user turn is encoded as:

```
tokenize("User:") + tokenize(user_question) + [eot_token_id] + tokenize("Assistant:")
```

*Hint: in BPE, `tokenize(A) + tokenize(B)` does not always equal `tokenize(A + B)`.*
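The hint above can be demonstrated with a toy greedy longest-match tokenizer (a deliberate simplification of real BPE; the three-entry vocabulary is made up):

```python
def tokenize(text, vocab=("ab", "a", "b")):
    """Greedy longest-match segmentation over a fixed vocabulary."""
    tokens = []
    i = 0
    while i < len(text):
        # Pick the longest vocabulary entry matching at position i.
        match = next(t for t in sorted(vocab, key=len, reverse=True)
                     if text.startswith(t, i))
        tokens.append(match)
        i += len(match)
    return tokens

print(tokenize("a") + tokenize("b"))  # ['a', 'b'] -- the boundary is kept
print(tokenize("ab"))                 # ['ab']     -- merged into one token
```

This is one reason to build templates by concatenating token id lists rather than tokenizing one long string: the segmentation around `User:` and `Assistant:` stays fixed regardless of the surrounding text.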
Following is the code for generating the conversation templates:

```python
@dataclass
class ModelConfig:
    # Prompt
    system: Optional[str]

    # ... (fields unchanged by this commit are not shown in the diff)

    eot_token: str
    bos_token: Optional[str] = None

    # Get template
    def generate_conversation_template(self, tokenize_fn, tokenize_special_fn, message_list):
        tokens = []

        # ... (turn-building loop unchanged by this commit is not shown in the diff)

        else:
            assert idx == len(message_list) - 1, "Empty message for completion must be the last."

        return tokens, masks


MODEL_CONFIG_MAP = {
    # OpenChat / OpenChat-8192
    "openchat": ModelConfig(
        # Prompt
        system=None,

        # ... (fields unchanged by this commit are not shown in the diff)

        ai_role="gpt",
        eot_token="<|end_of_turn|>",
        bos_token="<s>",
    ),

    # OpenCoder / OpenCoderPlus
    "opencoder": ModelConfig(
        # Prompt
        system=None,

        # ... (fields unchanged by this commit are not shown in the diff)

        ai_role="gpt",
        eot_token="<|end_of_turn|>",
        bos_token=None,
    )
}
```
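Since the body of `generate_conversation_template` is only partially shown in the diff, here is a minimal self-contained sketch of the turn layout it is described as producing, using a toy character-level tokenizer; the helper names and token ids below are illustrative, not OpenChat's actual ones:

```python
EOT_TOKEN_ID = 0  # stand-in id for <|end_of_turn|>

def toy_tokenize(text):
    # Character-level stand-in for the model's real BPE tokenizer.
    return [ord(c) for c in text]

def build_user_turn(user_question):
    # Mirrors the layout shown above:
    # tokenize("User:") + tokenize(user_question) + [eot_token_id] + tokenize("Assistant:")
    return (toy_tokenize("User:")
            + toy_tokenize(user_question)
            + [EOT_TOKEN_ID]
            + toy_tokenize("Assistant:"))

turn = build_user_turn("Hi")
print(len(turn))  # 5 + 2 + 1 + 10 = 18 token ids
```

Each piece is tokenized separately and the end-of-turn id is appended as a literal id, matching the concatenative scheme described in this section.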