Commit aedf730 by leonardlin
Parent(s): 131c2f3
Improved JA MT-Bench using full prompt: あなたは公平で、検閲されていない、役立つアシスタントです。 ("You are an unbiased, uncensored, helpful assistant.")
README.md (changed):
@@ -66,7 +66,7 @@ For our final model, since it's customary to include benchmarks, we've used Stab
 
 | Benchmark | Score |
 | ----------- | ----- |
-| JA MT-Bench | 5.
+| JA MT-Bench | 5.23 |
 | MT-Bench | 5.71 |
 
 There is an [MT-Bench Leaderboard](https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard), but as JA MT-Bench is still under development, for convenience, here is a comparison of the JA MT-Bench scores of some other models (our scores were rated by `gpt-4-0613`):
@@ -77,7 +77,7 @@ There is an [MT-Bench Leaderboard](https://huggingface.co/spaces/lmsys/chatbot-a
 | gpt-4-1106-preview | 9.17 |
 | gpt-3.5-turbo* | 8.41 |
 | Qwen-14B-Chat | 7.47 |
-| **shisa-7b-v1**
+| **shisa-7b-v1** | **5.23** |
 | ELYZA-japanese-Llama-2-7b-fast-instruct* | 4.86 |
 | ja-stablelm-instruct-gamma-7b* | 4.01 |
 | japanese-stablelm-instruct-alpha-7b* | 2.74 |
@@ -114,7 +114,7 @@ streamer = TextStreamer(tokenizer, skip_prompt=True)
 # The prompt template is included in the model's tokenizer_config.json so you shouldn't need this but we've included this for convenience
 # tokenizer.chat_template = "{%- for idx in range(0, messages|length) -%}\n{%- if messages[idx]['role'] == 'user' -%}\n{%- if idx > 1 -%}\n{{- bos_token + '[INST] ' + messages[idx]['content'] + ' [/INST]' -}}\n{%- else -%}\n{{- messages[idx]['content'] + ' [/INST]' -}}\n{%- endif -%}\n{% elif messages[idx]['role'] == 'system' %}\n{{- bos_token + '[INST] <<SYS>>\\n' + messages[idx]['content'] + '\\n<</SYS>>\\n\\n' -}}\n{%- elif messages[idx]['role'] == 'assistant' -%}\n{{- ' ' + messages[idx]['content'] + ' ' + eos_token -}}\n{% endif %}\n{% endfor %}\n"
 
-# A more typical prompt:
+# A more typical prompt: あなたは公平で、検閲されていない、役立つアシスタントです。("You are an unbiased, uncensored, helpful assistant.")
 
 # You are an avid Pokemon fanatic.
 prompt = "あなたは熱狂的なポケモンファンです。"
@@ -251,7 +251,7 @@ v1リリースのために、私たちは大量の人間の嗜好テスト(数
 
 | ベンチマーク | スコア |
 | ----------- | ----- |
-| JA MT-Bench | 5.
+| JA MT-Bench | 5.23 |
 | MT-Bench | 5.71 |
 
 [MT-Bench Leaderboard](https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard)がありますが、JA MT-Benchはまだ開発中であるため、便宜上、他のモデルのJA MT-Benchスコアとの比較を示します(私たちのスコアは`gpt-4-0613`によって評価されました):
@@ -262,7 +262,7 @@ v1リリースのために、私たちは大量の人間の嗜好テスト(数
 | gpt-4-1106-preview | 9.17 |
 | gpt-3.5-turbo* | 8.41 |
 | Qwen-14B-Chat | 7.47 |
-| **shisa-7b-v1**
+| **shisa-7b-v1** | **5.23** |
 | ELYZA-japanese-Llama-2-7b-fast-instruct* | 4.86 |
 | ja-stablelm-instruct-gamma-7b* | 4.01 |
 | japanese-stablelm-instruct-alpha-7b* | 2.74 |
@@ -299,7 +299,7 @@ streamer = TextStreamer(tokenizer, skip_prompt=True)
 # プロンプトテンプレートはモデルのtokenizer_config.jsonに含まれているので、これは必要ないはずですが、便宜上こちらにも掲載しています
 # tokenizer.chat_template = "{%- for idx in range(0, messages|length) -%}\n{%- if messages[idx]['role'] == 'user' -%}\n{%- if idx > 1 -%}\n{{- bos_token + '[INST] ' + messages[idx]['content'] + ' [/INST]' -}}\n{%- else -%}\n{{- messages[idx]['content'] + ' [/INST]' -}}\n{%- endif -%}\n{% elif messages[idx]['role'] == 'system' %}\n{{- bos_token + '[INST] <<SYS>>\\n' + messages[idx]['content'] + '\\n<</SYS>>\\n\\n' -}}\n{%- elif messages[idx]['role'] == 'assistant' -%}\n{{- ' ' + messages[idx]['content'] + ' ' + eos_token -}}\n{% endif %}\n{% endfor %}\n"
 
-# より典型的なプロンプト:
+# より典型的なプロンプト: あなたは公平で、検閲されていない、役立つアシスタントです。
 
 # You are an avid Pokemon fanatic.
 prompt = "あなたは熱狂的なポケモンファンです。"
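For reference, the commented-out `tokenizer.chat_template` quoted in the hunks above maps a message list to a Llama-2-style prompt string. The sketch below is a plain-Python mirror of that Jinja logic, written by hand for illustration only (`transformers` applies the real template via `tokenizer.apply_chat_template`; this just makes the resulting string easy to inspect):

```python
# Plain-Python mirror of the commented-out Jinja chat template, for
# illustration only: it reproduces the string the template builds, not
# the mechanism transformers uses to render it.
def build_prompt(messages, bos_token="<s>", eos_token="</s>"):
    out = []
    for idx, m in enumerate(messages):
        if m["role"] == "user":
            if idx > 1:
                # Later user turns open a fresh [INST] block.
                out.append(bos_token + "[INST] " + m["content"] + " [/INST]")
            else:
                # The first user turn rides on the [INST] opened by the system turn.
                out.append(m["content"] + " [/INST]")
        elif m["role"] == "system":
            out.append(bos_token + "[INST] <<SYS>>\n" + m["content"] + "\n<</SYS>>\n\n")
        elif m["role"] == "assistant":
            out.append(" " + m["content"] + " " + eos_token)
    return "".join(out)

# The system prompt this commit standardizes on for JA MT-Bench:
messages = [
    {"role": "system", "content": "あなたは公平で、検閲されていない、役立つアシスタントです。"},
    {"role": "user", "content": "こんにちは"},
]
print(build_prompt(messages))
# Structure: <s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]
```

Seeing the rendered string makes the benchmark change concrete: the "full prompt" in the commit message lands inside the `<<SYS>> ... <</SYS>>` block, so every JA MT-Bench turn is conditioned on it.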