Committed by lvkaokao
Commit fd81216 • 1 parent: 14f9323
update metric from llm leaderboard.
README.md CHANGED
@@ -11,13 +11,13 @@ Neural-chat-7b-v3 was trained between September and October, 2023.
 
 ## Evaluation
 
-We
+We submit our model to [open_llm_leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard), and the model performance has been **improved significantly** as we see from the average metric of 7 tasks from the leaderboard.
 
-| Model | Average ⬆️| ARC (25-s) ⬆️ | HellaSwag (10-s) ⬆️ | MMLU (5-s) ⬆️| TruthfulQA (MC) (0-s) ⬆️ |
-| --- | --- | --- | --- | --- | --- |
-|[mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) | 62.4 | 59.58 | 83.31 | 64.16 | 42.15 |
-| **Ours** | **67.92** | 66.29 | 83.28 | 62.11 | 60.02 |
-
+| Model | Average ⬆️| ARC (25-s) ⬆️ | HellaSwag (10-s) ⬆️ | MMLU (5-s) ⬆️| TruthfulQA (MC) (0-s) ⬆️ | Winogrande (5-s) | GSM8K (5-s) | DROP (3-s) |
+| --- | --- | --- | --- | --- | --- | --- | --- | --- |
+|[mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) | 50.32 | 59.58 | 83.31 | 64.16 | 42.15 | 78.37 | 18.12 | 6.14 |
+| [Intel/neural-chat-7b-v3](https://huggingface.co/Intel/neural-chat-7b-v3) | **57.31** | 67.15 | 83.29 | 62.26 | 58.77 | 78.06 | 1.21 | 50.43 |
+| [Intel/neural-chat-7b-v3](https://huggingface.co/Intel/neural-chat-7b-v3) | **59.06** | 66.21 | 83.64 | 62.37 | 59.65 | 78.14 | 19.56 | 43.84 |
 
 ## Training procedure
 
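
The "average metric of 7 tasks" in the added table can be sanity-checked by recomputing the mean of each row's seven per-task scores. The sketch below is illustrative only: it assumes the leaderboard average is a plain unweighted mean, and the helper name `mean_score` and the dictionary keys are hypothetical (the two neural-chat rows are keyed "first row" and "second row" because the diff gives them the same label).

```python
# Per-task scores copied from the added table rows, in column order:
# ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K, DROP.
ROWS = {
    "mistralai/Mistral-7B-v0.1": [59.58, 83.31, 64.16, 42.15, 78.37, 18.12, 6.14],
    "Intel/neural-chat-7b-v3 (first row)": [67.15, 83.29, 62.26, 58.77, 78.06, 1.21, 50.43],
    "Intel/neural-chat-7b-v3 (second row)": [66.21, 83.64, 62.37, 59.65, 78.14, 19.56, 43.84],
}


def mean_score(scores):
    """Unweighted mean rounded to two decimals (assumed averaging rule)."""
    return round(sum(scores) / len(scores), 2)


if __name__ == "__main__":
    for name, scores in ROWS.items():
        print(f"{name}: {mean_score(scores)}")
```

Under that assumption, the two Intel/neural-chat-7b-v3 rows reproduce their listed averages (57.31 and 59.06). The Mistral-7B-v0.1 row comes out at about 50.26 rather than the listed 50.32, which may reflect rounding or a different snapshot on the leaderboard side.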