Update README.md

For more details on the training mixture, read the paper: [Camels in a Changing Climate: Enhancing LM Adaptation with Tülu 2](https://arxiv.org/abs/2311.10702).
| Model | MMLU 5-shot | GSM8k 8-shot CoT | BBH 3-shot CoT | TydiQA 1-shot Gold Passage | Codex HumanEval Pass@10 | AlpacaEval 1 | AlpacaEval 2 LC | TruthfulQA %Info+True | IFEval loose acc | XSTest safe but ref. | XSTest unsafe but follow | Average |
|-|-|-|-|-|-|-|-|-|-|-|-|-|
| [Llama 3 8b base](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | 0.649 | 0.565 | 0.653 | 66.80 | 0.664 | - | - | 0.299 | 0.146 | 0.200 | 0.390 | 54.36 |
| [Llama 3 8b instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) | 0.626 | 0.770 | 0.606 | 59.04 | 0.799 | 94.65 | 23.12 | 0.682 | 0.741 | 0.028 | 0.115 | 70.36 |
| **[Llama 3 Tulu 2 8b](https://huggingface.co/allenai/llama-3-tulu-2-8b) (this model)** | 0.606 | 0.610 | 0.592 | 56.24 | 0.685 | 79.40 | 10.16 | 0.503 | 0.468 | 0.092 | 0.165 | 59.39 |
| [Llama 3 Tulu 2+DPO 8b](https://huggingface.co/allenai/llama-3-tulu-2-dpo-8b) | 0.609 | 0.650 | 0.584 | 21.18 | 0.688 | 93.02 | 13.94 | 0.698 | 0.518 | 0.092 | 0.165 | 59.61 |
| [Llama 3 70b base](https://huggingface.co/meta-llama/Meta-Llama-3-70B) | 0.790 | 0.840 | 0.801 | 73.35 | 0.745 | - | - | 0.469 | 0.163 | 0.256 | 0.330 | 65.60 |
| [Llama 3 70b instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct) | 0.786 | 0.930 | 0.801 | 59.21 | 0.908 | 96.71 | 39.99 | 0.701 | 0.828 | 0.060 | 0.140 | 79.22 |
| [Llama 3 Tulu 2 70b](https://huggingface.co/allenai/llama-3-tulu-2-70b) | 0.752 | 0.845 | 0.779 | 69.798 | 0.861 | 86.007 | 17.51 | 0.646 | 0.591 | 0.108 | 0.130 | 73.01 |
| [Llama 3 Tulu 2+DPO 70b](https://huggingface.co/allenai/llama-3-tulu-2-dpo-70b) | 0.754 | 0.860 | 0.785 | 23.443 | 0.878 | 96.65 | 27.34 | 0.780 | 0.643 | 0.080 | 0.140 | 71.60 |

We also release reward models based on Llama 3 8b and 70b, respectively:

- [Llama 3 Tulu 2 8b UltraFeedback RM](https://huggingface.co/allenai/llama-3-tulu-2-8b-uf-mean-rm)
- [Llama 3 Tulu 2 70b UltraFeedback RM](https://huggingface.co/allenai/llama-3-tulu-2-70b-uf-mean-rm)

## Input Format
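As a rough illustration, Tulu-family models expect prompts wrapped in `<|user|>`/`<|assistant|>` turn tags. The helper below is a minimal sketch of that convention, not the authoritative template; the tokenizer's built-in chat template for the model you load should be treated as the source of truth.

```python
def format_tulu_prompt(message: str) -> str:
    """Wrap a single user message in the Tulu-style chat format.

    Sketch only: assumes the <|user|>/<|assistant|> tag convention used by
    Tulu models, with a trailing newline after <|assistant|> so generation
    starts on a fresh line. Check the model's tokenizer chat template for
    the exact format.
    """
    return f"<|user|>\n{message}\n<|assistant|>\n"


prompt = format_tulu_prompt("What is the capital of France?")
```

Passing `prompt` to the model as a plain string (rather than a bare question) keeps generation consistent with how the model was fine-tuned.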