Update README.md
README.md
CHANGED
@@ -60,7 +60,7 @@ As of 1 Oct 2024, Llama-3.1-Nemotron-70B-Instruct performs best on Arena Hard, A
 
 | Model | Arena Hard | AlpacaEval | MT-Bench | Mean Response Length |
 |:-----------------------------|:----------------|:-----|:----------|:-------|
-|Details | (95% CI) | 2 LC (SE) | (GPT-4-Turbo) | (# of Characters)|
+|Details | (95% CI) | 2 LC (SE) | (GPT-4-Turbo) | (# of Characters for MT Bench)|
 | _**Llama-3.1-Nemotron-70B-Instruct**_ | **85.0** (-1.5, 1.5) | **57.6** (1.65) | **8.98** | 2199.8 |
 | Llama-3.1-70B-Instruct | 55.7 (-2.9, 2.7) | 38.1 (0.90) | 8.22 | 1728.6 |
 | Llama-3.1-405B-Instruct | 69.3 (-2.4, 2.2) | 39.3 (1.43) | 8.49 | 1664.7 |