Commit 5326ae8
Parent(s): ec1bdaa
Update README.md (#1)
- Update README.md (ac20a97d35d8d6ee4136516d97319b3d730531fd)
Co-authored-by: Alexandre Marques <[email protected]>
README.md
CHANGED
@@ -47,13 +47,13 @@ Model evaluation metrics and results.
 
 | Benchmark | Metric | Llama-2-7b | Llama-2-7b-pruned50-retrained |
 |------------------------------------------------|---------------|-------------|-------------------------------|
-| [MMLU](https://arxiv.org/abs/2009.03300) | 5-shot
-| [HellaSwag](https://arxiv.org/abs/1905.07830) | 0-shot |
-| [WinoGrande](https://arxiv.org/abs/1907.10641) |
-| [ARC-c](https://arxiv.org/abs/1911.01547) |
-| [TruthfulQA](https://arxiv.org/abs/2109.07958) | 5-shot |
-| [
-| [
+| [MMLU](https://arxiv.org/abs/2009.03300) | 5-shot | 46.9% | 41.3% |
+| [HellaSwag](https://arxiv.org/abs/1905.07830) | 0-shot | 78.6% | 76.5% |
+| [WinoGrande](https://arxiv.org/abs/1907.10641) | 5-shot | 74.0% | 72.1% |
+| [ARC-c](https://arxiv.org/abs/1911.01547) | 25-shot | 53.1% | 49.8% |
+| [TruthfulQA](https://arxiv.org/abs/2109.07958) | 5-shot | 38.8% | 37.7% |
+| [GSM8K](https://arxiv.org/abs/2110.14168) | 5-shot | 14.5% | 9.17% |
+| [HumanEval](https://arxiv.org/abs/2107.03374) | pass@1 | 13.4% | 14.7% |
 
 ## Model Training Details
 