pansophic committed
Commit 6b65196
1 Parent(s): a5ba4ce

Update benchmarks in README

Files changed (1)
  1. README.md +5 -4
README.md CHANGED
@@ -67,13 +67,14 @@ In AlpacaEval, Rocket 🦝 achieves a near 80% win rate, coupled with an average
 
 | Metric | Value |
 |-----------------------|---------------------------|
+| Average | 51.00 |
 | ARC (25-shot) | 50.51 |
-| HellaSwag (0-shot) | 73.91 |
-| TruthfulQA (mc2) (0-shot) | 54.38 |
-| BoolQ (0-shot) | 81.71 |
+| HellaSwag (10-shot) | 76.45 |
+| MMLU (5-shot) | 45.51 |
+| TruthfulQA (0-shot) | 54.38 |
 | Winogrande (5-shot) | 67.8 |
 | GSM8K (5-shot) | 37.91 |
-| MathQA (5-shot) | 31.26 |
+| DROP (3-shot) | 24.49 |
 
 
 ## Intended uses & limitations
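
A quick sanity check on the new Average row. The commit does not say how the value was computed, so this sketch assumes it is the unweighted mean of the seven benchmark rows in the updated table. The names `scores` and `reported_average` are illustrative only, and the tolerance accounts for the table showing rounded scores.

```python
# Scores copied from the updated table in this commit.
# Assumption: "Average" is the plain mean of these seven rows.
scores = {
    "ARC (25-shot)": 50.51,
    "HellaSwag (10-shot)": 76.45,
    "MMLU (5-shot)": 45.51,
    "TruthfulQA (0-shot)": 54.38,
    "Winogrande (5-shot)": 67.8,
    "GSM8K (5-shot)": 37.91,
    "DROP (3-shot)": 24.49,
}
reported_average = 51.00

mean = sum(scores.values()) / len(scores)

# The table lists rounded values, so compare with a small tolerance
# instead of requiring an exact match.
matches = abs(mean - reported_average) < 0.01

print(f"mean of listed scores: {mean:.4f}")
print(f"reported average:      {reported_average:.2f}")
print(f"consistent within rounding: {matches}")
```

Under that assumption the mean of the listed scores comes out near 51.007, which agrees with the reported 51.00 once rounding of the individual scores is taken into account.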