spmurrayzzz committed on
Commit
6780f05
1 Parent(s): d95d34d

Update README.md

  1. README.md +11 -0
README.md CHANGED
@@ -26,3 +26,14 @@ the training dynamics specific to large language models. The dataset used in fin
  a "syndicate" of other open language models both of similar parameter size and larger. Each model would generate a
  response for a given instruction, and the group would vote on which model's response was best.
 
+ ## Evaluation Results
+ _12.30.23_
+ | Benchmark | Result |
+ |------------|--------|
+ | ARC | 60.84 |
+ | HellaSwag | 82.91 |
+ | MMLU | 60.83 |
+ | TruthfulQA | 43.71 |
+ | Winogrande | 78.61 |
+ | GSM8K | 44.50 |
+