EleutherAI/gpt-j-6b

Adding Evaluation Results

#38

by leaderboard-pr-bot - opened Nov 17, 2023

base: refs/heads/main ← from: refs/pr/38

README.md +14 -1

@@ -166,4 +166,17 @@ Thanks to everyone who have helped out one way or another (listed alphabetically
 - [Leo Gao](https://twitter.com/nabla_theta) for running zero shot evaluations for the baseline models for the table.
 - [Laurence Golding](https://github.com/researcher2/) for adding some features to the web demo.
 - [Aran Komatsuzaki](https://twitter.com/arankomatsuzaki) for advice with experiment design and writing the blog posts.
-- [Janko Prester](https://github.com/jprester/) for creating the web demo frontend.
+- [Janko Prester](https://github.com/jprester/) for creating the web demo frontend.
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_EleutherAI__gpt-j-6b)
+| Metric              | Value |
+|---------------------|-------|
+| Avg.                | 34.87 |
+| ARC (25-shot)       | 41.38 |
+| HellaSwag (10-shot) | 67.54 |
+| MMLU (5-shot)       | 26.78 |
+| TruthfulQA (0-shot) | 35.96 |
+| Winogrande (5-shot) | 65.98 |
+| GSM8K (5-shot)      | 1.82  |
+| DROP (3-shot)       | 4.62  |
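
The `Avg.` row appears to be the unweighted mean of the seven benchmark scores listed in the table. A minimal sketch to check that arithmetic, assuming a plain mean (the PR does not state how the average is computed); the `scores` and `average` names are illustrative and not part of the PR:

```python
# Scores copied from the table in this PR; names here are illustrative only.
scores = {
    "ARC (25-shot)": 41.38,
    "HellaSwag (10-shot)": 67.54,
    "MMLU (5-shot)": 26.78,
    "TruthfulQA (0-shot)": 35.96,
    "Winogrande (5-shot)": 65.98,
    "GSM8K (5-shot)": 1.82,
    "DROP (3-shot)": 4.62,
}

# Assumption: "Avg." is the unweighted mean of all seven benchmarks.
average = sum(scores.values()) / len(scores)

print(round(average, 2))  # 34.87, matching the Avg. row
```

Under that assumption the scores sum to 244.08, and 244.08 / 7 rounds to the reported 34.87.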