Commit 3298a17, committed by Muennighoff
1 parent: bb3556d
Add arrows for code evaluation
README.md CHANGED
@@ -2314,9 +2314,9 @@ See this repository for JSON files: https://github.com/bigscience-workshop/evalu
 | winogrande | eng | acc ↑ | 0.71 | 0.736 |
 | wnli (Median of 6 prompts) | eng | acc ↑ | 0.57 | 0.563 |
 | wsc (Median of 11 prompts) | eng | acc ↑ | 0.519 | 0.413 |
-| humaneval | python | pass@1 | 0.155 | 0.0 |
-| humaneval | python | pass@10 | 0.322 | 0.0 |
-| humaneval | python | pass@100 | 0.555 | 0.003 |
+| humaneval | python | pass@1 ↑ | 0.155 | 0.0 |
+| humaneval | python | pass@10 ↑ | 0.322 | 0.0 |
+| humaneval | python | pass@100 ↑ | 0.555 | 0.003 |
 
 
 **Train-time Evaluation:**