bigcode/bigcodebench-leaderboard
Terry Zhuo committed on Jun 20

Commit 889b484 • 1 Parent(s): 702e87d

update notes

Files changed (1)

app.py

@@ -248,7 +248,7 @@ with demo:
                         - <u>Instruct</u> (🔥Vibe Check🔥): Code Generation based on the (less verbose) NL-oriented instructions. This variant tests if the models are really capable enough to understand human intents to code.
                     - `complete` and `instruct` represent the calibrated Pass@1 score on the BigCodeBench benchmark variants.
                     - `elo_mle` represents the task-level Bootstrap of Maximum Likelihood Elo rating on `BigCodeBench-Complete`, which starts from 1000 and is boostrapped 500 times.
-                    - `size` (optional) is the amount of activated model weight during inference.
+                    - `size` is the amount of activated model weight during inference.
                     - Model providers have the responsibility to avoid data contamination. Models trained on close data can be affected by contamination.
                     - For more details check the 📝 About section.
                     """,
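
The `elo_mle` note in the context lines above describes a task-level bootstrap of a maximum-likelihood Elo rating that starts from 1000 and is resampled 500 times. The following is a minimal sketch of that idea, not the leaderboard's actual implementation. It assumes per-task pass/fail results for each model, treats a pass against a fail on the same task as a win, fits Bradley-Terry strengths with simple MM iterations, and reports the median over resampled task sets. The function names, the toy `results` data, and the 0.5 pseudo-count are all hypothetical.

```python
import math
import random
from itertools import combinations
from statistics import median


def count_wins(results, tasks, pseudo=0.5):
    """Pairwise win counts over the given tasks (a pass beats a fail).

    `pseudo` is an assumed smoothing term so no model has zero wins.
    """
    models = list(results)
    wins = {a: {b: pseudo for b in models if b != a} for a in models}
    for t in tasks:
        for a, b in combinations(models, 2):
            ra, rb = results[a][t], results[b][t]
            if ra and not rb:
                wins[a][b] += 1
            elif rb and not ra:
                wins[b][a] += 1
    return wins


def bt_elo(wins, models, base=1000.0, scale=400.0, iters=100):
    """Fit Bradley-Terry strengths by MM updates and map them to an Elo-like scale."""
    strength = {m: 1.0 for m in models}
    for _ in range(iters):
        new = {}
        for i in models:
            w_i = sum(wins[i][j] for j in models if j != i)
            denom = sum(
                (wins[i][j] + wins[j][i]) / (strength[i] + strength[j])
                for j in models if j != i
            )
            new[i] = w_i / denom
        # Normalise to geometric mean 1 so the average rating stays at `base`.
        log_mean = sum(math.log(v) for v in new.values()) / len(new)
        strength = {m: v / math.exp(log_mean) for m, v in new.items()}
    return {m: base + scale * math.log10(s) for m, s in strength.items()}


def bootstrap_elo(results, rounds=500, seed=0):
    """Resample tasks with replacement and return the median rating per model."""
    rng = random.Random(seed)
    models = list(results)
    tasks = list(next(iter(results.values())))
    samples = {m: [] for m in models}
    for _ in range(rounds):
        resampled = rng.choices(tasks, k=len(tasks))
        ratings = bt_elo(count_wins(results, resampled), models)
        for m in models:
            samples[m].append(ratings[m])
    return {m: median(v) for m, v in samples.items()}


# Toy data (hypothetical): True means the model passed the task.
results = {
    "model_a": {"t1": True, "t2": True, "t3": True, "t4": False},
    "model_b": {"t1": True, "t2": False, "t3": True, "t4": False},
    "model_c": {"t1": False, "t2": False, "t3": True, "t4": False},
}

if __name__ == "__main__":
    for name, rating in bootstrap_elo(results).items():
        print(f"{name}: {rating:.1f}")
```

Resampling at the task level (rather than at the level of individual model pairs) keeps each task's outcomes together, so the spread of the bootstrap samples reflects how sensitive the ranking is to the choice of tasks.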