Commit f82db5d
Parent(s): 6b722ae
Added Contamination Evidence from GPT4 Tech Report using String matching on GPT-4 (#11)
- Update contamination_report.csv (501d7b669b6ed6b3242fa1b2edc0f2e0998f8ddf)
- Update contamination_report.csv (ea371e52809b1765ff90fa3f59e3111253e32cf3)
- Update contamination_report.csv (5461dcb28b875e2b4c29ec19ea39e8206868fc12)
- Update contamination_report.csv (47ead6832a18291cc0ba377ab9db3edc3c9e3739)
- Merge branch 'main' of https://huggingface.co/spaces/CONDA-Workshop/Data-Contamination-Report into pr/11 (ec0bb5d49f82f4302719206ee204bc427698418f)
Co-authored-by: Ameya Prabhu <[email protected]>
- contamination_report.csv +8 -0
contamination_report.csv
CHANGED
@@ -483,6 +483,14 @@ RadNLI;;GPT-4;model;0.0;0.0;0.0;model-based;https://arxiv.org/pdf/2308.08493;8
 RadNLI;;GPT-3.5;model;0.0;0.0;0.0;model-based;https://arxiv.org/pdf/2308.08493;8
 
 
+openai_humaneval;;GPT-4;model;;;25.0;data-based;https://arxiv.org/abs/2303.08774;11
+ucinlp/drop;;GPT-4;model;;21.0;;data-based;https://arxiv.org/abs/2303.08774;11
+bigbench;;GPT-4;model;;;100.0;data-based;https://arxiv.org/abs/2303.08774;11
+gsm8k;;GPT-4;model;100.0;;1.0;data-based;https://arxiv.org/abs/2303.08774;11
+EleutherAI/hendrycks_math;;GPT-4;model;100.0;;;data-based;https://arxiv.org/abs/2303.08774;11
+cais/mmlu;;GPT-4;model;;;0.6;data-based;https://arxiv.org/abs/2303.08774;11
+ibragim-bad/arc_challenge;;GPT-4;model;;;3.4;data-based;https://arxiv.org/abs/2303.08774;11
+winogrande;;GPT-4;model;;;0.9;data-based;https://arxiv.org/abs/2303.08774;11
 cais/mmlu;;GPT-3.5;model;;;52.0;model-based;https://arxiv.org/abs/2311.09783;10
 winogrande;;GPT-3.5;model;;;9.0;model-based;https://arxiv.org/abs/2311.09783;10
 truthful_qa;;GPT-3.5;model;;;12.0;model-based;https://arxiv.org/abs/2311.09783;10
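Reviewer note: the added rows are semicolon-delimited with ten fields each, and several numeric fields are left empty. The sketch below is one way to sanity-check them with Python's standard `csv` module. The CSV header is not part of this diff, so the column names in `FIELDS` are hypothetical labels chosen for illustration. In particular, reading fields 5 to 7 as train, dev, and test contamination percentages is an assumption.

```python
import csv
import io

# The eight rows added by this commit, copied verbatim from the diff.
ADDED_ROWS = """\
openai_humaneval;;GPT-4;model;;;25.0;data-based;https://arxiv.org/abs/2303.08774;11
ucinlp/drop;;GPT-4;model;;21.0;;data-based;https://arxiv.org/abs/2303.08774;11
bigbench;;GPT-4;model;;;100.0;data-based;https://arxiv.org/abs/2303.08774;11
gsm8k;;GPT-4;model;100.0;;1.0;data-based;https://arxiv.org/abs/2303.08774;11
EleutherAI/hendrycks_math;;GPT-4;model;100.0;;;data-based;https://arxiv.org/abs/2303.08774;11
cais/mmlu;;GPT-4;model;;;0.6;data-based;https://arxiv.org/abs/2303.08774;11
ibragim-bad/arc_challenge;;GPT-4;model;;;3.4;data-based;https://arxiv.org/abs/2303.08774;11
winogrande;;GPT-4;model;;;0.9;data-based;https://arxiv.org/abs/2303.08774;11
"""

# Hypothetical column names: the real header row is not shown in this diff.
FIELDS = [
    "dataset", "subset", "source", "source_type",
    "train", "dev", "test",  # assumed to be per-split percentages
    "approach", "reference", "pr",
]


def parse_rows(text):
    """Parse semicolon-delimited rows, mapping empty numeric fields to None."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=FIELDS, delimiter=";")
    rows = []
    for row in reader:
        for split in ("train", "dev", "test"):
            row[split] = float(row[split]) if row[split] else None
        rows.append(row)
    return rows


rows = parse_rows(ADDED_ROWS)

# Every added row should carry ten fields and point at PR 11.
for row in rows:
    assert None not in row and all(key in row for key in FIELDS)
    assert row["pr"] == "11"
```

Keeping empty fields as `None` rather than `0.0` preserves the difference between "not reported" and "reported as zero", which matters for rows such as `ucinlp/drop` where only one of the three numeric fields is filled.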