OSainz AmeyaPrabhu committed on
Commit f82db5d • 1 Parent(s): 6b722ae

Added Contamination Evidence from GPT4 Tech Report using String matching on GPT-4 (#11)

- Update contamination_report.csv (501d7b669b6ed6b3242fa1b2edc0f2e0998f8ddf)
- Update contamination_report.csv (ea371e52809b1765ff90fa3f59e3111253e32cf3)
- Update contamination_report.csv (5461dcb28b875e2b4c29ec19ea39e8206868fc12)
- Update contamination_report.csv (47ead6832a18291cc0ba377ab9db3edc3c9e3739)
- Merge branch 'main' of https://huggingface.co/spaces/CONDA-Workshop/Data-Contamination-Report into pr/11 (ec0bb5d49f82f4302719206ee204bc427698418f)


Co-authored-by: Ameya Prabhu <[email protected]>

Files changed (1)
  1. contamination_report.csv +8 -0
contamination_report.csv CHANGED
@@ -483,6 +483,14 @@ RadNLI;;GPT-4;model;0.0;0.0;0.0;model-based;https://arxiv.org/pdf/2308.08493;8
 RadNLI;;GPT-3.5;model;0.0;0.0;0.0;model-based;https://arxiv.org/pdf/2308.08493;8
 
 
+openai_humaneval;;GPT-4;model;;;25.0;data-based;https://arxiv.org/abs/2303.08774;11
+ucinlp/drop;;GPT-4;model;;21.0;;data-based;https://arxiv.org/abs/2303.08774;11
+bigbench;;GPT-4;model;;;100.0;data-based;https://arxiv.org/abs/2303.08774;11
+gsm8k;;GPT-4;model;100.0;;1.0;data-based;https://arxiv.org/abs/2303.08774;11
+EleutherAI/hendrycks_math;;GPT-4;model;100.0;;;data-based;https://arxiv.org/abs/2303.08774;11
+cais/mmlu;;GPT-4;model;;;0.6;data-based;https://arxiv.org/abs/2303.08774;11
+ibragim-bad/arc_challenge;;GPT-4;model;;;3.4;data-based;https://arxiv.org/abs/2303.08774;11
+winogrande;;GPT-4;model;;;0.9;data-based;https://arxiv.org/abs/2303.08774;11
 cais/mmlu;;GPT-3.5;model;;;52.0;model-based;https://arxiv.org/abs/2311.09783;10
 winogrande;;GPT-3.5;model;;;9.0;model-based;https://arxiv.org/abs/2311.09783;10
 truthful_qa;;GPT-3.5;model;;;12.0;model-based;https://arxiv.org/abs/2311.09783;10
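
The rows added above are semicolon-delimited with ten fields each. As a rough sketch of how they could be read back, the snippet below parses the eight new GPT-4 rows with Python's `csv` module. The header row is not visible in this diff, so the column names in `COLUMNS` are hypothetical labels guessed from the values (three percentage-like fields, an approach, a reference URL, and what looks like the PR number); they are assumptions, not the file's actual header.

```python
import csv
import io

# Hypothetical column names: the real header row is not shown in this diff.
COLUMNS = [
    "dataset", "subset", "source", "source_type",
    "train_pct", "dev_pct", "test_pct",
    "approach", "reference", "pr",
]

# The eight rows added by this commit, copied verbatim from the diff.
ADDED_ROWS = """\
openai_humaneval;;GPT-4;model;;;25.0;data-based;https://arxiv.org/abs/2303.08774;11
ucinlp/drop;;GPT-4;model;;21.0;;data-based;https://arxiv.org/abs/2303.08774;11
bigbench;;GPT-4;model;;;100.0;data-based;https://arxiv.org/abs/2303.08774;11
gsm8k;;GPT-4;model;100.0;;1.0;data-based;https://arxiv.org/abs/2303.08774;11
EleutherAI/hendrycks_math;;GPT-4;model;100.0;;;data-based;https://arxiv.org/abs/2303.08774;11
cais/mmlu;;GPT-4;model;;;0.6;data-based;https://arxiv.org/abs/2303.08774;11
ibragim-bad/arc_challenge;;GPT-4;model;;;3.4;data-based;https://arxiv.org/abs/2303.08774;11
winogrande;;GPT-4;model;;;0.9;data-based;https://arxiv.org/abs/2303.08774;11
"""


def parse_rows(text):
    """Parse semicolon-delimited rows into dicts keyed by COLUMNS.

    Empty percentage fields become None; non-empty ones become floats.
    """
    rows = []
    for fields in csv.reader(io.StringIO(text), delimiter=";"):
        if not fields:  # skip blank lines
            continue
        row = dict(zip(COLUMNS, fields))
        for key in ("train_pct", "dev_pct", "test_pct"):
            row[key] = float(row[key]) if row[key] else None
        rows.append(row)
    return rows


rows = parse_rows(ADDED_ROWS)
for row in rows:
    print(row["dataset"], row["train_pct"], row["dev_pct"], row["test_pct"])
```

Mapping empty fields to `None` rather than `0.0` keeps "not reported" distinct from a reported zero, which matters here because nearby rows such as RadNLI carry explicit `0.0` values.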