test3_sft_4bit / README.md
---
language:
  - en
license: cc-by-nc-4.0
tags:
  - text-generation-inference
  - transformers
  - unsloth
  - mistral
  - trl
base_model: alnrg2arg/blockchainlabs_7B_merged_test2_4
datasets:
  - Open-Orca/SlimOrca
---

Benchmark Scores

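| Tasks | Version | Filter | n-shot | Metric | Value | | Stderr |
|---|---|---|---|---|---|---|---|
| arc_challenge | 1 | none | 0 | acc | 0.5247 | ± | 0.0146 |
| | | none | 0 | acc_norm | 0.5623 | ± | 0.0145 |

| Tasks | Version | Filter | n-shot | Metric | Value | | Stderr |
|---|---|---|---|---|---|---|---|
| hellaswag | 1 | none | 0 | acc | 0.6270 | ± | 0.0048 |
| | | none | 0 | acc_norm | 0.8228 | ± | 0.0038 |

| Groups | Version | Filter | n-shot | Metric | Value | | Stderr |
|---|---|---|---|---|---|---|---|
| mmlu | N/A | none | 0 | acc | 0.6243 | ± | 0.1341 |
| - humanities | N/A | none | 0 | acc | 0.5717 | ± | 0.1400 |
| - other | N/A | none | 0 | acc | 0.7016 | ± | 0.1143 |
| - social_sciences | N/A | none | 0 | acc | 0.7342 | ± | 0.0753 |
| - stem | N/A | none | 0 | acc | 0.5192 | ± | 0.1257 |

| Tasks | Version | Filter | n-shot | Metric | Value | | Stderr |
|---|---|---|---|---|---|---|---|
| winogrande | 1 | none | 0 | acc | 0.7774 | ± | 0.0117 |

| Tasks | Version | Filter | n-shot | Metric | Value | | Stderr |
|---|---|---|---|---|---|---|---|
| gsm8k | 2 | get-answer | 5 | exact_match | 0.6732 | ± | 0.0129 |

| Tasks | Version | Filter | n-shot | Metric | Value | | Stderr |
|---|---|---|---|---|---|---|---|
| truthfulqa_mc2 | 2 | none | 0 | acc | 0.4795 | ± | 0.0148 |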

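Average: 65.658 (unweighted mean, in percent, of six scores: acc_norm for arc_challenge and hellaswag, acc for mmlu, winogrande, and truthfulqa_mc2, and exact_match for gsm8k)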
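
The reported average can be checked with a short script. This is a minimal sketch and not the author's own tooling: the choice of one metric per task (acc_norm where it is reported, otherwise acc or exact_match) is an assumption that happens to reproduce the figure, and the names `scores` and `average_percent` are hypothetical.

```python
# Scores copied from the benchmark tables above, as fractions in [0, 1].
# Assumption: one metric per task, preferring acc_norm where it is reported.
scores = {
    "arc_challenge (acc_norm, 0-shot)": 0.5623,
    "hellaswag (acc_norm, 0-shot)": 0.8228,
    "mmlu (acc, 0-shot)": 0.6243,
    "winogrande (acc, 0-shot)": 0.7774,
    "gsm8k (exact_match, 5-shot)": 0.6732,
    "truthfulqa_mc2 (acc, 0-shot)": 0.4795,
}


def average_percent(values):
    """Return the unweighted mean of fractional scores, in percent."""
    values = list(values)
    return 100.0 * sum(values) / len(values)


average = average_percent(scores.values())
print(f"Average: {average:.3f}")
```

Because the mean is unweighted, each benchmark contributes equally regardless of how many examples it contains.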