Open source benchmarks?

#7
by jweissenberger - opened

Has anyone measured this model on open-source LLM benchmarks? The 7-billion-parameter MPT model is on the HF LLM leaderboard here, and some metrics are found in this blog, but I'd like to see the performance on things like HellaSwag, WinoGrande, PIQA, MMLU, or similar benchmarks if they're available.

Edit: I found the MMLU score here: 47.8.

You can see all of the benchmarks you mentioned in the blog here.

jfrankle changed discussion status to closed

@jfrankle Those metrics are for your base MPT model, not the instruction-tuned version.
