fdqerq22ds/MathScale-Mistral

Overview

This model is a reproduction of MathScale-Mistral, obtained by fine-tuning Mistral-7B-v0.1 on our reproduced MathScaleQA-2M dataset. We follow the hyperparameters reported in the original MathScale paper to keep the reproduction faithful.

Reproduction Details

The reproduction went smoothly, and our model closely matches the results reported on MWPBench, the benchmark from the original paper: the micro average differs by 0.4 points (38.3 vs. 38.7) and the macro average by 0.3 points (40.5 vs. 40.8). The table below compares the official model with our reproduced model:

Model	GSM8K	MATH	CollegeMath	TAL	Math23k	Ape210k	GaokaoBench-Math	AGIE-Gaokao-Math	AGIE-SAT-Math	AGIE-MATH	MicroAverage	MacroAverage
Official MathScale-Mistral	74.8	35.2	21.8	39.9	64.4	46.0	21.4	14.3	57.8	32.9	38.7	40.8
Reproduced MathScale-Mistral	74.0	34.5	22.0	39.6	61.7	45.1	21.6	15.5	56.8	34.4	38.3	40.5
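
As a sanity check on the table, the short sketch below recomputes the macro average, assuming it is the unweighted mean of the ten per-dataset scores. The function names are illustrative and not part of any MWPBench tooling. The micro average is not recomputed because it depends on the number of problems in each dataset, which this card does not list.

```python
# Per-dataset accuracies copied from the table above
# (the MicroAverage and MacroAverage columns are excluded).
DATASETS = [
    "GSM8K", "MATH", "CollegeMath", "TAL", "Math23k", "Ape210k",
    "GaokaoBench-Math", "AGIE-Gaokao-Math", "AGIE-SAT-Math", "AGIE-MATH",
]
official = [74.8, 35.2, 21.8, 39.9, 64.4, 46.0, 21.4, 14.3, 57.8, 32.9]
reproduced = [74.0, 34.5, 22.0, 39.6, 61.7, 45.1, 21.6, 15.5, 56.8, 34.4]


def macro_average(scores):
    """Unweighted mean over datasets (assumed definition of MacroAverage)."""
    return sum(scores) / len(scores)


def largest_gap(names, ours, theirs):
    """Return (dataset, ours - theirs) for the largest absolute difference."""
    gaps = [(name, o - t) for name, o, t in zip(names, ours, theirs)]
    return max(gaps, key=lambda pair: abs(pair[1]))


if __name__ == "__main__":
    print(f"official macro average:   {macro_average(official):.2f}")
    print(f"reproduced macro average: {macro_average(reproduced):.2f}")
    name, gap = largest_gap(DATASETS, reproduced, official)
    print(f"largest per-dataset gap:  {name} ({gap:+.1f})")
```

Under that assumption, the recomputed values agree with the MacroAverage column to within rounding, and the largest single-dataset difference is on Math23k.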
