arcee-ai
/

raspberry-3B

Text Generation

Generated from Trainer

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

raspberry-3B / README.md

qnguyen3's picture

Adding Evaluation Results (#2)

40e603a verified 10 days ago

|

1.64 kB

	---
	license: other
	library_name: transformers
	tags:
	- generated_from_trainer
	base_model: Qwen/Qwen2.5-3B
	license_name: qwen-research
	license_link: https://huggingface.co/Qwen/Qwen2.5-3B-Instruct/blob/main/LICENSE
	model-index:
	- name: outputs/gelato-3b
	results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	Prompt Format: ChatML

	This is an experimental which was heavily optimized for reasoning tasks and not meant for production-use.

	GGUFs: https://huggingface.co/mradermacher/raspberry-3B-GGUF

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/630430583926de1f7ec62c6b/L45Szb9WeV-K_bxS8aFoj.png)

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/630430583926de1f7ec62c6b/GQtNdAaoXZXwf4noU883B.png)

	\| Metric \|Value\|
	\|-------------------\|----:\|
	\|Avg. \|29.79\|
	\|IFEval (0-Shot) \|32.12\|
	\|BBH (3-Shot) \|42.23\|
	\|MATH Lvl 5 (4-Shot)\| 8.16\|
	\|GPQA (0-shot) \|27.10\|
	\|MuSR (0-shot) \|40.61\|
	\|MMLU-PRO (5-shot) \|28.49\|


	# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
	Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_arcee-ai__raspberry-3B)

	\| Metric \|Value\|
	\|-------------------\|----:\|
	\|Avg. \|15.40\|
	\|IFEval (0-Shot) \|31.54\|
	\|BBH (3-Shot) \|19.53\|
	\|MATH Lvl 5 (4-Shot)\| 7.63\|
	\|GPQA (0-shot) \| 3.69\|
	\|MuSR (0-shot) \| 9.41\|
	\|MMLU-PRO (5-shot) \|20.60\|