Files changed (1)
  1. README.md +11 -15
README.md CHANGED
@@ -134,9 +134,18 @@ This model is an advanced iteration of the powerful `Qwen/Qwen2.5-3B`, specifica
  All GGUF models are available here: [MaziyarPanahi/calme-3.3-instruct-3b-GGUF](https://huggingface.co/MaziyarPanahi/calme-3.3-instruct-3b-GGUF)
 
 
- # 🏆 [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ # 🏆 [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_MaziyarPanahi__calme-3.3-instruct-3b)
 
- Leaderboard 2 coming soon!
+ | Metric |Value|
+ |-------------------|----:|
+ |Avg. |21.55|
+ |IFEval (0-Shot) |64.23|
+ |BBH (3-Shot) |25.68|
+ |MATH Lvl 5 (4-Shot)| 0.00|
+ |GPQA (0-shot) | 4.36|
+ |MuSR (0-shot) | 9.40|
+ |MMLU-PRO (5-shot) |25.62|
 
 
  # Prompt Template
@@ -183,16 +192,3 @@ model = AutoModelForCausalLM.from_pretrained("MaziyarPanahi/calme-3.3-instruct-3
  # Ethical Considerations
 
  As with any large language model, users should be aware of potential biases and limitations. We recommend implementing appropriate safeguards and human oversight when deploying this model in production environments.
- # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
- Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_MaziyarPanahi__calme-3.3-instruct-3b)
-
- | Metric |Value|
- |-------------------|----:|
- |Avg. |21.55|
- |IFEval (0-Shot) |64.23|
- |BBH (3-Shot) |25.68|
- |MATH Lvl 5 (4-Shot)| 0.00|
- |GPQA (0-shot) | 4.36|
- |MuSR (0-shot) | 9.40|
- |MMLU-PRO (5-shot) |25.62|
-
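
Review note: the `Avg.` row in the results table that this change relocates looks like the unweighted mean of the six benchmark scores. The diff does not say how the average is computed, so the sketch below treats that as an assumption. It copies the scores from the table and recomputes the mean; the variable names are illustrative only.

```python
# Benchmark scores copied from the results table in this diff.
scores = {
    "IFEval (0-Shot)": 64.23,
    "BBH (3-Shot)": 25.68,
    "MATH Lvl 5 (4-Shot)": 0.00,
    "GPQA (0-shot)": 4.36,
    "MuSR (0-shot)": 9.40,
    "MMLU-PRO (5-shot)": 25.62,
}

# Assumption: "Avg." is the plain arithmetic mean of the six benchmarks.
average = sum(scores.values()) / len(scores)

print(f"Avg. = {average:.2f}")
```

Under that assumption the recomputed value rounds to 21.55, which agrees with the `Avg.` row, so the table appears internally consistent after the move.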