Commit
5651ff5
1 Parent(s): 2adbdae

Adding Evaluation Results (#15)

- Adding Evaluation Results (69837072a7138c8ea4c8585c87fb9563e5eb9c80)


Co-authored-by: Open LLM Leaderboard PR Bot <[email protected]>

Files changed (1)
  1. README.md +14 -1
README.md CHANGED
@@ -10,7 +10,6 @@ tags:
  - finetune
  - chatml
  base_model: Qwen/Qwen2-72B-Instruct
- model_name: MaziyarPanahi/calme-2.1-qwen2-72b
  license_name: tongyi-qianwen
  license_link: https://huggingface.co/Qwen/Qwen2-72B-Instruct/blob/main/LICENSE
  pipeline_tag: text-generation
@@ -210,3 +209,17 @@ model = AutoModelForCausalLM.from_pretrained("MaziyarPanahi/calme-2.1-qwen2-72b"
  # Ethical Considerations

  As with any large language model, users should be aware of potential biases and limitations. We recommend implementing appropriate safeguards and human oversight when deploying this model in production environments.
+
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_MaziyarPanahi__calme-2.1-qwen2-72b)
+
+ |      Metric       |Value|
+ |-------------------|----:|
+ |Avg.               |43.61|
+ |IFEval (0-Shot)    |81.63|
+ |BBH (3-Shot)       |57.33|
+ |MATH Lvl 5 (4-Shot)|36.03|
+ |GPQA (0-shot)      |17.45|
+ |MuSR (0-shot)      |20.15|
+ |MMLU-PRO (5-shot)  |49.05|
+
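
The `Avg.` row in the added table appears to be the unweighted arithmetic mean of the six benchmark scores, rounded to two decimals. The commit does not state how the leaderboard aggregates scores, so treat that as an assumption. A minimal Python sketch to check it against the values in the table (the `scores` dictionary and variable names are illustrative, not part of the commit):

```python
# Benchmark scores copied from the table added to README.md in this commit.
scores = {
    "IFEval (0-Shot)": 81.63,
    "BBH (3-Shot)": 57.33,
    "MATH Lvl 5 (4-Shot)": 36.03,
    "GPQA (0-shot)": 17.45,
    "MuSR (0-shot)": 20.15,
    "MMLU-PRO (5-shot)": 49.05,
}

# Assumption: "Avg." is a plain (unweighted) mean over the six benchmarks.
average = round(sum(scores.values()) / len(scores), 2)

reported_average = 43.61  # the "Avg." row in the table
print(f"computed={average}, reported={reported_average}")
```

Under that assumption the computed mean agrees with the reported `Avg.` value, which suggests the table is internally consistent.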