health360
/

Healix-1.1B-V1-Chat-dDPO

@@ -1,14 +1,117 @@
 ---
-datasets:
-- krvhrv/Healix-Medical-Shot
 language:
 - en
 tags:
 - medical
 - biology
 - chemistry
 - text-generation-inference
-license: apache-2.0
 ---
 # Healix 1.1B Model Card
@@ -31,4 +134,17 @@ Users are urged to use Healix 1.1B responsibly, considering the ethical implicat
 Details on the development, training, and evaluation of Healix 1.1B will be available in our forthcoming publications, offering insights into its creation and the advancements it brings to medical informatics.
 ### Input Format
-Use the Alpaca model format.

 ---
 language:
 - en
+license: apache-2.0
 tags:
 - medical
 - biology
 - chemistry
 - text-generation-inference
+datasets:
+- krvhrv/Healix-Medical-Shot
+model-index:
+- name: Healix-1.1B-V1-Chat-dDPO
+  results:
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: AI2 Reasoning Challenge (25-Shot)
+      type: ai2_arc
+      config: ARC-Challenge
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: acc_norm
+      value: 30.55
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=health360/Healix-1.1B-V1-Chat-dDPO
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: HellaSwag (10-Shot)
+      type: hellaswag
+      split: validation
+      args:
+        num_few_shot: 10
+    metrics:
+    - type: acc_norm
+      value: 44.78
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=health360/Healix-1.1B-V1-Chat-dDPO
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MMLU (5-Shot)
+      type: cais/mmlu
+      config: all
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 24.64
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=health360/Healix-1.1B-V1-Chat-dDPO
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: TruthfulQA (0-shot)
+      type: truthful_qa
+      config: multiple_choice
+      split: validation
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: mc2
+      value: 41.55
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=health360/Healix-1.1B-V1-Chat-dDPO
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: Winogrande (5-shot)
+      type: winogrande
+      config: winogrande_xl
+      split: validation
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 56.51
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=health360/Healix-1.1B-V1-Chat-dDPO
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: GSM8k (5-shot)
+      type: gsm8k
+      config: main
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 0.0
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=health360/Healix-1.1B-V1-Chat-dDPO
+      name: Open LLM Leaderboard
 ---
 # Healix 1.1B Model Card
 Details on the development, training, and evaluation of Healix 1.1B will be available in our forthcoming publications, offering insights into its creation and the advancements it brings to medical informatics.
 ### Input Format
+Use the Alpaca model format.
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_health360__Healix-1.1B-V1-Chat-dDPO)
+|             Metric              |Value|
+|---------------------------------|----:|
+|Avg.                             |33.00|
+|AI2 Reasoning Challenge (25-Shot)|30.55|
+|HellaSwag (10-Shot)              |44.78|
+|MMLU (5-Shot)                    |24.64|
+|TruthfulQA (0-shot)              |41.55|
+|Winogrande (5-shot)              |56.51|
+|GSM8k (5-shot)                   | 0.00|