Introducing JiviMed-8B_v1: The Cutting-Edge Biomedical Language Model
JiviMed-8B is a language model tailored specifically for the biomedical sector. Developed by Jivi AI, it incorporates recent advances in training and data curation to deliver strong performance across a wide range of biomedical applications.
Tailored for Medicine: JiviMed-8B is meticulously designed to cater to the specialized language and knowledge requirements of the medical and life sciences industries. It has been fine-tuned using an extensive collection of high-quality biomedical data, enhancing its ability to accurately comprehend and generate domain-specific text.
Unmatched Performance: With 8 billion parameters, JiviMed-8B outperforms other open-source biomedical language models of similar size, and surpasses larger proprietary and open-source models such as GPT-3.5, Meditron-70B, and Gemini 1.0 on several biomedical benchmarks.
Enhanced Training Methodology: JiviMed-8B builds on the Meta-Llama-3-8B base model, combining a specially curated, diverse medical dataset with the ORPO (Odds Ratio Preference Optimization) fine-tuning strategy; a minimal training sketch follows the list below. Key elements of our training process include:
1. Intensive Data Preparation: More than 100,000 data points were meticulously curated to ensure the model is well versed in the nuances of biomedical language.
2. Hyperparameter Tuning: Hyperparameters were carefully optimized to enhance learning efficiency while avoiding catastrophic forgetting, maintaining robust performance across tasks.
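The sketch below shows, in broad strokes, how an ORPO fine-tune of Meta-Llama-3-8B can be set up with the TRL library. It is a minimal illustration, not our exact training script: the dataset file (`medical_prefs.jsonl`) and every hyperparameter value shown are placeholders.

```python
# Minimal ORPO fine-tuning sketch using TRL's ORPOTrainer.
# NOTE: dataset path and all hyperparameter values are illustrative,
# not the exact configuration used for JiviMed-8B.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

base_id = "meta-llama/Meta-Llama-3-8B"
tokenizer = AutoTokenizer.from_pretrained(base_id)
tokenizer.pad_token = tokenizer.eos_token  # Llama 3 ships without a pad token
model = AutoModelForCausalLM.from_pretrained(base_id)

# ORPO trains on preference pairs: each row needs "prompt", "chosen", "rejected".
dataset = load_dataset("json", data_files="medical_prefs.jsonl", split="train")

config = ORPOConfig(
    output_dir="jivimed-orpo",
    beta=0.1,                        # odds-ratio penalty weight (illustrative)
    learning_rate=8e-6,              # illustrative
    per_device_train_batch_size=2,
    num_train_epochs=1,
    max_length=2048,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    tokenizer=tokenizer,
)
trainer.train()
```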
JiviMed-8B redefines what's possible in biomedical language modeling, setting new standards for accuracy, versatility, and performance in the medical domain.
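For reference, a minimal inference sketch with the Hugging Face transformers library is shown below. The repo id is a placeholder assumption; substitute the actual Hub id of the published checkpoint.

```python
# Minimal inference sketch with transformers.
# NOTE: "jiviai/JiviMed-8B_v1" is a placeholder repo id -- replace it with
# the actual Hugging Face Hub id of the released checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jiviai/JiviMed-8B_v1"  # placeholder, see note above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 8B params fit in ~16 GB of GPU memory in bf16
    device_map="auto",
)

prompt = "What are the first-line treatments for type 2 diabetes?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
# Strip the prompt tokens before decoding so only the answer is printed.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:],
                       skip_special_tokens=True))
```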
Model Comparison
| Model Name | Average | MedMCQA | MedQA | MMLU Anatomy | MMLU Clinical Knowledge | MMLU College Biology | MMLU College Medicine | MMLU Medical Genetics | MMLU Professional Medicine | PubMedQA |
|---|---|---|---|---|---|---|---|---|---|---|
| Jivi_medium_med_v1 | 75.53 | 60.1 | 60.04 | 77.04 | 82.26 | 86.81 | 73.41 | 86 | 80.08 | 72.6 |
| Flan-PaLM | 74.7 | 57.6 | 67.6 | 63.7 | 80.4 | 88.9 | 76.3 | 75 | 83.8 | 79 |
| winninghealth/WiNGPT2-Llama-3-8B-Base | 72.1 | 55.65 | 67.87 | 69.63 | 75.09 | 78.47 | 65.9 | 84 | 78.68 | 73.6 |
| meta-llama/Meta-Llama-3-8B | 69.9 | 57.47 | 59.7 | 68.89 | 74.72 | 78.47 | 61.85 | 83 | 70.22 | 74.8 |
| meta-llama/Meta-Llama-3-8B | 69.81 | 57.69 | 60.02 | 68.89 | 74.72 | 78.47 | 60.12 | 83 | 70.22 | 75.2 |
| unsloth/gemma-7b | 64.18 | 48.96 | 47.21 | 59.26 | 69.81 | 79.86 | 60.12 | 70 | 66.18 | 76.2 |
| mistralai/Mistral-7B-v0.1 | 62.85 | 48.2 | 50.82 | 55.56 | 68.68 | 68.06 | 59.54 | 71 | 68.38 | 75.4 |
| BioMistral/BioMistral-7B-Zephyr-Beta-SLeRP | 61.52 | 46.52 | 50.2 | 55.56 | 63.02 | 65.28 | 61.27 | 72 | 63.24 | 76.6 |
| BioMistral/BioMistral-7B-SLERP | 59.58 | 44.13 | 47.29 | 51.85 | 66.42 | 65.28 | 58.96 | 69 | 55.88 | 77.4 |
| BioMistral/BioMistral-7B-DARE | 59.45 | 44.66 | 47.37 | 53.33 | 66.42 | 62.5 | 58.96 | 68 | 56.25 | 77.6 |
| OpenModels4all/gemma-1-7b-it | 58.37 | 44.56 | 45.01 | 52.59 | 62.64 | 68.75 | 57.23 | 67 | 55.15 | 72.4 |
| medalpaca/medalpaca-7b | 58.03 | 37.51 | 41.71 | 57.04 | 57.36 | 65.28 | 54.34 | 69 | 67.28 | 72.8 |
| BioMistral/BioMistral-7B | 56.36 | 41.48 | 46.11 | 51.11 | 63.77 | 61.11 | 53.76 | 66 | 52.94 | 71 |
Hyperparameters:
PEFT (see the LoraConfig sketch after these lists)
- lora_r: 64
- lora_alpha: 128
- lora_dropout: 0.05
- lora_target_linear: true
Target_Modules
- q_proj
- v_proj
- k_proj
- o_proj
- gate_proj
- down_proj
- up_proj
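For illustration, the values above map directly onto a `peft.LoraConfig` as sketched below. The numeric values and target modules come from the lists above; the `task_type` and the base-model wiring in this snippet are assumptions.

```python
# Minimal sketch mapping the listed hyperparameters onto peft.LoraConfig.
# Listing the seven projection modules explicitly is equivalent to
# lora_target_linear: true for a Llama-style architecture.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj",
                    "gate_proj", "down_proj", "up_proj"],
    task_type="CAUSAL_LM",  # assumption: causal LM fine-tuning
)

base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # sanity-check the LoRA parameter count
```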