
Introducing JiviMed-8B_v1: The Cutting-Edge Biomedical Language Model

JiviMed-8B is a state-of-the-art language model tailored specifically for the biomedical sector. Developed by Jivi AI, it incorporates the latest advancements to deliver strong performance across a wide range of biomedical applications.

Tailored for Medicine: JiviMed-8B is meticulously designed to cater to the specialized language and knowledge requirements of the medical and life sciences industries. It has been fine-tuned using an extensive collection of high-quality biomedical data, enhancing its ability to accurately comprehend and generate domain-specific text.

Unmatched Performance: With 8 billion parameters, JiviMed-8B outperforms other open-source biomedical language models of similar size. It also surpasses larger models, both proprietary and open-source, such as GPT-3.5, Meditron-70B, and Gemini 1.0, on several biomedical benchmarks.

Enhanced Training Methodology: JiviMed-8B builds on the Meta-Llama-3-8B family of models, combining a specially curated, diverse medical dataset with an ORPO (Odds Ratio Preference Optimization) fine-tuning strategy; an illustrative training sketch follows the list below. Key elements of our training process include:

1. Intensive Data Preparation: More than 100,000 data points have been meticulously curated to ensure the model is well-versed in the nuances of biomedical language.
2. Hyperparameter Tuning: Hyperparameters have been carefully tuned to improve learning efficiency while avoiding catastrophic forgetting, maintaining robust performance across tasks.
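
The exact training pipeline has not been released; the following is a minimal sketch of the recipe described above, combining ORPO preference optimization (via the TRL library) with a LoRA adapter (via PEFT). The dataset name, ORPO beta, learning rate, batch size, and epoch count are placeholders, not the values actually used for JiviMed-8B.

```python
# Illustrative ORPO + LoRA fine-tuning recipe -- NOT the exact JiviMed-8B training code.
# Dataset name and optimizer settings below are placeholders.
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

base_model = "meta-llama/Meta-Llama-3-8B"
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype="auto", device_map="auto")

# ORPO expects preference data with "prompt", "chosen", and "rejected" columns.
dataset = load_dataset("my-org/curated-medical-preferences", split="train")  # placeholder dataset

# LoRA settings as listed under "Hyperparameters" further down this card;
# target_modules="all-linear" mirrors lora_target_linear: true.
peft_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)

orpo_args = ORPOConfig(
    output_dir="jivimed-8b-orpo",
    beta=0.1,                       # placeholder ORPO weighting
    learning_rate=8e-6,             # placeholder
    per_device_train_batch_size=2,  # placeholder
    num_train_epochs=1,             # placeholder
    max_length=2048,
    max_prompt_length=1024,
)

trainer = ORPOTrainer(
    model=model,
    args=orpo_args,
    train_dataset=dataset,
    tokenizer=tokenizer,  # renamed to processing_class= in newer TRL releases
    peft_config=peft_config,
)
trainer.train()
```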

JiviMed-8B redefines what's possible in biomedical language modeling, setting new standards for accuracy, versatility, and performance in the medical domain.
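
For quick experimentation, a minimal inference sketch with the Hugging Face transformers library is shown below; the repository id is a placeholder and should be replaced with the model's actual hosted name.

```python
# Minimal inference sketch with Hugging Face transformers.
# The repository id below is a placeholder -- substitute the actual hosted name of JiviMed-8B.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jiviai/JiviMed-8B_v1"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # the published weights are FP16
    device_map="auto",
)

prompt = "Question: What are the first-line treatments for type 2 diabetes?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```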

Model Comparison

| Model Name | Average | MedMCQA | MedQA | MMLU Anatomy | MMLU Clinical Knowledge | MMLU College Biology | MMLU College Medicine | MMLU Medical Genetics | MMLU Professional Medicine | PubMedQA |
|---|---|---|---|---|---|---|---|---|---|---|
| Jivi_medium_med_v1 | 75.53 | 60.1 | 60.04 | 77.04 | 82.26 | 86.81 | 73.41 | 86 | 80.08 | 72.6 |
| Flan-PaLM | 74.7 | 57.6 | 67.6 | 63.7 | 80.4 | 88.9 | 76.3 | 75 | 83.8 | 79 |
| winninghealth/WiNGPT2-Llama-3-8B-Base | 72.1 | 55.65 | 67.87 | 69.63 | 75.09 | 78.47 | 65.9 | 84 | 78.68 | 73.6 |
| meta-llama/Meta-Llama-3-8B | 69.9 | 57.47 | 59.7 | 68.89 | 74.72 | 78.47 | 61.85 | 83 | 70.22 | 74.8 |
| meta-llama/Meta-Llama-3-8B | 69.81 | 57.69 | 60.02 | 68.89 | 74.72 | 78.47 | 60.12 | 83 | 70.22 | 75.2 |
| unsloth/gemma-7b | 64.18 | 48.96 | 47.21 | 59.26 | 69.81 | 79.86 | 60.12 | 70 | 66.18 | 76.2 |
| mistralai/Mistral-7B-v0.1 | 62.85 | 48.2 | 50.82 | 55.56 | 68.68 | 68.06 | 59.54 | 71 | 68.38 | 75.4 |
| BioMistral/BioMistral-7B-Zephyr-Beta-SLeRP | 61.52 | 46.52 | 50.2 | 55.56 | 63.02 | 65.28 | 61.27 | 72 | 63.24 | 76.6 |
| BioMistral/BioMistral-7B-SLERP | 59.58 | 44.13 | 47.29 | 51.85 | 66.42 | 65.28 | 58.96 | 69 | 55.88 | 77.4 |
| BioMistral/BioMistral-7B-DARE | 59.45 | 44.66 | 47.37 | 53.33 | 66.42 | 62.5 | 58.96 | 68 | 56.25 | 77.6 |
| OpenModels4all/gemma-1-7b-it | 58.37 | 44.56 | 45.01 | 52.59 | 62.64 | 68.75 | 57.23 | 67 | 55.15 | 72.4 |
| medalpaca/medalpaca-7b | 58.03 | 37.51 | 41.71 | 57.04 | 57.36 | 65.28 | 54.34 | 69 | 67.28 | 72.8 |
| BioMistral/BioMistral-7B | 56.36 | 41.48 | 46.11 | 51.11 | 63.77 | 61.11 | 53.76 | 66 | 52.94 | 71 |
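
One way to produce comparable numbers is EleutherAI's lm-evaluation-harness; the sketch below assumes v0.4-style task names and a placeholder repository id, and its prompt formats and few-shot settings may not match those used for the table above.

```python
# Sketch of a benchmark-style evaluation with EleutherAI's lm-evaluation-harness (pip install lm-eval).
# Task names follow the v0.4 harness and may differ between versions; the repo id is a placeholder.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=jiviai/JiviMed-8B_v1,dtype=float16",  # hypothetical repo id
    tasks=[
        "medmcqa", "medqa_4options", "pubmedqa",
        "mmlu_anatomy", "mmlu_clinical_knowledge", "mmlu_college_biology",
        "mmlu_college_medicine", "mmlu_medical_genetics", "mmlu_professional_medicine",
    ],
    num_fewshot=0,
    batch_size=8,
)

# Print per-task metrics (e.g. accuracy) reported by the harness.
for task, metrics in results["results"].items():
    print(task, metrics)
```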

Figure: model accuracy comparison across the benchmarks above.

Hyperparameters (a PEFT configuration sketch follows the lists below):

PEFT

  • lora_r: 64
  • lora_alpha: 128
  • lora_dropout: 0.05
  • lora_target_linear: true

Target Modules

  • q_proj
  • v_proj
  • k_proj
  • o_proj
  • gate_proj
  • down_proj
  • up_proj
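
For reference, these settings map onto a peft.LoraConfig roughly as in the sketch below; the base checkpoint is the Meta-Llama-3-8B model named earlier, and the snippet is illustrative rather than the exact training code.

```python
# Mapping the listed PEFT hyperparameters onto a peft.LoraConfig (illustrative).
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "v_proj", "k_proj", "o_proj",
        "gate_proj", "down_proj", "up_proj",
    ],
    task_type="CAUSAL_LM",
)

base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B", torch_dtype="auto")
peft_model = get_peft_model(base, lora_config)
peft_model.print_trainable_parameters()  # only the LoRA adapters (a small fraction of the ~8B weights) are trainable
```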