Model Card for BiMediX-Bilingual

Model Details

  • Name: BiMediX
  • Version: 1.0
  • Type: Bilingual Medical Mixture of Experts Large Language Model (LLM)
  • Languages: English, Arabic
  • Model Architecture: Mixtral-8x7B-Instruct-v0.1
  • Training Data: BiMed1.3M, a bilingual dataset with diverse medical interactions.

Intended Use

  • Primary Use: Medical interactions in both English and Arabic.
  • Capabilities: Multiple-choice question answering (MCQA), closed question answering, and chat.

Getting Started

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model weights from the Hugging Face Hub.
model_id = "BiMediX/BiMediX-Bi"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Tokenize a sample user message and generate up to 500 new tokens.
text = "Hello BiMediX! I've been experiencing increased tiredness in the past week."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=500)
# Decode the generated token IDs back into readable text.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
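
A model of this size can need a large amount of memory just to hold its weights, so it may help to estimate the footprint before loading. The sketch below is a rough back-of-the-envelope calculation only. It assumes a total of about 46.7 billion parameters, the commonly cited figure for Mixtral-8x7B rather than a number stated on this card, and it ignores activations, the KV cache, and framework overhead. `estimate_weight_gib` is a hypothetical helper, not part of any library.

```python
# Rough weight-memory estimate; ignores activations, KV cache, and overhead.
# ASSUMPTION: ~46.7e9 total parameters (commonly cited for Mixtral-8x7B,
# not a figure stated on this model card).
ASSUMED_PARAM_COUNT = 46.7e9

# Storage cost per parameter, in bytes, for a few common precisions.
BYTES_PER_PARAM = {"float32": 4.0, "bfloat16": 2.0, "int8": 1.0, "4-bit": 0.5}


def estimate_weight_gib(param_count: float, dtype: str) -> float:
    """Return the approximate size of the weights alone, in GiB."""
    return param_count * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        size = estimate_weight_gib(ASSUMED_PARAM_COUNT, dtype)
        print(f"{dtype:>8}: ~{size:.0f} GiB")
```

If the estimate exceeds the available memory, a reduced-precision `torch_dtype` (for example `torch.bfloat16`) passed to `from_pretrained` is one common option; whether it suits this checkpoint is not confirmed by the card.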

Training Procedure

  • Dataset: BiMed1.3M, comprising 632 million healthcare-specialized tokens.
  • QLoRA Adaptation: Adds learnable low-rank adapter weights to the experts and the routing network, so that only about 4% of the original parameters are trained.
  • Training Resources: The model was trained on approximately 632 million tokens from the Arabic-English corpus, of which 288 million are exclusively English.
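
To illustrate why low-rank adapters leave only a small share of the weights trainable, here is a minimal sketch of the parameter arithmetic for a single linear layer. The layer shape and rank are hypothetical values chosen for illustration; the card does not state the adapter rank or the exact projections that were adapted. Because it counts one layer in isolation, the resulting ratio is not expected to match the roughly 4% figure quoted above.

```python
def lora_param_counts(d_in: int, d_out: int, rank: int) -> tuple:
    """Return (frozen, trainable) parameter counts for one LoRA-adapted linear layer.

    The frozen weight W has shape (d_out, d_in). LoRA adds a product B @ A,
    with A of shape (rank, d_in) and B of shape (d_out, rank); only A and B
    receive gradient updates.
    """
    frozen = d_in * d_out
    trainable = rank * (d_in + d_out)
    return frozen, trainable


# HYPOTHETICAL layer shape and rank, for illustration only.
frozen, trainable = lora_param_counts(d_in=4096, d_out=14336, rank=64)
ratio = trainable / frozen
print(f"frozen={frozen:,} trainable={trainable:,} ratio={ratio:.2%}")
```

The trainable count grows linearly with the rank while the frozen count does not change, which is why small ranks keep the trainable fraction low.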

Model Performance

  • Benchmarks: Outperforms the Mixtral-8x7B baseline and Jais-30B on every benchmark listed below, with an average score of 65.4 versus 55.0 and 50.6, respectively.
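
| Model | CKG | CBio | CMed | MedGen | ProMed | Ana | MedMCQA | MedQA | PubmedQA | AVG |
|---|---|---|---|---|---|---|---|---|---|---|
| Jais-30B | 57.4 | 55.2 | 46.2 | 55.0 | 46.0 | 48.9 | 40.2 | 31.0 | 75.5 | 50.6 |
| Mixtral-8x7B | 59.1 | 57.6 | 52.6 | 59.5 | 53.3 | 54.4 | 43.2 | 40.6 | 74.7 | 55.0 |
| BiMediX (Bilingual) | 70.6 | 72.2 | 59.3 | 74.0 | 64.2 | 59.6 | 55.8 | 54.0 | 78.6 | 65.4 |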
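
The AVG column can be checked against the per-benchmark scores. The sketch below recomputes it as an unweighted mean of the nine scores copied from the table above. Treating AVG as a simple mean is an assumption, although it agrees with the reported values to one decimal place.

```python
# Scores copied from the table above, in column order CKG ... PubmedQA.
SCORES = {
    "Jais-30B": [57.4, 55.2, 46.2, 55.0, 46.0, 48.9, 40.2, 31.0, 75.5],
    "Mixtral-8x7B": [59.1, 57.6, 52.6, 59.5, 53.3, 54.4, 43.2, 40.6, 74.7],
    "BiMediX (Bilingual)": [70.6, 72.2, 59.3, 74.0, 64.2, 59.6, 55.8, 54.0, 78.6],
}


def mean_score(values):
    """Unweighted arithmetic mean of a list of benchmark scores."""
    return sum(values) / len(values)


# Round to one decimal place to match the precision used in the table.
averages = {name: round(mean_score(vals), 1) for name, vals in SCORES.items()}

for name, avg in averages.items():
    print(f"{name}: {avg}")
```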

Safety and Ethical Considerations

  • Potential issues: hallucinations, toxicity, stereotypes.
  • Usage: Research purposes only.

Accessibility

Authors

Sara Pieri, Sahal Shaji Mullappilly, Fahad Shahbaz Khan, Rao Muhammad Anwer, Salman Khan, Timothy Baldwin, Hisham Cholakkal
Mohamed Bin Zayed University of Artificial Intelligence (MBZUAI)
