
Model Card for Asclepius-Llama3-8B

This is the official model checkpoint for Asclepius-Llama3-8B (arXiv). This model is an enhanced version of Asclepius-7B, created by replacing the base model with Llama-3 and increasing the maximum sequence length to 8192.

UPDATE

2024.01.10

  • Asclepius-R, a variant of Asclepius trained on MIMIC-III discharge summaries, is now available on PhysioNet!

Model Details

Model Description

  • Model type: Clinical LLM (Large Language Model)
  • Language(s) (NLP): English
  • License: CC-BY-NC-SA 4.0
  • Fine-tuned from model: Llama3-8B

Uses

This model can perform the eight clinical NLP tasks below on clinical notes.

  • Named Entity Recognition
  • Abbreviation Expansion
  • Relation Extraction
  • Temporal Information Extraction
  • Coreference Resolution
  • Paraphrasing
  • Summarization
  • Question Answering
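
For illustration, instructions for these tasks might look like the following. The phrasings are hypothetical examples, not official prompts from the Asclepius paper:

```python
# Hypothetical example instructions for each supported task
# (illustrative phrasings only; not official prompts from the Asclepius paper).
EXAMPLE_INSTRUCTIONS = {
    "Named Entity Recognition": "Identify all medication names mentioned in the discharge summary.",
    "Abbreviation Expansion": "Expand the abbreviation 'CHF' as used in this note.",
    "Relation Extraction": "What is the relation between the patient's hypertension and the prescribed lisinopril?",
    "Temporal Information Extraction": "When was the patient's appendectomy performed?",
    "Coreference Resolution": "Who does 'she' refer to in the second paragraph?",
    "Paraphrasing": "Rewrite the hospital course section in plain language.",
    "Summarization": "Summarize this discharge summary in three sentences.",
    "Question Answering": "What is the patient's primary diagnosis?",
}

for task, instruction in EXAMPLE_INSTRUCTIONS.items():
    print(f"{task}: {instruction}")
```

Each instruction would be inserted into the `{question}` slot of the prompt template shown in the "How to Get Started" section.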

Out-of-Scope Use

ONLY USE THIS MODEL FOR RESEARCH PURPOSES!

How to Get Started with the Model

prompt = """You are an intelligent clinical languge model.
Below is a snippet of patient's discharge summary and a following instruction from healthcare professional.
Write a response that appropriately completes the instruction.
The response should provide the accurate answer to the instruction, while being concise.

[Discharge Summary Begin]
{note}
[Discharge Summary End]

[Instruction Begin]
{question}
[Instruction End] 
"""

from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("starmpcc/Asclepius-Llama3-8B", use_fast=False)
model = AutoModelForCausalLM.from_pretrained("starmpcc/Asclepius-Llama3-8B")

note = "This is a sample note"
question = "What is the diagnosis?"

model_input = prompt.format(note=note, question=question)
input_ids = tokenizer(model_input, return_tensors="pt").input_ids
output = model.generate(input_ids)
print(tokenizer.decode(output[0]))
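
Note that `generate()` echoes the prompt tokens in its output, so the decode above prints the prompt followed by the answer. A minimal sketch of slicing off the prompt to keep only the newly generated tokens (with mocked tensors, so it runs without downloading the 8B checkpoint):

```python
import torch

# Mocked stand-ins for tokenizer(...).input_ids and the model.generate(...)
# result, so the slicing logic is self-contained and runs without the model.
input_ids = torch.tensor([[101, 102, 103]])        # tokenized prompt
output = torch.tensor([[101, 102, 103, 7, 8, 9]])  # generate() returns prompt + continuation

# Keep only the tokens generated after the prompt.
answer_ids = output[0][input_ids.shape[-1]:]
print(answer_ids.tolist())  # [7, 8, 9]
```

With the real model, `tokenizer.decode(answer_ids, skip_special_tokens=True)` would then yield just the response text.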

Training Details

Training Data

https://huggingface.co/datasets/starmpcc/Asclepius-Synthetic-Clinical-Notes

Training Procedure

  • Initial training was conducted using causal language modeling on synthetic clinical notes.
  • It was then fine-tuned with clinical instruction-response pairs.
  • For a comprehensive overview of our methods, see the paper cited below.
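
The instruction stage above can be sketched as follows. This is a hedged illustration: the abbreviated template, helper name, and loss handling are assumptions, not the authors' exact fine-tuning code:

```python
# Hypothetical sketch: serialize one instruction-response pair into a single
# causal-LM training string. The abbreviated template and helper below are
# illustrative assumptions, not the authors' actual pipeline.
TEMPLATE = (
    "[Discharge Summary Begin]\n{note}\n[Discharge Summary End]\n\n"
    "[Instruction Begin]\n{question}\n[Instruction End]\n"
)

def build_training_example(note: str, question: str, response: str) -> str:
    # In practice the loss is usually masked so that only response tokens are
    # supervised; here we simply concatenate prompt and target.
    return TEMPLATE.format(note=note, question=question) + response

example = build_training_example(
    note="Pt admitted with chest pain; troponin elevated.",
    question="What is the likely diagnosis?",
    response="Myocardial infarction.",
)
print(example)
```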

Training Hyperparameters

Speeds, Sizes, Times

  • Pre-Training (1 epoch): 2h 59m on 4x A100 80GB
  • Instruction Fine-Tuning (3 epochs): 30h 41m on 4x A100 80GB

Citation

BibTeX:

@article{kweon2023publicly,
  title={Publicly Shareable Clinical Large Language Model Built on Synthetic Clinical Notes},
  author={Kweon, Sunjun and Kim, Junu and Kim, Jiyoun and Im, Sujeong and Cho, Eunbyeol and Bae, Seongsu and Oh, Jungwoo and Lee, Gyubok and Moon, Jong Hak and You, Seng Chan and others},
  journal={arXiv preprint arXiv:2309.00237},
  year={2023}
}
Model size: 8.03B parameters (Safetensors, F32)