Update README.md
README.md

---
license: cc-by-nc-4.0
datasets:
- starmpcc/Asclepius-Synthetic-Clinical-Notes
language:
- en
pipeline_tag: text2text-generation
tags:
- medical
---

# Model Card for Model ID

<!-- Provide a quick summary of what the model is/does. -->

This is the official model checkpoint for Asclepius-13B ([arxiv](todo)).
This model is the first publicly shareable clinical LLM, trained with synthetic data.

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

- **Model type:** Clinical LLM (Large Language Model)
- **Language(s) (NLP):** English
- **License:** CC-BY-NC-SA 4.0
- **Finetuned from model [optional]:** LLaMA-13B

### Model Sources [optional]

<!-- Provide the basic links for the model. -->

- **Repository:** https://github.com/starmpcc/Asclepius
- **Paper [optional]:** TODO Arxiv
- **Data:** https://huggingface.co/datasets/starmpcc/Asclepius-Synthetic-Clinical-Notes

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

This model can perform the following 8 clinical NLP tasks on clinical notes (an example instruction for each task is sketched after this list):

- Named Entity Recognition
- Abbreviation Expansion
- Relation Extraction
- Temporal Information Extraction
- Coreference Resolution
- Paraphrasing
- Summarization
- Question Answering
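
For a concrete sense of what these tasks look like, the snippet below pairs each task with a hypothetical instruction. These phrasings are illustrative assumptions, not examples drawn from the training data.

```python
# Hypothetical example instructions, one per supported task.
# These are illustrative assumptions, not samples from the Asclepius data.
example_instructions = {
    "Named Entity Recognition": "List all medications mentioned in the discharge summary.",
    "Abbreviation Expansion": "Expand the abbreviation 'CHF' as used in this note.",
    "Relation Extraction": "Which medication was prescribed for the patient's hypertension?",
    "Temporal Information Extraction": "When was the patient admitted and discharged?",
    "Coreference Resolution": "Who does 'she' refer to in the second paragraph?",
    "Paraphrasing": "Rewrite the hospital course section in plain language.",
    "Summarization": "Summarize the discharge summary in two sentences.",
    "Question Answering": "What is the patient's primary diagnosis?",
}
```

Each instruction is answered against an accompanying note via the prompt template shown in the "How to Get Started with the Model" section below.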

### Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

[More Information Needed]

### Downstream Use [optional]

<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

[More Information Needed]

### Out-of-Scope Use

<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

ONLY USE THIS MODEL FOR RESEARCH PURPOSES!!

## How to Get Started with the Model

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Prompt template: a discharge summary and an instruction are filled into
# the placeholders below before generation.
prompt = """You are an intelligent clinical language model.
Below is a snippet of a patient's discharge summary and a following instruction from a healthcare professional.
Write a response that appropriately completes the instruction.
The response should provide the accurate answer to the instruction, while being concise.

[Discharge Summary Begin]
{note}
[Discharge Summary End]

[Instruction Begin]
{question}
[Instruction End]
"""

tokenizer = AutoTokenizer.from_pretrained("starmpcc/Asclepius-13B")
# AutoModelForCausalLM (rather than AutoModel) loads the LM head, which
# .generate() needs.
model = AutoModelForCausalLM.from_pretrained("starmpcc/Asclepius-13B")

note = "This is a sample note"
question = "What is the diagnosis?"

model_input = prompt.format(note=note, question=question)
input_ids = tokenizer(model_input, return_tensors="pt").input_ids
output = model.generate(input_ids, max_new_tokens=256)  # cap the response length
print(tokenizer.decode(output[0]))
```
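
Note that a 13B model is unlikely to fit on a single GPU in full precision. A hedged variant is sketched below; `torch_dtype` and `device_map` are standard `transformers` options, and `device_map="auto"` additionally requires the `accelerate` package.

```python
import torch
from transformers import AutoModelForCausalLM

# Load the weights in half precision and let accelerate place them on the
# available device(s); this roughly halves the memory footprint.
model = AutoModelForCausalLM.from_pretrained(
    "starmpcc/Asclepius-13B",
    torch_dtype=torch.float16,
    device_map="auto",
)
```

When the model lives on a GPU, move the inputs there as well (e.g. `input_ids.to(model.device)`) before calling `generate`.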

## Training Details

### Training Data

<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

https://huggingface.co/datasets/starmpcc/Asclepius-Synthetic-Clinical-Notes
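
The dataset can be inspected directly with the `datasets` library. A brief sketch follows; rather than assuming the split and column layout, print the dataset object to see them.

```python
from datasets import load_dataset

# Download the synthetic clinical notes dataset from the Hugging Face Hub.
ds = load_dataset("starmpcc/Asclepius-Synthetic-Clinical-Notes")
print(ds)  # shows the available splits, column names, and row counts
```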

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

- Initial training was conducted using causal language modeling on synthetic clinical notes.
- It was then fine-tuned with clinical instruction-response pairs (a minimal sketch of this two-stage recipe follows below).
- For a comprehensive overview of our methods, please refer to our upcoming paper.
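
To make the two stages concrete, here is a minimal sketch using the Hugging Face `Trainer`. It is an illustration under stated assumptions, not the authors' training script (that lives in the GitHub repository): the base-model path is a placeholder, and the `train` split and `note` column names are assumptions.

```python
# Minimal sketch of the two-stage recipe, NOT the authors' actual training
# code (see https://github.com/starmpcc/Asclepius for that).
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base = "path/to/llama-13b"  # placeholder: local LLaMA-13B weights
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(base)

# Stage 1: causal language modeling on the synthetic notes.
# "note" is an assumed column name; check the dataset schema first.
notes = load_dataset("starmpcc/Asclepius-Synthetic-Clinical-Notes", split="train")
tokenized = notes.map(
    lambda batch: tokenizer(batch["note"], truncation=True, max_length=2048),
    batched=True,
    remove_columns=notes.column_names,
)
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="stage1-clm", num_train_epochs=1),
    train_dataset=tokenized,
    # mlm=False makes the collator pad batches and copy input_ids to labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

# Stage 2 (analogous, not shown): fine-tune the stage-1 checkpoint on
# instruction-response pairs rendered through the prompt template above.
```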

#### Training Hyperparameters

- We followed the configuration used in [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca).

#### Speeds, Sizes, Times [optional]

<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

- Pre-Training (1 epoch): 1h 52m with 8x A100 80G
- Instruction Fine-Tuning (3 epochs): 12h 16m with 8x A100 80G

## Citation [optional]

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]