---
license: cc-by-nc-4.0
datasets:
- starmpcc/Asclepius-Synthetic-Clinical-Notes
language:
- en
pipeline_tag: text2text-generation
tags:
- medical
---

# Model Card for Asclepius-13B

This is the official model checkpoint for Asclepius-13B ([arxiv](todo)).
This model is the first publicly shareable clinical LLM, trained with synthetic data.

## Model Details

### Model Description

- **Model type:** Clinical LLM (Large Language Model)
- **Language(s) (NLP):** English
- **License:** CC-BY-NC-SA 4.0
- **Finetuned from model:** LLaMA-13B

### Model Sources

- **Repository:** https://github.com/starmpcc/Asclepius
- **Paper:** TODO Arxiv
- **Data:** https://huggingface.co/datasets/starmpcc/Asclepius-Synthetic-Clinical-Notes

## Uses

This model can perform the following eight clinical NLP tasks on clinical notes (hypothetical instruction examples follow the list):

- Named Entity Recognition
- Abbreviation Expansion
- Relation Extraction
- Temporal Information Extraction
- Coreference Resolution
- Paraphrasing
- Summarization
- Question Answering
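
Each task is phrased as a free-text instruction inside the prompt template shown in the "How to Get Started" section below. The phrasings here are hypothetical illustrations, not examples from the training data:

```python
# Hypothetical instruction phrasings, one per task (illustrative only,
# not drawn from the Asclepius training set)
example_instructions = {
    "Named Entity Recognition": "List all medications mentioned in this note.",
    "Abbreviation Expansion": "What does the abbreviation 'CHF' stand for in this note?",
    "Relation Extraction": "Which medication was prescribed for the patient's hypertension?",
    "Temporal Information Extraction": "When was the patient admitted and discharged?",
    "Coreference Resolution": "Who does 'she' refer to in the final paragraph?",
    "Paraphrasing": "Rewrite the hospital course section in plain language.",
    "Summarization": "Summarize this discharge summary in three sentences.",
    "Question Answering": "What is the patient's primary diagnosis?",
}
```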

### Direct Use

[More Information Needed]

### Downstream Use

[More Information Needed]

### Out-of-Scope Use

ONLY USE THIS MODEL FOR RESEARCH PURPOSES!

## How to Get Started with the Model

```python
prompt = """You are an intelligent clinical language model.
Below is a snippet of patient's discharge summary and a following instruction from healthcare professional.
Write a response that appropriately completes the instruction.
The response should provide the accurate answer to the instruction, while being concise.

[Discharge Summary Begin]
{note}
[Discharge Summary End]

[Instruction Begin]
{question}
[Instruction End]
"""

from transformers import AutoTokenizer, AutoModelForCausalLM

# AutoModelForCausalLM (rather than the bare AutoModel) is required for .generate()
tokenizer = AutoTokenizer.from_pretrained("starmpcc/Asclepius-13B")
model = AutoModelForCausalLM.from_pretrained("starmpcc/Asclepius-13B")

note = "This is a sample note"
question = "What is the diagnosis?"

model_input = prompt.format(note=note, question=question)
input_ids = tokenizer(model_input, return_tensors="pt").input_ids
output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
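
For a 13B checkpoint, loading in half precision on GPU is usually preferable. A minimal sketch, assuming `torch` and `accelerate` are installed and reusing `prompt`, `note`, and `question` from the block above:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("starmpcc/Asclepius-13B")
# float16 halves memory use; device_map="auto" (provided by accelerate)
# spreads the weights across the available GPUs
model = AutoModelForCausalLM.from_pretrained(
    "starmpcc/Asclepius-13B",
    torch_dtype=torch.float16,
    device_map="auto",
)

model_input = prompt.format(note=note, question=question)
input_ids = tokenizer(model_input, return_tensors="pt").input_ids.to(model.device)
output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```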

## Training Details

### Training Data

https://huggingface.co/datasets/starmpcc/Asclepius-Synthetic-Clinical-Notes
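
The dataset can be inspected with the `datasets` library (split and column names are whatever the dataset card defines; this simply prints them):

```python
from datasets import load_dataset

# Downloads the synthetic clinical notes dataset and shows its splits and columns
ds = load_dataset("starmpcc/Asclepius-Synthetic-Clinical-Notes")
print(ds)
```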

### Training Procedure

- Initial training was conducted using causal language modeling on synthetic clinical notes.
- The model was then fine-tuned on clinical instruction-response pairs (a sketch of the implied loss masking follows this list).
- For a comprehensive overview of our methods, please refer to our upcoming paper.
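
Since the hyperparameters follow Stanford Alpaca (next subsection), the instruction fine-tuning stage presumably uses Alpaca-style supervised fine-tuning, where prompt tokens are excluded from the loss. A minimal sketch under that assumption (`build_example` is a hypothetical helper, not from the Asclepius codebase):

```python
import torch

IGNORE_INDEX = -100  # transformers' cross-entropy loss ignores labels set to -100

def build_example(tokenizer, prompt_text, response_text, max_len=2048):
    """Tokenize one instruction-response pair, masking the prompt out of the loss."""
    prompt_ids = tokenizer(prompt_text).input_ids
    full_ids = tokenizer(prompt_text + response_text + tokenizer.eos_token).input_ids[:max_len]
    labels = list(full_ids)
    # Only the response tokens contribute to the training loss
    labels[: len(prompt_ids)] = [IGNORE_INDEX] * len(prompt_ids)
    return {
        "input_ids": torch.tensor(full_ids),
        "labels": torch.tensor(labels),
    }
```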

#### Training Hyperparameters

- We followed the configuration used in [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca).
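
For orientation, the reference fine-tuning hyperparameters published in the Alpaca README (their 7B run; listed as an illustration, not necessarily the exact Asclepius-13B settings) are roughly:

```python
# Reference values from the Stanford Alpaca README (7B fine-tuning run);
# consult that repository for the authoritative configuration.
alpaca_reference_config = {
    "num_train_epochs": 3,
    "per_device_train_batch_size": 4,
    "gradient_accumulation_steps": 8,  # effective batch size 128 on 4 GPUs
    "learning_rate": 2e-5,
    "weight_decay": 0.0,
    "warmup_ratio": 0.03,
    "lr_scheduler_type": "cosine",
    "bf16": True,
}
```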

#### Speeds, Sizes, Times

- Pre-Training (1 epoch): 1h 52m on 8x A100 80GB
- Instruction Fine-Tuning (3 epochs): 12h 16m on 8x A100 80GB

## Citation

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]