Daniel Thompson committed on
Commit
fb340ba
1 Parent(s): 1e6e5e7

Update README.md

Files changed (1)
  1. README.md +14 -6
README.md CHANGED
@@ -84,22 +84,30 @@ If the length of the clinical text exceeds 512 tokens, you can use a sliding win
84
  You can view and run the full example on GitHub here:
85
  [Sliding Window Example Notebook](https://github.com/dannyt101/AAA_classification/blob/main/Stage_1/bio-clinicalBERT_vasc_class_demo.ipynb)
86
 
87
-
88
-
89
  ## Training and evaluation data
90
 
91
- More information needed
 
92
 
93
  ## Training procedure
 
94
 
95
  ### Training hyperparameters
96
 
97
  The following hyperparameters were used during training:
98
- - optimizer: None
99
- - training_precision: float32
 
 
 
 
 
100
 
101
- ### Training results
102
 
 
 
 
103
 
104
 
105
  ### Framework versions
 
84
  You can view and run the full example on GitHub here:
85
  [Sliding Window Example Notebook](https://github.com/dannyt101/AAA_classification/blob/main/Stage_1/bio-clinicalBERT_vasc_class_demo.ipynb)
86
 
 
 
87
  ## Training and evaluation data
88
 
89
+ Electronic health records (EHRs) were downloaded from the [MIMIC-IV clinical notes dataset](https://physionet.org/content/mimic-iv-note/2.2/).
90
+ Each EHR was annotated by a Vascular Surgery Specialist Registrar/Resident and categorized as ‘Vascular’ if the patient had an acute pathology relevant to vascular surgery during the admission, as defined by the [National Health Service (NHS) England Service Specifications for Vascular Services](https://www.google.com/url?sa=t&source=web&rct=j&opi=89978449&url=https://www.england.nhs.uk/wp-content/uploads/2017/06/specialised-vascular-services-service-specification-adults.pdf&ved=2ahUKEwiknoKus4uIAxUFwAIHHaaQCBcQFnoECBMQAQ&usg=AOvVaw3yRyS-Ei1fiTNi6dcP8yOL).
91
 
92
  ## Training procedure
93
+ Training was performed using TensorFlow's TPU strategy. The dataset was preprocessed using a sliding window approach to handle texts longer than 512 tokens.
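The sliding window preprocessing mentioned here can be sketched in plain Python. This is a minimal illustration, not the code from the linked notebook: the function name `sliding_windows` and the stride of 256 tokens are assumptions, and a real pipeline would also reserve room in each window for the tokenizer's special tokens.

```python
from typing import List


def sliding_windows(token_ids: List[int], max_len: int = 512, stride: int = 256) -> List[List[int]]:
    """Split a token sequence into overlapping windows of at most max_len tokens.

    stride is the step between window starts (an assumed value, not from the README).
    """
    if max_len <= 0 or stride <= 0:
        raise ValueError("max_len and stride must be positive")
    # Short texts fit in a single window and need no splitting.
    if len(token_ids) <= max_len:
        return [list(token_ids)]
    windows = []
    start = 0
    while True:
        windows.append(token_ids[start:start + max_len])
        # Stop once the current window reaches the end of the sequence.
        if start + max_len >= len(token_ids):
            break
        start += stride
    return windows


# A hypothetical 1,200-token note, represented by dummy token ids.
chunks = sliding_windows(list(range(1200)))
print([len(c) for c in chunks])  # [512, 512, 512, 432]
```

Each window would then be classified separately, and the per-window predictions combined into one label for the note; how they are combined is up to the implementation.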
94
 
95
  ### Training hyperparameters
96
 
97
  The following hyperparameters were used during training:
98
+ - **Optimizer**: Adam
99
+ - **Learning Rate**: 5e-5
100
+ - **Batch Size**: 16
101
+ - **Epochs**: Maximum of 5
102
+ - **Early Stopping**: Triggered if validation loss did not improve for 2 consecutive epochs
103
+
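To make the interaction between the epoch cap and early stopping concrete, here is a small framework-free sketch. The helper `epochs_run` and the validation losses are hypothetical and only illustrate the rule as stated above (at most 5 epochs, stop after 2 consecutive epochs without improvement); the actual training presumably relied on the framework's own callback.

```python
from typing import List


def epochs_run(val_losses: List[float], max_epochs: int = 5, patience: int = 2) -> int:
    """Return how many epochs run before training stops.

    Training stops at max_epochs, or earlier once the validation loss
    has failed to improve for `patience` consecutive epochs.
    """
    best = float("inf")
    stale = 0  # consecutive epochs without improvement
    epochs = 0
    for loss in val_losses[:max_epochs]:
        epochs += 1
        if loss < best:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                break
    return epochs


# Hypothetical validation losses: no improvement in epochs 3 and 4.
print(epochs_run([0.40, 0.31, 0.33, 0.35, 0.30]))  # 4
```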
104
+ ### Training results
105
 
106
+ The Bio-clinicalBERT model achieved the following results on the validation set:
107
 
108
+ | Model | Accuracy | Precision (Vascular) | Recall (Vascular) | F1-Score (Vascular) | Precision (Non-Vascular) | Recall (Non-Vascular) | F1-Score (Non-Vascular) |
109
+ |--------------------|----------|----------------------|-------------------|---------------------|--------------------------|-----------------------|-------------------------|
110
+ | **Bio-clinicalBERT** | 0.94 | 0.88 | 0.70 | 0.78 | 0.95 | 0.98 | 0.96 |
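As a quick consistency check, the F1 scores in the table can be recomputed from the reported precision and recall using the harmonic mean. The helper name `f1_score` is just for illustration, and because the inputs are already rounded to two decimals the recomputation is only approximate.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Vascular class: precision 0.88, recall 0.70
print(round(f1_score(0.88, 0.70), 2))  # 0.78
# Non-Vascular class: precision 0.95, recall 0.98
print(round(f1_score(0.95, 0.98), 2))  # 0.96
```

Both values agree with the table. The lower recall for the Vascular class (0.70) is what pulls its F1 below that of the Non-Vascular class.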
111
 
112
 
113
  ### Framework versions