---
tags:
- spacy
- arxiv:2408.06930
- medical
language:
- nl
license: gpl-3.0
model-index:
- name: Echocardiogram_Multimodel_reduced
  results:
  - task:
      type: text-classification
    dataset:
      type: test
      name: internal test set
    metrics:
    - name: Macro f1
      type: f1
      value: 0.946
      verified: false
    - name: Macro precision
      type: precision
      value: 0.946
      verified: false
    - name: Macro recall
      type: recall
      value: 0.945
      verified: false
pipeline_tag: text-classification
metrics:
- f1
- precision
- recall
---
# Description
This model is a [MedRoBERTa.nl](https://huggingface.co/CLTL/MedRoBERTa.nl) model fine-tuned on Dutch echocardiogram reports sourced from electronic health records.
The publication associated with the span classification task can be found at https://arxiv.org/abs/2408.06930.
The config file for training the model can be found at https://github.com/umcu/echolabeler.
# Minimum working example
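```python
from transformers import pipeline

# Load the fine-tuned document classifier from the Hugging Face Hub
le_pipe = pipeline("text-classification", model="UMCU/Echocardiogram_Multimodel_reduced")

# Replace the placeholder with the text of a Dutch echocardiogram report
document = "Lorem ipsum"
results = le_pipe(document)
```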
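A `text-classification` pipeline returns a list with one dictionary per input, each holding a `label` and a `score`. The sketch below shows one possible way to turn that output into a boolean flag. The helper name `has_pathology`, the 0.5 threshold, and the hand-written `example_output` are illustrative assumptions and not real model output; the label strings are assumed to match the reduced label scheme described in this card.

```python
from typing import Dict, List, Union

# Assumed to match the label names exposed by the model configuration
PRESENT_LABEL = "Present"


def has_pathology(
    results: List[Dict[str, Union[str, float]]], threshold: float = 0.5
) -> bool:
    """Return True if the top-scoring prediction is `Present` with score >= threshold."""
    if not results:
        return False
    # Pick the prediction with the highest score
    top = max(results, key=lambda r: r["score"])
    return top["label"] == PRESENT_LABEL and top["score"] >= threshold


# Hand-written stand-in for `le_pipe(document)`; not real model output
example_output = [{"label": "Present", "score": 0.97}]
print(has_pathology(example_output))
```

The threshold is a free parameter here; a suitable value depends on the precision and recall trade-off needed for the application.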
# Label Scheme
<details>
<summary>View label scheme</summary>
| Component | Labels |
| --- | --- |
| **`bespoke`** | `pe_Present`, `rv_dil_Present`, `wma_Present`, `lv_dil_Present`, `aortic_valve_native_stenosis_Present`, `mitral_valve_native_regurgitation_Present`, `lv_sys_func_Present`, `rv_sys_func_Present`, `aortic_valve_native_regurgitation_Present`, `lv_dias_func_Present`, `Normal_or_No_Label`, `tricuspid_valve_native_regurgitation_Present` |
| **`reduced`** | `Normal_or_No_Label`, `Present` |
</details>
For the reduced labels, `Present` means that *at least one* of the pathologies has a positive result.
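The relation between the two label sets can be sketched as follows. This is a minimal illustration of the rule stated above, not the preprocessing used by the authors; the function name `reduce_labels` is hypothetical.

```python
from typing import Iterable

NORMAL = "Normal_or_No_Label"
PRESENT = "Present"


def reduce_labels(bespoke_labels: Iterable[str]) -> str:
    """Collapse a set of bespoke labels into a single reduced label.

    Any bespoke label ending in `_Present` makes the reduced label `Present`;
    otherwise the document is labelled `Normal_or_No_Label`.
    """
    if any(label.endswith("_Present") for label in bespoke_labels):
        return PRESENT
    return NORMAL


print(reduce_labels(["pe_Present", "wma_Present"]))
print(reduce_labels(["Normal_or_No_Label"]))
```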
The annotation names correspond to the following pathologies:
<details>
<summary>View pathologies</summary>
| Annotation | Pathology |
| --- | --- |
| pe | Pericardial Effusion |
| wma | Wall Motion Abnormality |
| lv_dil | Left Ventricle Dilation |
| rv_dil | Right Ventricle Dilation |
| lv_syst_func | Left Ventricle Systolic Dysfunction |
| rv_syst_func | Right Ventricle Systolic Dysfunction |
| lv_dias_func | Diastolic Dysfunction |
| aortic_valve_native_stenosis | Aortic Stenosis |
| mitral_valve_native_regurgitation | Mitral valve regurgitation |
| tricuspid_valve_native_regurgitation | Tricuspid regurgitation |
| aortic_valve_native_regurgitation | Aortic Regurgitation |
</details>
Note: `lv_dias_func` should have been `dias_func`.
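For readers who want to translate a bespoke label into a readable pathology name, a lookup along the following lines may help. It is a sketch built only from the two tables in this card; the names `PATHOLOGIES`, `ALIASES`, and `describe` are hypothetical. The label scheme spells the systolic function labels `lv_sys_func` and `rv_sys_func`, whereas the pathology table uses `lv_syst_func` and `rv_syst_func`, so the sketch accepts both spellings.

```python
from typing import Optional

# Annotation name -> pathology, copied from the pathology table in this card
PATHOLOGIES = {
    "pe": "Pericardial Effusion",
    "wma": "Wall Motion Abnormality",
    "lv_dil": "Left Ventricle Dilation",
    "rv_dil": "Right Ventricle Dilation",
    "lv_syst_func": "Left Ventricle Systolic Dysfunction",
    "rv_syst_func": "Right Ventricle Systolic Dysfunction",
    "lv_dias_func": "Diastolic Dysfunction",
    "aortic_valve_native_stenosis": "Aortic Stenosis",
    "mitral_valve_native_regurgitation": "Mitral valve regurgitation",
    "tricuspid_valve_native_regurgitation": "Tricuspid regurgitation",
    "aortic_valve_native_regurgitation": "Aortic Regurgitation",
}

# Spellings used in the label scheme that differ from the pathology table
ALIASES = {"lv_sys_func": "lv_syst_func", "rv_sys_func": "rv_syst_func"}

SUFFIX = "_Present"


def describe(label: str) -> Optional[str]:
    """Return the pathology name for a `<annotation>_Present` label, else None."""
    if not label.endswith(SUFFIX):
        return None
    key = label[: -len(SUFFIX)]
    key = ALIASES.get(key, key)
    return PATHOLOGIES.get(key)


print(describe("pe_Present"))
```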
# Intended use
The model is developed for *document* classification of Dutch clinical echocardiogram reports.
Since it is a domain-specific model trained on medical data, it is **only** meant to be used on medical NLP tasks for *Dutch echocardiogram reports*.
# Data
The model was trained on approximately 4,000 manually annotated echocardiogram reports from the University Medical Centre Utrecht.
The training data was anonymized before starting the training procedure.
| Feature | Description |
| --- | --- |
| **Name** | `Echocardiogram_Multimodel_reduced` |
| **Version** | `1.0.0` |
| **transformers** | `>=4.40.0` |
| **Default Pipeline** | `pipeline`, `text-classification` |
| **Components** | `RobertaForSequenceClassification` |
| **License** | `cc-by-sa-4.0` |
| **Author** | Bram van Es |
# Contact
If you run into problems with this model, please open an issue on our GitHub repository: https://github.com/umcu/echolabeler/issues
# Usage
If you use the model in your work, please cite the following reference: https://doi.org/10.48550/arXiv.2408.06930
# References
Paper: Bauke Arends, Melle Vessies, Dirk van Osch, Arco Teske, Pim van der Harst, René van Es, Bram van Es (2024). Diagnosis extraction from unstructured Dutch echocardiogram reports using span- and document-level characteristic classification. arXiv:2408.06930. https://arxiv.org/abs/2408.06930