tags:
- spacy
- arxiv:2408.06930
- medical
language:
- nl
license: gpl-3.0
model-index:
- name: Echocardiogram_Multimodel_reduced
  results:
  - task:
      type: text-classification
    dataset:
      type: test
      name: internal test set
    metrics:
    - name: Macro f1
      type: f1
      value: 0.946
      verified: false
    - name: Macro precision
      type: precision
      value: 0.946
      verified: false
    - name: Macro recall
      type: recall
      value: 0.945
      verified: false
pipeline_tag: text-classification
metrics:
- f1
- precision
- recall
Description
This model is a MedRoBERTa.nl model fine-tuned on Dutch echocardiogram reports sourced from Electronic Health Records. The publication associated with the span classification task is available at https://arxiv.org/abs/2408.06930. The config file used to train the model is available at https://github.com/umcu/echolabeler.
Minimum working example
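from transformers import pipeline
# Load the fine-tuned classifier from the Hugging Face Hub
le_pipe = pipeline(model="UMCU/Echocardiogram_Multimodel_reduced")
# Placeholder text; replace with a Dutch echocardiogram report
document = "Lorem ipsum"
results = le_pipe(document)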
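For a single input string, a `transformers` text-classification pipeline returns a list of dictionaries with `label` and `score` keys. The following is a minimal sketch of post-processing such a result, assuming the model emits the reduced label names listed under Label Scheme. The helper `is_present`, the threshold of 0.5, and the sample scores are hypothetical and do not represent real model output.

```python
from typing import Dict, List, Union

Result = Dict[str, Union[str, float]]


def is_present(results: List[Result], threshold: float = 0.5) -> bool:
    """Return True if the top-scoring label is "Present" with score >= threshold."""
    # Pick the entry with the highest score, in case several entries are returned.
    top = max(results, key=lambda r: r["score"])
    return top["label"] == "Present" and top["score"] >= threshold


# Hand-written stand-in for `results = le_pipe(document)`; illustrative values only.
example_results = [{"label": "Present", "score": 0.97}]
flag = is_present(example_results)
```

The threshold is a design choice for the downstream application; a stricter value trades recall for precision.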
Label Scheme
Component | Labels |
---|---|
bespoke | pe_Present, rv_dil_Present, wma_Present, lv_dil_Present, aortic_valve_native_stenosis_Present, mitral_valve_native_regurgitation_Present, lv_sys_func_Present, rv_sys_func_Present, aortic_valve_native_regurgitation_Present, lv_dias_func_Present, Normal_or_No_Label, tricuspid_valve_native_regurgitation_Present |
reduced | Normal_or_No_Label, Present |
In the reduced label scheme, Present means that at least one of the pathologies has a positive result.
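The relationship between the bespoke and reduced schemes can be sketched as follows. This is an illustration of the rule stated above, not the authors' training code; the function name `reduce_labels` is hypothetical, and it assumes that every positive bespoke label ends in `_Present`, as in the table.

```python
from typing import Iterable

NEGATIVE_LABEL = "Normal_or_No_Label"


def reduce_labels(bespoke_labels: Iterable[str]) -> str:
    """Collapse a document's bespoke labels into the reduced scheme."""
    # Any label ending in "_Present" marks a positive pathology.
    if any(label.endswith("_Present") for label in bespoke_labels):
        return "Present"
    # No positive pathology found: fall back to the negative label.
    return NEGATIVE_LABEL


reduced = reduce_labels(["pe_Present", "wma_Present"])
```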
The annotation prefixes correspond to the following pathologies:
Annotation | Pathology |
---|---|
pe | Pericardial Effusion |
wma | Wall Motion Abnormality |
lv_dil | Left Ventricle Dilation |
rv_dil | Right Ventricle Dilation |
lv_syst_func | Left Ventricle Systolic Dysfunction |
rv_syst_func | Right Ventricle Systolic Dysfunction |
lv_dias_func | Diastolic Dysfunction |
aortic_valve_native_stenosis | Aortic Stenosis |
mitral_valve_native_regurgitation | Mitral Valve Regurgitation |
tricuspid_valve_native_regurgitation | Tricuspid Regurgitation |
aortic_valve_native_regurgitation | Aortic Regurgitation |
Note: lv_dias_func should have been dias_func.
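A bespoke label can be translated back to a readable pathology name with a simple lookup built from the table above. This is a sketch only: the helper `pathology_for` is hypothetical, and the alias entries assume that the `lv_sys_func` and `rv_sys_func` spellings in the label scheme refer to the same pathologies as `lv_syst_func` and `rv_syst_func` in the table.

```python
from typing import Optional

# Annotation prefix -> pathology, copied from the table above.
PATHOLOGIES = {
    "pe": "Pericardial Effusion",
    "wma": "Wall Motion Abnormality",
    "lv_dil": "Left Ventricle Dilation",
    "rv_dil": "Right Ventricle Dilation",
    "lv_syst_func": "Left Ventricle Systolic Dysfunction",
    "rv_syst_func": "Right Ventricle Systolic Dysfunction",
    "lv_dias_func": "Diastolic Dysfunction",
    "aortic_valve_native_stenosis": "Aortic Stenosis",
    "mitral_valve_native_regurgitation": "Mitral Valve Regurgitation",
    "tricuspid_valve_native_regurgitation": "Tricuspid Regurgitation",
    "aortic_valve_native_regurgitation": "Aortic Regurgitation",
}

# Assumption: the label scheme's spellings map onto the table's spellings.
ALIASES = {"lv_sys_func": "lv_syst_func", "rv_sys_func": "rv_syst_func"}

SUFFIX = "_Present"


def pathology_for(label: str) -> Optional[str]:
    """Return the pathology name for a bespoke label, or None if not positive."""
    if not label.endswith(SUFFIX):
        return None  # e.g. "Normal_or_No_Label"
    key = label[: -len(SUFFIX)]
    key = ALIASES.get(key, key)
    return PATHOLOGIES.get(key)


name = pathology_for("pe_Present")
```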
Intended use
The model was developed for document classification of Dutch clinical echocardiogram reports. Because it is a domain-specific model trained on medical data, it is intended only for medical NLP tasks on Dutch echocardiogram reports.
Data
The model was trained on approximately 4,000 manually annotated echocardiogram reports from the University Medical Centre Utrecht. The training data was anonymized before starting the training procedure.
Feature | Description |
---|---|
Name | Echocardiogram_Multimodel_reduced |
Version | 1.0.0 |
transformers | >=4.40.0 |
Default Pipeline | pipeline , text-classification |
Components | RobertaForSequenceClassification |
License | cc-by-sa-4.0 |
Author | Bram van Es |
Contact
If you run into problems with this model, please open an issue on our GitHub repository: https://github.com/umcu/echolabeler/issues
Usage
If you use the model in your work, please cite the following reference: https://doi.org/10.48550/arXiv.2408.06930
References
Paper: Bauke Arends, Melle Vessies, Dirk van Osch, Arco Teske, Pim van der Harst, René van Es, Bram van Es (2024). Diagnosis extraction from unstructured Dutch echocardiogram reports using span- and document-level characteristic classification. arXiv:2408.06930. https://arxiv.org/abs/2408.06930