metadata

license: cc-by-4.0
language:
  - es
tags:
  - biomedical
  - clinical
  - ner
metrics:
  - f1
widget:
  - text: >-
      Se realizó angiotomografía urgente de arterias pulmonares, que mostró
      tromboembolia pulmonar bilateral con dilatación ventricular derecha,
      además de opacidades periféricas parcheadas compatibles con neumonía por
      SARS-CoV-2, que se confirmó en la PCR.
    example_title: COVID-19
  - text: >-
      El paciente presenta HTA en tratamiento con IECA y alfa-bloqueante,
      artritis reumatoide en tratamiento con corticoesteroide oral.
    example_title: Oncology
  - text: >-
      Otros antecedentes de importancia son la captura de 30 insectos dentro de
      la vivienda, de los cuales tres fueron positivos a la infección por
      Trypanosoma cruzi y las características de la vivienda con materiales de
      construcción considerados de riesgo para la presencia del transmisor
    example_title: Tropical medicine
  - text: >-
      Tras la evaluación de la paciente por medio de exploración
      psicopatológica, la orientación diagnóstica es de trastorno adaptativo
      tipo mixto.
    example_title: Psychiatry
  - text: >-
      Los hallazgos descritos son compatibles con quiste braquial del segundo
      arco complicado con proceso inflamatorio - infeccioso, sin poder descartar
      proceso maligno subyacente.
    example_title: Otorhinolaryngology

Disease mention recognizer for Spanish clinical texts 🦠🔬

This model derives from the participation of the SINAI team in the DISease TExt Mining Shared Task (DISTEMIST). The DISTEMIST-entities subtrack required automatically finding disease mentions in clinical cases. Given the length of the clinical texts in the dataset, we opted for a sentence-level NER approach based on fine-tuning a RoBERTa model pre-trained on Spanish biomedical corpora.
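
The sentence-level setup described above implies that each clinical case is split into sentences, each sentence is tagged on its own, and the resulting mention offsets are mapped back to the full document. The following is a minimal sketch of that bookkeeping only, not the team's actual pipeline: the regex splitter is deliberately naive, and `toy_tagger` is a hypothetical dictionary-based stand-in for the fine-tuned model so that the example runs without downloading anything.

```python
import re
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def split_sentences(text: str) -> List[Tuple[int, str]]:
    """Naive splitter (assumption): returns (start_offset, sentence) pairs."""
    sentences = []
    # A sentence is a run of non-terminator characters plus trailing terminators.
    for match in re.finditer(r"[^.!?]+[.!?]*", text):
        raw = match.group()
        stripped = raw.strip()
        if stripped:
            # Skip leading whitespace so the offset points at the first character.
            start = match.start() + (len(raw) - len(raw.lstrip()))
            sentences.append((start, stripped))
    return sentences


def tag_document(text: str, tag_sentence: Callable[[str], List[Span]]) -> List[Span]:
    """Tag each sentence separately and shift spans to document-level offsets."""
    spans = []
    for start, sentence in split_sentences(text):
        for s, e in tag_sentence(sentence):
            spans.append((start + s, start + e))
    return spans


def toy_tagger(sentence: str) -> List[Span]:
    """Hypothetical stand-in for the NER model: plain lexicon lookup."""
    lexicon = ("HTA", "neumonía")
    found = []
    for term in lexicon:
        for match in re.finditer(re.escape(term), sentence):
            found.append((match.start(), match.end()))
    return sorted(found)


text = "El paciente presenta HTA. Se confirmó neumonía en la PCR."
for start, end in tag_document(text, toy_tagger):
    print(start, end, text[start:end])
```

In practice, `toy_tagger` would be replaced by a call to the fine-tuned token classifier that returns character spans per sentence; the offset shifting in `tag_document` stays the same.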

Evaluation and results

Applying a model pre-trained on biomedical text to electronic health records (EHRs) can be considered a cross-domain experiment. The encouraging results our biomedical system achieves on the NER task point to domain-transfer potential between the biomedical and clinical fields. The table below summarizes the official micro-averaged scores obtained by this model in the official evaluation. Team standings are available here.

Precision	Recall	F1-score
0.7520	0.7259	0.7387
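
For readers unfamiliar with micro-averaging, the sketch below shows how such scores are typically computed: true positives, false positives, and false negatives are pooled over all documents before precision, recall, and F1 are derived. It assumes exact-match comparison of `(start, end)` spans, which is an assumption of this example rather than a statement about the official scorer; the function name `micro_prf` and the toy data are illustrative only.

```python
from typing import Dict, Set, Tuple

Mention = Tuple[int, int]  # (start, end) character offsets of a disease mention


def micro_prf(
    gold: Dict[str, Set[Mention]], pred: Dict[str, Set[Mention]]
) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F1 with exact span matching."""
    tp = fp = fn = 0
    # Pool counts over every document before computing the ratios.
    for doc_id in gold.keys() | pred.keys():
        g = gold.get(doc_id, set())
        p = pred.get(doc_id, set())
        tp += len(g & p)
        fp += len(p - g)
        fn += len(g - p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy data (made up for illustration): two documents, three gold mentions.
gold = {"doc1": {(0, 3), (10, 18)}, "doc2": {(5, 9)}}
pred = {"doc1": {(0, 3)}, "doc2": {(5, 9), (20, 25)}}
print(micro_prf(gold, pred))

# Sanity check: the reported F1 is the harmonic mean of the reported P and R.
p, r = 0.7520, 0.7259
print(round(2 * p * r / (p + r), 4))
```

The last two lines confirm that the F1-score in the table is consistent with the reported precision and recall.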

System description paper and citation

The system description paper is published in the proceedings of the 10th BioASQ Workshop, which will be held as a Lab in CLEF 2022 on September 5-8, 2022:

@inproceedings{ChizhikovaEtAl:CLEF2022,
title = {SINAI at CLEF 2022: Leveraging biomedical transformers to detect and normalize disease mentions},
author = {Mariia Chizhikova and Jaime Collado-Montañéz and Pilar López-Úbeda and Manuel C. Díaz-Galiano and L. Alfonso Ureña-López and M. Teresa Martín-Valdivia},
pages = {265--273},
url = {http://ceur-ws.org/Vol-XXX/#paper-17},
crossref = {CLEF2022}}