---
language:
- en
pipeline_tag: token-classification
---
Named entity recognition (NER) model that recognizes disease entities.
It is PubMedBERT fine-tuned on the following datasets:
- NCBI Disease Corpus (train and dev sets)
- PHAEDRA (train, dev, test sets): entity type "Disorder"
- Corpus for Disease Names and Adverse Effects (train, dev, test sets): entity types "DISEASE", "ADVERSE"
- RareDis corpus (train, dev, test sets): entity types "DISEASE", "RAREDISEASE", "SYMPTOM"
- CoMAGC (train, dev, test sets): entity type "cancer_term"
- PGxCorpus (train, dev, test sets):
- miRNA-Test-Corpus (train, dev, test sets): entity type "Diseases"
- BC5CDR (train and dev sets): entity type "Disease"
- Mantra (train, dev, test sets): entity type "DISO"
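
The datasets above use different names for their disease-like entity types, so training a single model on them implies some form of label harmonization. The snippet below is a minimal sketch of one way this could be done, not the procedure actually used for this model. The unified label name `Disease`, the `DISEASE_TYPES` table, and the `to_bio` helper are all illustrative assumptions. Only corpora whose entity types are named in the list above appear in the table.

```python
from typing import Dict, FrozenSet, List, Tuple

# Entity types per corpus, copied from the dataset list above.
# Corpora without a named entity type in that list are omitted on purpose.
DISEASE_TYPES: Dict[str, FrozenSet[str]] = {
    "PHAEDRA": frozenset({"Disorder"}),
    "Corpus for Disease Names and Adverse Effects": frozenset({"DISEASE", "ADVERSE"}),
    "RareDis": frozenset({"DISEASE", "RAREDISEASE", "SYMPTOM"}),
    "CoMAGC": frozenset({"cancer_term"}),
    "miRNA-Test-Corpus": frozenset({"Diseases"}),
    "BC5CDR": frozenset({"Disease"}),
    "Mantra": frozenset({"DISO"}),
}

# Hypothetical unified label; the model's real label names may differ.
UNIFIED_LABEL = "Disease"

# A span is (start_token, end_token_exclusive, source_entity_type).
Span = Tuple[int, int, str]


def to_bio(num_tokens: int, spans: List[Span], corpus: str) -> List[str]:
    """Convert corpus-specific spans into BIO tags with one unified label.

    Spans whose type is not listed for the corpus are ignored (tagged "O").
    """
    keep = DISEASE_TYPES.get(corpus, frozenset())
    tags = ["O"] * num_tokens
    for start, end, entity_type in spans:
        if entity_type not in keep:
            continue
        if not (0 <= start < end <= num_tokens):
            raise ValueError(f"span out of range: {(start, end)}")
        tags[start] = f"B-{UNIFIED_LABEL}"
        for i in range(start + 1, end):
            tags[i] = f"I-{UNIFIED_LABEL}"
    return tags


if __name__ == "__main__":
    tokens = ["Patients", "with", "cystic", "fibrosis", "developed", "nausea"]
    spans = [(2, 4, "RAREDISEASE"), (5, 6, "SYMPTOM"), (0, 1, "OTHER")]
    for token, tag in zip(tokens, to_bio(len(tokens), spans, "RareDis")):
        print(f"{token}\t{tag}")
```

Collapsing every source type into one label keeps the tag set small, at the cost of losing distinctions such as rare disease versus symptom. Whether this model makes that trade-off is not stated above.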