---
language:
- en
pipeline_tag: token-classification
---
Named Entity Recognition (NER) model to recognize gene and protein entities.
PubMedBERT fine-tuned on the following datasets:
- miRNA-Test-Corpus: entity type "Genes/Proteins"
- CellFinder: entity type "GeneProtein"
- CoMAGC: entity type "Gene"
- CRAFT: entity type "PR"
- GREC Corpus: entity types "Gene", "Protein", "Protein_Complex", "Enzyme"
- JNLPBA: entity types "protein", "DNA", "RNA"
- PGxCorpus: entity type "Gene_or_protein"
- FSU_PRGE: entity types "protein", "protein_complex", "protein_familiy_or_group"
- BC2GM corpus: entity type
- CHEMPROT: entity types "GENE-Y", "GENE-N"
- mTOR pathway event corpus: entity type "Protein"
- DNA Methylation
- BioNLP11ID: entity type "Gene/protein"
- BioNLP09
- BioNLP11EPI
- BioNLP13CG: entity type "gene_or_gene_product"
- BioNLP13GE: entity type "Protein"
- BioNLP13PC: entity type "Gene_or_gene_product"
- MLEE: entity type "Gene_or_gene_product"
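
The corpora above use different names for gene and protein mentions, so training a single tagger on them requires some form of label harmonization. The sketch below shows one way this could be done. It is not the authors' preprocessing: the unified label name `GENETIC`, the helper `unify_tag`, and the assumption that every corpus is already in BIO format are hypothetical. Corpora listed above without an entity type are left out of the mapping rather than guessed.

```python
# Corpus-specific entity types, copied from the dataset list above.
# Corpora listed without an entity type are deliberately omitted.
SOURCE_ENTITY_TYPES = {
    "miRNA-Test-Corpus": {"Genes/Proteins"},
    "CellFinder": {"GeneProtein"},
    "CoMAGC": {"Gene"},
    "CRAFT": {"PR"},
    "GREC Corpus": {"Gene", "Protein", "Protein_Complex", "Enzyme"},
    "JNLPBA": {"protein", "DNA", "RNA"},
    "PGxCorpus": {"Gene_or_protein"},
    "FSU_PRGE": {"protein", "protein_complex", "protein_familiy_or_group"},
    "CHEMPROT": {"GENE-Y", "GENE-N"},
    "mTOR pathway event corpus": {"Protein"},
    "BioNLP11ID": {"Gene/protein"},
    "BioNLP13CG": {"gene_or_gene_product"},
    "BioNLP13GE": {"Protein"},
    "BioNLP13PC": {"Gene_or_gene_product"},
    "MLEE": {"Gene_or_gene_product"},
}

# Hypothetical name for the single target label; the real model's
# label set may differ.
UNIFIED_LABEL = "GENETIC"


def unify_tag(corpus: str, tag: str) -> str:
    """Map a corpus-specific BIO tag to the unified label, or to "O".

    Raises KeyError for a corpus that has no entry in the mapping,
    and ValueError for a tag that is not in BIO format.
    """
    if tag == "O":
        return "O"
    # Split only on the first hyphen so types such as "GENE-Y" survive.
    prefix, sep, entity_type = tag.partition("-")
    if not sep or prefix not in ("B", "I"):
        raise ValueError(f"not a BIO tag: {tag!r}")
    if entity_type in SOURCE_ENTITY_TYPES[corpus]:
        return f"{prefix}-{UNIFIED_LABEL}"
    # Any other entity type in the corpus is treated as outside.
    return "O"


if __name__ == "__main__":
    # A made-up tag sequence in CHEMPROT style, for illustration only.
    tags = ["O", "B-GENE-Y", "I-GENE-Y", "O", "B-CHEMICAL"]
    print([unify_tag("CHEMPROT", t) for t in tags])
```

Entity types outside the mapping collapse to `O`, so a corpus that also annotates chemicals or cell lines contributes only its gene and protein spans.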