# Hungarian Named Entity Recognition Model with huBERT
For further models, scripts and details, see our demo site.
- Pretrained model used: SZTAKI-HLT/hubert-base-cc
- Fine-tuned on: NYTK-NerKor
- NE categories: PER, LOC, MISC, ORG
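
The checkpoint also loads with the standard `transformers` Auto classes if you want the raw model instead of a pipeline; a minimal sketch (only the repository id below comes from this card, the rest is the generic Hub-loading pattern):

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Load the fine-tuned huBERT checkpoint and its tokenizer from the Hub
tokenizer = AutoTokenizer.from_pretrained("NYTK/named-entity-recognition-nerkor-hubert-hungarian")
model = AutoModelForTokenClassification.from_pretrained("NYTK/named-entity-recognition-nerkor-hubert-hungarian")

# The config's id2label mapping exposes the tag set built over PER, LOC, MISC and ORG
print(model.config.id2label)
```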
## Limitations
- max_seq_length = 128
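
Inputs longer than `max_seq_length` do not fit in a single forward pass. A minimal workaround sketch, assuming naive fixed-width character chunking (the `max_chars=400` heuristic and the `ner_long_text` helper are illustrative, not part of the model card; entities that straddle a chunk boundary may be split):

```python
from transformers import pipeline

ner = pipeline(task="ner", model="NYTK/named-entity-recognition-nerkor-hubert-hungarian")

def ner_long_text(text, max_chars=400):
    # Rough character window chosen so each chunk stays well under the
    # 128-token limit; a sentence splitter would give cleaner boundaries.
    results = []
    for offset in range(0, len(text), max_chars):
        chunk = text[offset:offset + max_chars]
        for ent in ner(chunk, aggregation_strategy="simple"):
            # Shift character offsets back into the full text's coordinates
            ent["start"] += offset
            ent["end"] += offset
            results.append(ent)
    return results
```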
## Results
F-score: 90.18%
## Usage with pipeline
```python
from transformers import pipeline

ner = pipeline(task="ner", model="NYTK/named-entity-recognition-nerkor-hubert-hungarian")

# Hungarian example sentence containing person, organization and location entities
input_text = "A Kovácsné Nagy Erzsébet nagyon jól érzi magát a Nokiánál, azonban a Németországból érkezett Kovács Péter nehezen boldogul a beilleszkedéssel."

print(ner(input_text, aggregation_strategy="simple"))
```
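
With `aggregation_strategy="simple"`, the pipeline merges word pieces into whole entities, so the call returns one dictionary per detected entity with `entity_group`, `score`, `word`, `start` and `end` keys, e.g. an entry of the form `{'entity_group': 'PER', 'word': 'Kovácsné Nagy Erzsébet', ...}` (exact scores will vary).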
## Citation
If you use this model, please cite the following paper:
```bibtex
@inproceedings{yang-language-models,
    title = {Training language models with low resources: RoBERTa, BART and ELECTRA experimental models for Hungarian},
    booktitle = {Proceedings of the 12th IEEE International Conference on Cognitive Infocommunications (CogInfoCom 2021)},
    year = {2021},
    publisher = {IEEE},
    address = {Online},
    author = {Yang, Zijian Győző and Váradi, Tamás},
    pages = {279--285}
}
```