# manchuBERT

manchuBERT is a BERT-base model trained from scratch on romanized Manchu data.
ManNER & ManPOS are manchuBERT models fine-tuned for named entity recognition and part-of-speech tagging, respectively.
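The checkpoint is published on the Hugging Face Hub at the repository URL given in the citation below. As a minimal usage sketch — assuming the repository ships a standard BERT tokenizer and masked-LM head loadable through 🤗 Transformers' `pipeline` API — masked-token prediction on romanized Manchu looks like:

```python
# Sketch, not an official example: assumes the seemdog/manchuBERT checkpoint
# exposes a standard BERT tokenizer and masked-LM head compatible with the
# Transformers fill-mask pipeline.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="seemdog/manchuBERT")

# Romanized Manchu input with one [MASK] token (phrase taken from the
# "Ilan gurun i bithe" title in the data table); the pipeline returns the
# top-scoring candidate fillers with their probabilities.
predictions = fill_mask("ilan [MASK] i bithe")
for p in predictions:
    print(p["token_str"], round(p["score"], 4))
```

Downstream fine-tuning (as done for ManNER and ManPOS) would instead load the weights with `AutoModelForTokenClassification` and train on labeled Manchu data.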
## Data

manchuBERT uses the data augmentation method from *Mergen: The First Manchu-Korean Machine Translation Model Trained on Augmented Data*.
| Data | Number of Sentences (before augmentation) |
|---|---|
| Manwén Lǎodàng – Taizong | 2,220 |
| Ilan gurun i bithe | 41,904 |
| Gin ping mei bithe | 21,376 |
| Yùzhì Qīngwénjiàn | 11,954 |
| Yùzhì Zēngdìng Qīngwénjiàn | 18,420 |
| Manwén Lǎodàng – Taizu | 22,578 |
| Manchu-Korean Dictionary | 40,583 |
## Citation

```bibtex
@misc{jean_seo_2024,
  author    = {Jean Seo},
  title     = {manchuBERT (Revision 64133be)},
  year      = 2024,
  url       = {https://huggingface.co/seemdog/manchuBERT},
  doi       = {10.57967/hf/1599},
  publisher = {Hugging Face}
}
```