---
language:
- zh
license: "apache-2.0"
---

## Chinese BERT with Whole Word Masking

To further accelerate Chinese natural language processing, we provide a **Chinese pre-trained BERT with Whole Word Masking**. We also compare state-of-the-art Chinese pre-trained models in depth, including [BERT](https://github.com/google-research/bert), [ERNIE](https://github.com/PaddlePaddle/LARK/tree/develop/ERNIE), and [BERT-wwm](https://github.com/ymcui/Chinese-BERT-wwm).
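
For quick use with the Transformers library, here is a minimal loading sketch. It assumes the model is hosted on the Hugging Face Hub under the ID `hfl/chinese-bert-wwm` and that `transformers` and `torch` are installed; the Hub ID is an assumption, not something this README states.

```
from transformers import BertTokenizer, BertModel

# Hypothetical Hub ID for this model; adjust to the actual repository name.
MODEL_ID = "hfl/chinese-bert-wwm"

tokenizer = BertTokenizer.from_pretrained(MODEL_ID)
model = BertModel.from_pretrained(MODEL_ID)

# Encode a Chinese sentence and run a forward pass to get contextual embeddings.
inputs = tokenizer("使用语言模型来预测下一个词的概率。", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, sequence_length, hidden_size)
```

Whole word masking changes only the pre-training objective, so the model is loaded and fine-tuned exactly like a standard BERT.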

**[Pre-Training with Whole Word Masking for Chinese BERT](https://arxiv.org/abs/1906.08101)**
Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu

This repository is developed based on: https://github.com/google-research/bert
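
To make the whole-word-masking idea concrete, here is an illustrative sketch, not the actual pre-training code (which follows the BERT repository above): when a segmented Chinese word spans several tokens, all of its tokens are masked together rather than independently. The segmentation and masking rate below are made up for the example.

```
import random

random.seed(0)

# A sentence pre-segmented into words (e.g., by a Chinese word segmenter);
# in Chinese BERT each character is its own WordPiece token.
words = [["使", "用"], ["语", "言"], ["模", "型"], ["来"],
         ["预", "测"], ["下", "一", "个"], ["词"], ["的"], ["概", "率"]]

def whole_word_mask(words, mask_prob=0.15):
    """Mask every token of a chosen word together (whole word masking)."""
    out = []
    for word in words:
        if random.random() < mask_prob:
            out.extend(["[MASK]"] * len(word))  # all pieces of the word
        else:
            out.extend(word)
    return out

print("".join(whole_word_mask(words)))
```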

You may also be interested in:
- Chinese BERT series: https://github.com/ymcui/Chinese-BERT-wwm
- Chinese MacBERT: https://github.com/ymcui/MacBERT
- Chinese ELECTRA: https://github.com/ymcui/Chinese-ELECTRA
- Chinese XLNet: https://github.com/ymcui/Chinese-XLNet
- Knowledge Distillation Toolkit - TextBrewer: https://github.com/airaria/TextBrewer

More resources by HFL: https://github.com/ymcui/HFL-Anthology

## Citation
If you find the technical reports or resources useful, please cite the following technical reports in your paper.
- Primary: https://arxiv.org/abs/2004.13922
```
@inproceedings{cui-etal-2020-revisiting,
    title = "Revisiting Pre-Trained Models for {C}hinese Natural Language Processing",
    author = "Cui, Yiming  and
      Che, Wanxiang  and
      Liu, Ting  and
      Qin, Bing  and
      Wang, Shijin  and
      Hu, Guoping",
    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.findings-emnlp.58",
    pages = "657--668",
}
```
- Secondary: https://arxiv.org/abs/1906.08101
```
@article{chinese-bert-wwm,
  title={Pre-Training with Whole Word Masking for Chinese BERT},
  author={Cui, Yiming and Che, Wanxiang and Liu, Ting and Qin, Bing and Yang, Ziqing and Wang, Shijin and Hu, Guoping},
  journal={arXiv preprint arXiv:1906.08101},
  year={2019}
}
```