suolyer committed
Commit d06c89e
Parent: 20900cc

Update README.md

Files changed (1): README.md (+15 -15)
README.md CHANGED
@@ -17,9 +17,9 @@ tags:
 
## 简介 Brief Introduction
 
- 将自然语言理解任务转化为multiple choice任务,并且使用 48 NLU 任务进行预训练
+ UniMC 核心思想是将自然语言理解任务转化为 multiple choice 任务,并且使用多个 NLU 任务来进行预训练。我们在英文数据集上的实验结果表明,仅含有 2.35 亿参数的 [ALBERT模型](https://huggingface.co/IDEA-CCNL/Erlangshen-UniMC-Albert-235M-English) 的 zero-shot 性能可以超越众多千亿参数的模型。在中文测评基准 FewCLUE 和 ZeroCLUE 两个榜单中,13亿参数的[二郎神](https://huggingface.co/IDEA-CCNL/Erlangshen-UniMC-MegatronBERT-1.3B-Chinese)模型获得了第一名的成绩。
 
- Convert natural language understanding tasks into multiple choice tasks, and use 48 NLU task for pre-training
+ The core idea of UniMC is to convert natural language understanding tasks into multiple-choice tasks and to pre-train on multiple NLU tasks. Our experiments on English datasets show that the zero-shot performance of an [ALBERT](https://huggingface.co/IDEA-CCNL/Erlangshen-UniMC-Albert-235M-English) model with only 235 million parameters can surpass that of many models with hundreds of billions of parameters. On the Chinese evaluation benchmarks FewCLUE and ZeroCLUE, the 1.3B [Erlangshen](https://huggingface.co/IDEA-CCNL/Erlangshen-UniMC-MegatronBERT-1.3B-Chinese) model took first place on both leaderboards.
 
## 模型分类 Model Taxonomy
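To make the conversion concrete, here is a minimal sketch of what a single multiple-choice sample looks like. It is not part of this commit; the field names (`texta`, `textb`, `question`, `choice`, `answer`, `label`) follow the UniMC examples in Fengshenbang-LM, but treat the exact schema as an assumption and verify it against the repository.

```python3
# Minimal sketch: a sentiment-classification example recast as multiple choice.
# The schema mirrors the Fengshenbang-LM UniMC examples; it is illustrative,
# not authoritative.
sample = {
    "texta": "这家餐厅的菜品非常好吃",  # sentence to classify
    "textb": "",                       # optional second sentence (e.g. for NLI pairs)
    "question": "这句话表达的情感是?",   # the task phrased as a question
    "choice": ["负面", "正面"],         # the label set rendered as answer options
    "answer": "正面",                  # gold option text (train/dev data only)
    "label": 1,                        # index of the gold option in `choice`
    "id": 0,
}
```

Under this framing, every classification or NLI task reduces to picking one option, which is what lets a single pre-trained model transfer across NLU tasks zero-shot.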
 
@@ -37,14 +37,15 @@ avoiding problems in commonly used large generative models such as FLAN. It not
 
### 下游效果 Performance
 
+
**Few-shot**
| Model | eprstmt | csldcp | tnews | iflytek | ocnli | bustm | chid | csl | wsc | Avg |
|------------|------------|----------|-----------|----------|-----------|-----------|-----------|----------|-----------|-----------|
- | Finetuning | 65.4 | 35.5 | 49 | 32.8 | 33 | 60.7 | 14.9 | 50 | 55.6 | 44.1 |
- | PET | 86.7 | 51.7 | 54.5 | 46 | 44 | 56 | 61.2 | 59.4 | 57.5 | 57.44 |
- | LM-BFF | 85.6 | 54.4 | 53 | 47.1 | 41.6 | 57.6 | 61.2 | 51.7 | 54.7 | 56.32 |
- | P-tuning | 88.3 | 56 | 54.2 | **57.6** | 41.9 | 60.9 | 59.3 | **62.9** | 58.1 | 59.91 |
- | EFL | 84.9 | 45 | 52.1 | 42.7 | 66.2 | 71.8 | 30.9 | 56.6 | 53 | 55.91 |
+ | [FineTuning](https://arxiv.org/pdf/2107.07498.pdf)-RoBERTa-110M | 65.4 | 35.5 | 49 | 32.8 | 33 | 60.7 | 14.9 | 50 | 55.6 | 44.1 |
+ | [FineTuning](https://arxiv.org/pdf/2107.07498.pdf)-ERNIE1.0-110M | 66.5 | 57 | 51.6 | 42.1 | 32 | 60.4 | 15 | 60.1 | 50.3 | 48.34 |
+ | [PET](https://arxiv.org/pdf/2107.07498.pdf)-ERNIE1.0-110M | 84 | 59.9 | 56.4 | 50.3 | 38.1 | 58.4 | 40.6 | 61.1 | 58.7 | 56.39 |
+ | [P-tuning](https://arxiv.org/pdf/2107.07498.pdf)-ERNIE1.0-110M | 80.6 | 56.6 | 55.9 | 52.6 | 35.7 | 60.8 | 39.61 | 51.8 | 55.7 | 54.37 |
+ | [EFL](https://arxiv.org/pdf/2107.07498.pdf)-ERNIE1.0-110M | 76.7 | 47.9 | 56.3 | 52.1 | 48.7 | 54.6 | 30.3 | 52.8 | 52.3 | 52.7 |
| [UniMC-RoBERTa-110M](https://huggingface.co/IDEA-CCNL/Erlangshen-UniMC-RoBERTa-110M-Chinese) | 88.64 | 54.08 | 54.32 | 48.6 | 66.55 | 73.76 | 67.71 | 52.54 | 59.92 | 62.86 |
| [UniMC-RoBERTa-330M](https://huggingface.co/IDEA-CCNL/Erlangshen-UniMC-RoBERTa-330M-Chinese) | 89.53 | 57.3 | 54.25 | 50 | 70.59 | 77.49 | 78.09 | 55.73 | 65.16 | 66.46 |
| [UniMC-MegatronBERT-1.3B](https://huggingface.co/IDEA-CCNL/Erlangshen-UniMC-MegatronBERT-1.3B-Chinese) | **89.278** | **60.9** | **57.46** | 52.89 | **76.33** | **80.37** | **90.33** | 61.73 | **79.15** | **72.05** |
@@ -53,10 +54,10 @@ avoiding problems in commonly used large generative models such as FLAN. It not
 
| Model | eprstmt | csldcp | tnews | iflytek | ocnli | bustm | chid | csl | wsc | Avg |
|---------------|-----------|-----------|-----------|-----------|-----------|----------|----------|----------|-----------|-----------|
- | GPT-zero | 57.5 | 26.2 | 37 | 19 | 34.4 | 50 | 65.6 | 50.1 | 50.3 | 43.4 |
- | PET-zero | 85.2 | 12.6 | 26.1 | 26.6 | 40.3 | 50.6 | 57.6 | 52.2 | 54.7 | 45.1 |
- | NSP-BERT | 86.9 | 47.6 | 51 | 41.6 | 37.4 | 63.4 | 52 | **64.4** | 59.4 | 55.96 |
- | ZeroPrompt | - | - | - | 16.14 | 46.16 | - | - | - | 47.98 | - |
+ | [GPT](https://arxiv.org/pdf/2107.07498.pdf)-110M | 57.5 | 26.2 | 37 | 19 | 34.4 | 50 | 65.6 | 50.1 | 50.3 | 43.4 |
+ | [PET](https://arxiv.org/pdf/2107.07498.pdf)-RoBERTa-110M | 85.2 | 12.6 | 26.1 | 26.6 | 40.3 | 50.6 | 57.6 | 52.2 | 54.7 | 45.1 |
+ | [NSP-BERT](https://arxiv.org/abs/2109.03564)-110M | 86.9 | 47.6 | 51 | 41.6 | 37.4 | 63.4 | 52 | **64.4** | 59.4 | 55.96 |
+ | [ZeroPrompt](https://arxiv.org/abs/2201.06910)-T5-1.5B | - | - | - | 16.14 | 46.16 | - | - | - | 47.98 | - |
| Yuan1.0-13B | 88.13 | 38.99 | 57.47 | 38.82 | 48.13 | 59.38 | 86.14 | 50 | 38.99 | 56.22 |
| ERNIE3.0-240B | 88.75 | **50.97** | **57.83** | **40.42** | 53.57 | 64.38 | 87.13 | 56.25 | 53.46 | 61.41 |
| [UniMC-RoBERTa-110M](https://huggingface.co/IDEA-CCNL/Erlangshen-UniMC-RoBERTa-110M-Chinese) | 86.16 | 31.26 | 46.61 | 26.54 | 66.91 | 73.34 | 66.68 | 50.09 | 53.66 | 55.7 |
@@ -64,7 +65,6 @@ avoiding problems in commonly used large generative models such as FLAN. It not
| [UniMC-MegatronBERT-1.3B](https://huggingface.co/IDEA-CCNL/Erlangshen-UniMC-MegatronBERT-1.3B-Chinese) | **88.79** | 42.06 | 55.21 | 33.93 | **75.57** | **79.5** | **89.4** | 50.25 | **66.67** | **64.53** |
 
 
-
## 使用 Usage
```shell
git clone https://github.com/IDEA-CCNL/Fengshenbang-LM.git
@@ -75,11 +75,11 @@ pip install --editable .
 
```python3
import argparse
- from fengshen.pipelines.multiplechoice import UniMCPiplines
+ from fengshen.pipelines.multiplechoice import UniMCPipelines
 
 
total_parser = argparse.ArgumentParser("TASK NAME")
- total_parser = UniMCPiplines.piplines_args(total_parser)
+ total_parser = UniMCPipelines.piplines_args(total_parser)
args = total_parser.parse_args()
args.pretrained_model_path = 'IDEA-CCNL/Erlangshen-UniMC-RoBERTa-110M-Chinese'
args.learning_rate=2e-5
@@ -87,7 +87,7 @@ args.max_length=512
args.max_epochs=3
args.batchsize=8
args.default_root_dir='./'
- model = UniMCPiplines(args)
+ model = UniMCPipelines(args)
 
train_data = []
dev_data = []
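The snippet in the diff stops after initializing the empty `train_data`/`dev_data` lists. Below is a hedged sketch of how it typically continues, assuming `UniMCPipelines` exposes `train()` and `predict()` methods, that `piplines_args` adds a `--train` flag, and that samples use the `texta`/`question`/`choice`/`label` format shown earlier; verify all of these against the Fengshenbang-LM repository.

```python3
# Continuation sketch (assumed API: UniMCPipelines.train / .predict).
test_data = [{
    "texta": "放弃了对这个世界的抱怨",
    "textb": "",
    "question": "下面的情感极性是什么?",
    "choice": ["很差", "差", "好", "很好"],
    "answer": "",   # unknown at inference time
    "label": 0,
    "id": 0,
}]

if args.train:                        # --train assumed to come from piplines_args
    model.train(train_data, dev_data)  # fine-tune when labeled data is available
result = model.predict(test_data)      # zero-shot / few-shot inference
print(result)
```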
 