IDEA-CCNL/Erlangshen-UniMC-RoBERTa-110M-Chinese

@@ -12,14 +12,14 @@ tags:
 # Erlangshen-RoBERTa-110M-UniMC-Chinese
 - Paper: [Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective](https://github.com/IDEA-CCNL/Fengshenbang-LM)
-- GitHub: [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM)
 - Docs: [Fengshenbang-Docs](https://fengshenbang-doc.readthedocs.io/)
 ## 简介 Brief Introduction
-将自然语言理解任务转化为multiple choice任务，并且使用40个NLU 任务进行预训练
-Convert natural language understanding (NLU) tasks into multiple-choice tasks, and pre-train on 40 NLU tasks.
 ## 模型分类 Model Taxonomy
@@ -37,16 +37,41 @@ avoiding problems in commonly used large generative models such as FLAN. It not
 ### 下游效果 Performance
-**Zero-Shot Classification**
-| Model   | T0 11B | GLaM 60B | FLAN 137B | PaLM 540B | UniMC 235M |
-|---------|--------|----------|-----------|-----------|------------|
-| ANLI R1 | 43.6   | 40.9     | 47.7      | 48.4      | 52.0       |
-| ANLI R2 | 38.7   | 38.2     | 43.9      | 44.2      | 44.4       |
-| ANLI R3 | 41.3   | 40.9     | 47.0      | 45.7      | 47.8       |
-| CB      | 70.1   | 33.9     | 64.1      | 51.8      | 75.7       |
 ## 使用 Usage
 ```python3
 import argparse
@@ -57,6 +82,12 @@ total_parser = argparse.ArgumentParser("TASK NAME")
 total_parser = UniMCPiplines.piplines_args(total_parser)
 args = total_parser.parse_args()
 args.pretrained_model_path = 'IDEA-CCNL/Erlangshen-RoBERTa-110M-UniMC-Chinese'
 train_data = []
 dev_data = []
@@ -75,9 +106,6 @@ test_data = [
          "id": 7759}
     ]
-model = UniMCPiplines(args)
 if args.train:
     model.fit(train_data, dev_data)
 result = model.predict(test_data)

 # Erlangshen-RoBERTa-110M-UniMC-Chinese
 - Paper: [Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective](https://github.com/IDEA-CCNL/Fengshenbang-LM)
+- GitHub: [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM/blob/main/fengshen/examples/unimc/)
 - Docs: [Fengshenbang-Docs](https://fengshenbang-doc.readthedocs.io/)
 ## 简介 Brief Introduction
+将自然语言理解任务转化为multiple choice任务，并且使用42个NLU 任务进行预训练
+Convert natural language understanding (NLU) tasks into multiple-choice tasks, and pre-train on 42 NLU tasks.
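
To make the conversion described above concrete, here is a minimal sketch that recasts one single-label classification example as a multiple-choice record. The helper `to_multiple_choice` and every field name except `id` are hypothetical and chosen for illustration only; this diff shows just the `id` field, so the schema that `UniMCPiplines` actually expects should be checked against the Fengshenbang-LM examples.

```python
from typing import Dict, List, Union

Record = Dict[str, Union[str, int, List[str]]]


def to_multiple_choice(text: str, question: str, choices: List[str],
                       answer: str, example_id: int) -> Record:
    """Recast one labeled classification example as a multiple-choice record.

    Field names other than "id" are assumptions, not the library's schema.
    """
    if answer not in choices:
        raise ValueError(f"answer {answer!r} is not among the choices")
    return {
        "text": text,                    # the input passage
        "question": question,            # task description phrased as a question
        "choice": choices,               # candidate labels offered as options
        "answer": answer,                # gold option, as text
        "label": choices.index(answer),  # gold option, as an index into choices
        "id": example_id,
    }


record = to_multiple_choice(
    text="The new phone's battery lasts two full days.",
    question="Which category does this news item belong to?",
    choices=["real estate", "cars", "education", "technology"],
    answer="technology",
    example_id=0,
)
print(record["label"])
```

Under this framing, a new task needs only a question and a list of options rather than a task-specific classification head, which is what lets one pre-trained model be applied zero-shot.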
 ## 模型分类 Model Taxonomy
 ### 下游效果 Performance
+**Few-shot**
+| Model      | eprstmt    | csldcp   | tnews     | iflytek  | ocnli     | bustm     | chid      | csl      | wsc       | Avg       |
+|------------|------------|----------|-----------|----------|-----------|-----------|-----------|----------|-----------|-----------|
+| Finetuning | 65.4       | 35.5     | 49        | 32.8     | 33        | 60.7      | 14.9      | 50       | 55.6      | 44.1      |
+| PET        | 86.7       | 51.7     | 54.5      | 46       | 44        | 56        | 61.2      | 59.4     | 57.5      | 57.44     |
+| LM-BFF     | 85.6       | 54.4     | 53        | 47.1     | 41.6      | 57.6      | 61.2      | 51.7     | 54.7      | 56.32     |
+| P-tuning   | 88.3       | 56       | 54.2      | **57.6** | 41.9      | 60.9      | 59.3      | **62.9** | 58.1      | 59.91     |
+| EFL        | 84.9       | 45       | 52.1      | 42.7     | 66.2      | 71.8      | 30.9      | 56.6     | 53        | 55.91     |
+| [UniMC-110M](https://huggingface.co/IDEA-CCNL/Erlangshen-RoBERTa-110M-UniMC-Chinese) | 88.64      | 54.08    | 54.32     | 48.6     | 66.55     | 73.76     | 67.71     | 52.54    | 59.92     | 62.86     |
+| [UniMC-330M](https://huggingface.co/IDEA-CCNL/Erlangshen-RoBERTa-330M-UniMC-Chinese) | 89.53      | 57.3     | 54.25     | 50       | 70.59     | 77.49     | 78.09     | 55.73    | 65.16     | 66.46     |
+| [UniMC-1.3B](https://huggingface.co/IDEA-CCNL/Erlangshen-MegatronBERT-1.3B-UniMC-Chinese) | **89.278** | **60.9** | **57.46** | 52.89    | **76.33** | **80.37** | **90.33** | 61.73    | **79.15** | **72.05** |
+**Zero-shot**
+| Model         | eprstmt   | csldcp    | tnews     | iflytek   | ocnli     | bustm    | chid     | csl      | wsc       | Avg       |
+|---------------|-----------|-----------|-----------|-----------|-----------|----------|----------|----------|-----------|-----------|
+| GPT-zero      | 57.5      | 26.2      | 37        | 19        | 34.4      | 50       | 65.6     | 50.1     | 50.3      | 43.4      |
+| PET-zero      | 85.2      | 12.6      | 26.1      | 26.6      | 40.3      | 50.6     | 57.6     | 52.2     | 54.7      | 45.1      |
+| NSP-BERT      | 86.9      | 47.6      | 51        | 41.6      | 37.4      | 63.4     | 52       | **64.4** | 59.4      | 55.96     |
+| ZeroPrompt    | -         | -         | -         | 16.14     | 46.16     | -        | -        | -        | 47.98     | -         |
+| Yuan1.0-13B   | 88.13     | 38.99     | 57.47     | 38.82     | 48.13     | 59.38    | 86.14    | 50       | 38.99     | 56.22     |
+| ERNIE3.0-240B | 88.75     | **50.97** | **57.83** | **40.42** | 53.57     | 64.38    | 87.13    | 56.25    | 53.46     | 61.41     |
+| [UniMC-110M](https://huggingface.co/IDEA-CCNL/Erlangshen-RoBERTa-110M-UniMC-Chinese)    | 86.16     | 31.26     | 46.61     | 26.54     | 66.91     | 73.34    | 66.68    | 50.09    | 53.66     | 55.7      |
+| [UniMC-330M](https://huggingface.co/IDEA-CCNL/Erlangshen-RoBERTa-330M-UniMC-Chinese)     | 87.5      | 30.4      | 47.6      | 31.5      | 69.9      | 75.9     | 78.17    | 49.5     | 60.55     | 59.01     |
+| [UniMC-1.3B](https://huggingface.co/IDEA-CCNL/Erlangshen-MegatronBERT-1.3B-UniMC-Chinese)     | **88.79** | 42.06     | 55.21     | 33.93     | **75.57** | **79.5** | **89.4** | 50.25    | **66.67** | **64.53** |
 ## 使用 Usage
+```shell
+git clone https://github.com/IDEA-CCNL/Fengshenbang-LM.git
+cd Fengshenbang-LM
+pip install --editable .
+```
 ```python3
 import argparse
 total_parser = UniMCPiplines.piplines_args(total_parser)
 args = total_parser.parse_args()
 args.pretrained_model_path = 'IDEA-CCNL/Erlangshen-RoBERTa-110M-UniMC-Chinese'
+args.learning_rate = 2e-5
+args.max_length = 512
+args.max_epochs = 3
+args.batchsize = 8
+args.default_root_dir = './'
+model = UniMCPiplines(args)
 train_data = []
 dev_data = []
          "id": 7759}
     ]
 if args.train:
     model.fit(train_data, dev_data)
 result = model.predict(test_data)