Update README.md

README.md CHANGED

@@ -25,13 +25,13 @@ Good at solving NLT tasks, Chinese T5-large.
## 模型信息 Model Information

-为了得到一个大规模的中文版的T5,我们使用了Megatron-LM的方法和悟道语料库(180G版本)用于预训练。具体地,我们在预训练阶段中使用了[
+为了得到一个大规模的中文版的T5,我们使用了Megatron-LM的方法和悟道语料库(180G版本)用于预训练。具体地,我们在预训练阶段中使用了[Megatron-LM](https://github.com/NVIDIA/Megatron-LM),大概花费了16张A100约14天。

-To get a large-scale Chinese T5, we use of Megatron-LM and WuDao Corpora (180 GB version) for pre-training. Specifically,
+To get a large-scale Chinese T5, we used [Megatron-LM](https://github.com/NVIDIA/Megatron-LM) and the WuDao Corpora (180 GB version) for pre-training. The pre-training phase took about 14 days on 16 A100 GPUs.

## 使用 Usage

-因为[transformers](https://github.com/huggingface/transformers)库中是没有
+因为[transformers](https://github.com/huggingface/transformers)库中是没有Randeng-MegatronT5-770M相关的模型结构的,所以你可以在我们的[Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM)中找到并且运行代码。

Since there is no structure of Randeng-MegatronT5-770M in the [transformers library](https://github.com/huggingface/transformers), you can find the structure of Randeng-MegatronT5-770M and run the code in [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM).
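As a pointer for the Usage section, here is a minimal loading sketch. It assumes the `fengshen` package from Fengshenbang-LM exposes `T5Config`, `T5Tokenizer`, and `T5EncoderModel` with a `transformers`-style `from_pretrained` interface, and that the checkpoint is published as `IDEA-CCNL/Randeng-MegatronT5-770M`; the Fengshenbang-LM repo remains the authoritative reference.

```python
# Minimal loading sketch (assumptions, not the official example):
# - assumes the Fengshenbang-LM repo (https://github.com/IDEA-CCNL/Fengshenbang-LM)
#   is installed and exposes a `fengshen` package with transformers-style classes;
# - the class names and checkpoint id below are assumptions -- check the repo
#   and the model card for the authoritative usage.
from fengshen import T5Config, T5Tokenizer, T5EncoderModel  # assumed exports

model_id = "IDEA-CCNL/Randeng-MegatronT5-770M"  # assumed Hugging Face hub id

# Load tokenizer, config, and weights via a from_pretrained-style interface.
tokenizer = T5Tokenizer.from_pretrained(model_id)
config = T5Config.from_pretrained(model_id)
model = T5EncoderModel.from_pretrained(model_id)

# If the classes mirror the transformers T5 interface, a forward pass looks like:
inputs = tokenizer("今天天气真好。", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, seq_len, hidden_size)
```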