Update README.md

README.md CHANGED

@@ -25,13 +25,13 @@ Good at solving NLT tasks, Chinese T5-large.
## 模型信息 Model Information

-为了得到一个大规模的中文版的T5,我们使用了Megatron-LM的方法和悟道语料库(180G版本)用于预训练。具体地,我们在预训练阶段中使用了[
+为了得到一个大规模的中文版的T5,我们使用了Megatron-LM的方法和悟道语料库(180G版本)用于预训练。具体地,我们在预训练阶段中使用了[Megatron-LM](https://github.com/NVIDIA/Megatron-LM),大概花费了16张A100约14天。

-To get a large-scale Chinese T5, we use of Megatron-LM and WuDao Corpora (180 GB version) for pre-training. Specifically,
+To get a large-scale Chinese T5, we used [Megatron-LM](https://github.com/NVIDIA/Megatron-LM) and the WuDao Corpora (180 GB version) for pre-training. The pre-training phase took about 14 days on 16 A100 GPUs.

## 使用 Usage

-因为[transformers](https://github.com/huggingface/transformers)库中是没有
+因为[transformers](https://github.com/huggingface/transformers)库中是没有Randeng-MegatronT5-770M相关的模型结构的,所以你可以在我们的[Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM)中找到并且运行代码。

Since there is no structure of Randeng-MegatronT5-770M in the [transformers library](https://github.com/huggingface/transformers), you can find the structure of Randeng-MegatronT5-770M and run the code in [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM).
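As a pointer for the Usage section, here is a minimal loading sketch. It assumes the `fengshen` package from Fengshenbang-LM exposes `T5Config`, `T5Tokenizer`, and `T5EncoderModel` with a `transformers`-style `from_pretrained` interface, and that the checkpoint is published as `IDEA-CCNL/Randeng-MegatronT5-770M`; the Fengshenbang-LM repo remains the authoritative reference.

```python
# Minimal loading sketch (assumptions, not the official example):
# - assumes the Fengshenbang-LM repo (https://github.com/IDEA-CCNL/Fengshenbang-LM)
#   is installed and exposes a `fengshen` package with transformers-style classes;
# - the class names and checkpoint id below are assumptions -- check the repo
#   and the model card for the authoritative usage.
from fengshen import T5Config, T5Tokenizer, T5EncoderModel  # assumed exports

model_id = "IDEA-CCNL/Randeng-MegatronT5-770M"  # assumed Hugging Face hub id

# Load tokenizer, config, and weights via a from_pretrained-style interface.
tokenizer = T5Tokenizer.from_pretrained(model_id)
config = T5Config.from_pretrained(model_id)
model = T5EncoderModel.from_pretrained(model_id)

# If the classes mirror the transformers T5 interface, a forward pass looks like:
inputs = tokenizer("今天天气真好。", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, seq_len, hidden_size)
```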