Update README.md
#3
by AdaptLLM - opened
README.md CHANGED
@@ -186,7 +186,7 @@ Except for the pre-training data, *Instruction Pre-Training* keeps all other set
 Therefore, you can easily use any training framework, such as [OLMo](https://github.com/allenai/OLMo) (for pre-training from scratch) and [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) (for continual pre-training), to train on the templified instruction-augmented corpora.
 
 1. For general pre-training from scratch, we recommend setting M = 2 and mixing the instruction-augmented corpora with unchanged raw corpora.
-2. For domain-adaptive continual pre-training, we recommend setting M = 3 and mixing the instruction-augmented corpora with general instructions from [OpenOrca](https://huggingface.co/datasets/Open-Orca/OpenOrca) at a 1:1 ratio (counted by tokens). Each example from OpenOrca is formulated as "{question} {response}", with a
+2. For domain-adaptive continual pre-training, we recommend setting M = 3 and mixing the instruction-augmented corpora with general instructions from [OpenOrca](https://huggingface.co/datasets/Open-Orca/OpenOrca) at a 1:1 ratio (counted by tokens). Each example from OpenOrca is formulated as "{question} {response}", with a white-space used to connect the question and response.
 
 Let's try our method in continual pre-training for a quick start---it works easily!
 
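To make the mixing recipe in the updated item 2 concrete, here is a minimal Python sketch of formatting OpenOrca examples as "{question} {response}" and balancing them roughly 1:1 by token count against the instruction-augmented domain corpus. The field names (`question`, `response`) and the whitespace-split token counter are illustrative assumptions, not the repository's actual preprocessing code; a real setup would count tokens with the model's tokenizer.

```python
# Hypothetical sketch of the 1:1 (by tokens) mixing recommended above.
# Field names and the token counter are assumptions for illustration only.

def format_openorca(example: dict) -> str:
    # "{question} {response}", joined by a single white-space.
    return f"{example['question']} {example['response']}"

def count_tokens(text: str) -> int:
    # Stand-in token counter; swap in the model's tokenizer in practice.
    return len(text.split())

def mix_one_to_one(domain_texts: list[str], openorca_examples: list[dict]) -> list[str]:
    """Append formatted OpenOrca texts until their token count roughly
    matches the instruction-augmented domain corpus (1:1 ratio by tokens)."""
    domain_tokens = sum(count_tokens(t) for t in domain_texts)
    mixed, general_tokens = list(domain_texts), 0
    for ex in openorca_examples:
        if general_tokens >= domain_tokens:
            break
        text = format_openorca(ex)
        mixed.append(text)
        general_tokens += count_tokens(text)
    return mixed

if __name__ == "__main__":
    domain = ["<instruction-augmented domain text with M = 3 instruction-response pairs>"]
    orca = [{"question": "What is 2 + 2?", "response": "The answer is 4."}]
    print(mix_one_to_one(domain, orca))
```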