AdaptLLM committed
Commit ce74dc9
1 Parent(s): f067870

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -186,7 +186,7 @@ Except for the pre-training data, *Instruction Pre-Training* keeps all other set
 Therefore, you can easily use any training framework, such as [OLMo](https://github.com/allenai/OLMo) (for pre-training from scratch) and [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) (for continual pre-training), to train on the templified instruction-augmented corpora.
 
 1. For general pre-training from scratch, we recommend setting M = 2 and mixing the instruction-augmented corpora with unchanged raw corpora.
-2. For domain-adaptive continual pre-training, we recommend setting M = 3 and mixing the instruction-augmented corpora with general instructions from [OpenOrca](https://huggingface.co/datasets/Open-Orca/OpenOrca) at a 1:1 ratio (counted by tokens). Each example from OpenOrca is formulated as "{question} {response}", with a blank space used to connect the question and response.
+2. For domain-adaptive continual pre-training, we recommend setting M = 3 and mixing the instruction-augmented corpora with general instructions from [OpenOrca](https://huggingface.co/datasets/Open-Orca/OpenOrca) at a 1:1 ratio (counted by tokens). Each example from OpenOrca is formulated as "{question} {response}", with a white-space used to connect the question and response.
 
 Let's try our method in continual pre-training for a quick start---it works easily!
 
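The changed README lines describe a data-preparation recipe: format each OpenOrca example as "{question} {response}" joined by a single space, then mix those general instructions with the instruction-augmented domain corpora at a 1:1 ratio counted in tokens. The sketch below illustrates that step under stated assumptions; the `augmented_texts` placeholder, the tokenizer name, and the OpenOrca field names "question"/"response" are illustrative and not part of this commit.

from datasets import load_dataset
from transformers import AutoTokenizer

# Placeholder for the instruction-augmented domain corpora (one string per document).
augmented_texts = ["<instruction-augmented document 1>", "<instruction-augmented document 2>"]

# Assumed tokenizer: swap in whichever base model you continually pre-train.
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

def count_tokens(texts):
    return sum(len(tokenizer(t)["input_ids"]) for t in texts)

# The 1:1 ratio is measured in tokens, so the OpenOrca budget equals the
# token count of the instruction-augmented corpora.
budget = count_tokens(augmented_texts)

# Each OpenOrca example becomes "{question} {response}", joined by a single space.
orca = load_dataset("Open-Orca/OpenOrca", split="train", streaming=True)
general_texts, used = [], 0
for ex in orca:
    text = f'{ex["question"]} {ex["response"]}'
    general_texts.append(text)
    used += len(tokenizer(text)["input_ids"])
    if used >= budget:
        break

# Combine and shuffle before packing into pre-training sequences.
mixed_corpus = augmented_texts + general_texts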