Commit bbcfb36 by instruction-pretrain (1 parent: db13bd2)

Update README.md

Files changed (1)
  1. README.md +6 -4
README.md CHANGED
@@ -17,15 +17,17 @@ We explore supervised multitask pre-training by proposing ***Instruction Pre-Tra
 </p>
 
 **************************** **Updates** ****************************
+* 2024/7/31: Updated pre-training suggestions in the `Advanced Usage` section of [instruction-synthesizer](https://huggingface.co/instruction-pretrain/instruction-synthesizer)
 * 2024/7/15: We scaled up the pre-trained tokens from 100B to 250B, with the number of synthesized instruction-response pairs reaching 500M! Below, we show the performance trend on downstream tasks throughout the pre-training process:
-<p align='center'>
-<img src="https://cdn-uploads.huggingface.co/production/uploads/66711d2ee12fa6cc5f5dfc89/0okCfRkC6uALTfuNxt0Fa.png" width="700">
+<p align='left'>
+<img src="https://cdn-uploads.huggingface.co/production/uploads/66711d2ee12fa6cc5f5dfc89/0okCfRkC6uALTfuNxt0Fa.png" width="500">
 </p>
 * 2024/6/21: Released the [paper](https://huggingface.co/papers/2406.14491), [code](https://github.com/microsoft/LMOps), and [resources](https://huggingface.co/instruction-pretrain)
 
 ## Resources
-**🤗 We share our data and models with example usages, feel free to open any issues or discussions! 🤗**
+**🤗 We share our data and models with example usages, feel free to open any discussions at [this page](https://huggingface.co/papers/2406.14491)! 🤗**
 
+- Thanks to the demo [davanstrien/instruction-synthesizer](https://huggingface.co/spaces/davanstrien/instruction-synthesizer) for implementing our approach
 - Context-Based Instruction Synthesizer: [instruction-synthesizer](https://huggingface.co/instruction-pretrain/instruction-synthesizer)
 - Fine-Tuning Data for the Synthesizer: [ft-instruction-synthesizer-collection](https://huggingface.co/datasets/instruction-pretrain/ft-instruction-synthesizer-collection)
 - General Models Pre-Trained from Scratch (on 100B tokes):
@@ -82,7 +84,7 @@ Instruction Pre-Training
 }
 ```
 
-[AdaptLLM](https://huggingface.co/papers/2309.09530)
+[Adapt LLM to Domains](https://huggingface.co/papers/2309.09530)
 ```bibtex
 @inproceedings{
 cheng2024adapting,