Commit 935d41e, committed by instruction-pretrain
1 Parent(s): cd3fa9c

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -26,6 +26,8 @@ We explore supervised multitask pre-training by proposing ***Instruction Pre-Tra
  - Domain-Specific Models Pre-Trained from Llama3-8B:
  - [Finance-Llama3-8B](https://huggingface.co/instruction-pretrain/finance-Llama3-8B)
  - [Biomedicine-Llama3-8B](https://huggingface.co/instruction-pretrain/medicine-Llama3-8B)
+ - General Instruction-Augmented Corpora: [general-instruction-augmented-corpora](https://huggingface.co/datasets/instruction-pretrain/general-instruction-augmented-corpora)
+ - Domain-Specific Instruction-Augmented Corpora (no finance data to avoid ethical issues): [medicine-instruction-augmented-corpora](https://huggingface.co/datasets/instruction-pretrain/medicine-instruction-augmented-corpora)
 
  ## General Pre-Training From Scratch
  We augment the [RefinedWeb corpora](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) with instruction-response pairs generated by our [context-based instruction synthesizer](https://huggingface.co/instruction-pretrain/instruction-synthesizer) to pre-train general language models from scratch.