Taishi-N324
committed on
Commit 518e312 (1 parent: 47ea8db)
Upload README.md
README.md
CHANGED
@@ -109,7 +109,7 @@ The following datasets were used for continual pre-training.
 - [Algebraic Stack](https://huggingface.co/datasets/EleutherAI/proof-pile-2)
 - [Japanese Wikipedia](https://dumps.wikimedia.org/other/cirrussearch)
 - [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb)
-- [Swallow Corpus](https://
+- [Swallow Corpus](https://arxiv.org/abs/2404.17733)
 - [The Pile](https://huggingface.co/datasets/EleutherAI/pile)
 - [The Vault](https://github.com/FSoft-AI4Code/TheVault)
 