Update for paper release
README.md
CHANGED
@@ -11,7 +11,7 @@ license: apache-2.0
 
 **Falcon-RW-7B is a 7B parameters causal decoder-only model built by [TII](https://www.tii.ae) and trained on 350B tokens of [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb). It is made available under the Apache 2.0 license.**
 
-
+See the [paper on arXiv](https://arxiv.org/abs/2306.01116) for more details.
 
 RefinedWeb is a high-quality web dataset built by leveraging stringent filtering and large-scale deduplication. Falcon-RW-7B, trained on RefinedWeb only, matches or outperforms comparable models trained on curated data.
 
@@ -62,7 +62,7 @@ for seq in sequences:
 
 ### Model Source
 
-- **Paper:**
+- **Paper:** [https://arxiv.org/abs/2306.01116](https://arxiv.org/abs/2306.01116).
 
 ## Uses
 
@@ -146,7 +146,7 @@ Training happened in early January 2023 and took about five days.
 
 ## Evaluation
 
-
+See the [paper on arXiv](https://arxiv.org/abs/2306.01116) for in-depth evaluation results.
 
 
 ## Technical Specifications
@@ -178,7 +178,17 @@ Falcon-RW-7B was trained a custom distributed training codebase, Gigatron. It us
 
 ## Citation
 
-
+```
+@article{refinedweb,
+title={The {R}efined{W}eb dataset for {F}alcon {LLM}: outperforming curated corpora with web data, and web data only},
+author={Guilherme Penedo and Quentin Malartic and Daniel Hesslow and Ruxandra Cojocaru and Alessandro Cappelli and Hamza Alobeidli and Baptiste Pannier and Ebtesam Almazrouei and Julien Launay},
+journal={arXiv preprint arXiv:2306.01116},
+eprint={2306.01116},
+eprinttype = {arXiv},
+url={https://arxiv.org/abs/2306.01116},
+year={2023}
+}
+```
 
 
 ## Contact