Update README.md
README.md
CHANGED
@@ -15,7 +15,7 @@ pipeline_tag: text-generation
   <img src="prox-teaser.png">
 </p>

-[ArXiv](
+[ArXiv](https://arxiv.org/abs/2409.17115) | [Models](https://huggingface.co/gair-prox/FW-ProX-1.7B) | [Data](https://huggingface.co/datasets/gair-prox/FineWeb-pro) | [Code](https://github.com/GAIR-NLP/program-every-example)

 **FW-ProX-1.7B** is a small language model. It was trained on the [FineWeb-pro](https://huggingface.co/datasets/gair-prox/FineWeb-pro) dataset for 50B tokens.
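A minimal usage sketch follows; it assumes the checkpoint at `gair-prox/FW-ProX-1.7B` loads through the standard `transformers` AutoClasses, which this card does not itself confirm:

```python
# Illustrative sketch (assumption: the FW-ProX-1.7B checkpoint is compatible
# with the standard transformers AutoClasses; verify against the repo files).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gair-prox/FW-ProX-1.7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "The key idea of refining pre-training data is"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```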
@@ -30,6 +30,10 @@ ProX models are evaluated over 10 language model benchmarks in zero-shot setting

 ### Citation
 ```
-@
+@article{zhou2024programming,
+  title={Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale},
+  author={Zhou, Fan and Wang, Zengzhi and Liu, Qian and Li, Junlong and Liu, Pengfei},
+  journal={arXiv preprint arXiv:2409.17115},
+  year={2024}
 }
 ```