koalazf99 committed
Commit 8155b23
1 Parent(s): 931fef3

Update README.md

Files changed (1): README.md +6 -2
README.md CHANGED
@@ -15,7 +15,7 @@ pipeline_tag: text-generation
 <img src="prox-teaser.png">
 </p>
 
-[ArXiv](http://arxiv.org/abs/xxxx) | [Models](https://huggingface.co/gair-prox/FW-ProX-1.7B) | [Data](https://huggingface.co/datasets/gair-prox/FineWeb-pro) | [Code](https://github.com/GAIR-NLP/program-every-example)
+[ArXiv](https://arxiv.org/abs/2409.17115) | [Models](https://huggingface.co/gair-prox/FW-ProX-1.7B) | [Data](https://huggingface.co/datasets/gair-prox/FineWeb-pro) | [Code](https://github.com/GAIR-NLP/program-every-example)
 
 **FW-ProX-1.7B** is a small language model. It was trained on [FineWeb-pro](https://huggingface.co/datasets/gair-prox/FineWeb-pro) for 50B tokens.
 
@@ -30,6 +30,10 @@ ProX models are evaluated over 10 language model benchmarks in zero-shot settings
 
 ### Citation
 ```
-@misc{TBD
+@article{zhou2024programming,
+  title={Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale},
+  author={Zhou, Fan and Wang, Zengzhi and Liu, Qian and Li, Junlong and Liu, Pengfei},
+  journal={arXiv preprint arXiv:2409.17115},
+  year={2024}
 }
 ```
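
For reference, a minimal usage sketch for the model this README describes, assuming FW-ProX-1.7B ships standard causal-LM weights loadable through the transformers Auto classes (an illustrative snippet, not part of this commit):

```python
# Minimal sketch: load FW-ProX-1.7B with Hugging Face transformers.
# Assumes the repo exposes standard AutoTokenizer/AutoModelForCausalLM files.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gair-prox/FW-ProX-1.7B"  # repo id taken from the links above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Pre-training data quality matters because"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```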