KempnerInstitute committed on
Commit baac830 (1 parent: 1fa49a6)

Update README with paper info

Files changed (1)
  1. README.md +7 -2
README.md CHANGED
@@ -1,6 +1,6 @@
  # Model description
 
- This repository contains over 500 model checkpoints ranging in size from 20M parameters up to 3.3B parameters and FLOP budgets from 2e17 to 1e21 FLOPs across 6 different pretraining datasets.
+ This repository contains over 500 model checkpoints for the paper [Loss-to-Loss Prediction: Scaling Laws for All Datasets](https://arxiv.org/abs/2411.12925), with models ranging in size from 20M parameters up to 3.3B parameters and FLOP budgets from 2e17 to 1e21 FLOPs across 6 different pretraining datasets.
 
  Each subdirectory name contains four different parameters to identify the model in that subdirectory:
 
@@ -40,7 +40,12 @@ model = HFMixinOLMo.from_pretrained(f"{tmp_dir}/{model_name}")
  If you use these models in your research, please cite this paper:
 
  ```bibtex
- TODO
+ @article{brandfonbrener2024loss,
+ title={Loss-to-Loss Prediction: Scaling Laws for All Datasets},
+ author={Brandfonbrener, David and Anand, Nikhil and Vyas, Nikhil and Malach, Eran and Kakade, Sham},
+ journal={arXiv preprint arXiv:2411.12925},
+ year={2024}
+ }
  ```
 
  # License