KempnerInstituteAI/loss-to-loss

KempnerInstitute committed 1 day ago

Commit baac830 • 1 Parent(s): 1fa49a6

Update README with paper info

Files changed (1)

README.md CHANGED

@@ -1,6 +1,6 @@
 # Model description
-This repository contains over 500 model checkpoints ranging in size from 20M parameters up to 3.3B parameters and FLOP budgets from 2e17 to 1e21 FLOPs across 6 different pretraining datasets.
 Each subdirectory name contains four different parameters to identify the model in that subdirectory:
@@ -40,7 +40,12 @@ model = HFMixinOLMo.from_pretrained(f"{tmp_dir}/{model_name}")
 If you use these models in your research, please cite this paper:
 ```bibtex
-TODO
 ```
 # License

 # Model description
+This repository contains over 500 model checkpoints for the paper [Loss-to-Loss Prediction: Scaling Laws for All Datasets](https://arxiv.org/abs/2411.12925), with models ranging in size from 20M parameters up to 3.3B parameters and FLOP budgets from 2e17 to 1e21 FLOPs across 6 different pretraining datasets.
 Each subdirectory name contains four different parameters to identify the model in that subdirectory:
 If you use these models in your research, please cite this paper:
 ```bibtex
+@article{brandfonbrener2024loss,
+      title={Loss-to-Loss Prediction: Scaling Laws for All Datasets},
+      author={Brandfonbrener, David and Anand, Nikhil and Vyas, Nikhil and Malach, Eran and Kakade, Sham},
+      journal={arXiv preprint arXiv:2411.12925},
+      year={2024}
+}
 ```
 # License