Just Read Twice!
Collection
This collection provides the models and benchmarks for the Just Read Twice work: https://arxiv.org/abs/2407.05483
This model is a standard attention model (Llama architecture) pretrained on 30B tokens of the Pile corpus.
The model implementation and the training code used to produce it are available here: https://github.com/HazyResearch/based
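As a quick way to try a checkpoint from this collection, here is a minimal loading sketch using Hugging Face `transformers`. The repo id below is hypothetical; substitute the actual model id from this collection. `trust_remote_code=True` is assumed, since the architecture implementations live alongside the checkpoints rather than in the core `transformers` library.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id -- replace with a model id from this collection.
model_id = "hazyresearch/attn-360m-30b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    trust_remote_code=True,  # assumed: custom architecture code ships with the checkpoint
)

# Generate a short continuation to sanity-check the loaded model.
prompt = "The Pile is a large, diverse"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For training or reproducing the models, use the HazyResearch/based repository linked above rather than this loading sketch.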