Sharing training data & reproducing training
#4, opened by xhluca
Congratulations on the paper and the score! Since the model was trained on public data, would it be possible to release the training dataset on Hugging Face? It would also be great to have a training script for reproducing the training run, similar to this training script recently released by LLM2Vec:
xhluca changed discussion title from "Training data & running the training" to "Sharing training data & reproducing training"
It would also be great to have access to the unidirectional models listed in the paper for research purposes. The unidirectional models are not far behind the bidirectional ones, so it would be valuable to explore them side by side.