Sharing training data & reproducing training

by xhluca - opened May 28

May 28

Congratulations on the paper and score! Since the model was trained on public data, would it be possible to release the training dataset on Hugging Face? It would also be great to have a training script to reproduce the training, similar to this training script recently released by LLM2Vec:

xhluca changed discussion title from Training data & running the training to Sharing training data & reproducing training May 28

ajinkya-tejankar

Jun 12

It would also be great to have access to the unidirectional models listed in the paper for research purposes. They are not far behind the bidirectional ones, so exploring the two side by side would be valuable.
