Dataset sources

#1
by MatVet - opened

Thanks for the interesting work! Could you elaborate on the datasets and techniques used to train this model? Also, are you planning on releasing a smaller model (Llama 3 8B, for example)?

Writer org

Thanks for your interest in our models!

We developed a custom, high-quality dataset for training, focusing on real-world use cases. This approach helps ensure our model is well-suited for practical applications.
For the training process, we employed continued pre-training techniques, including RoPE scaling to enhance long-context support.
We don't have immediate plans for an 8B version.
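
For readers unfamiliar with RoPE scaling: the reply does not say which variant was used, so the following is only a rough sketch of one common form, linear position interpolation, and not the actual training code. The function names, the `base` of 10000, and the `scale` parameter are assumptions chosen for illustration. The idea is that token positions are divided by a scale factor before the rotary angles are computed, so a longer sequence maps back into the position range the model originally saw.

```python
import math

def rope_angles(position, head_dim, base=10000.0, scale=1.0):
    """Rotary angles for one token position (hypothetical helper).

    scale > 1 applies linear position interpolation: the position is
    divided by `scale` before the per-pair frequencies are applied.
    """
    scaled_pos = position / scale
    return [scaled_pos * base ** (-2 * i / head_dim) for i in range(head_dim // 2)]

def apply_rope(vec, position, base=10000.0, scale=1.0):
    """Rotate consecutive pairs (x0, x1), (x2, x3), ... of `vec` by the RoPE angles."""
    angles = rope_angles(position, len(vec), base, scale)
    out = []
    for i, theta in enumerate(angles):
        x, y = vec[2 * i], vec[2 * i + 1]
        out.append(x * math.cos(theta) - y * math.sin(theta))
        out.append(x * math.sin(theta) + y * math.cos(theta))
    return out
```

With `scale=4.0`, position 8192 gets the same angles as position 2048 does without scaling, which is how a model trained on shorter contexts can be adapted to longer ones during continued pre-training. Other variants (NTK-aware or YaRN-style scaling, for instance) adjust the frequencies instead of the positions.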

Do you mind sharing the dataset or how you created it?
