Training dataset + Hyperparameters

#1
by Viewegger - opened

Hello,

Thank you for making this public. It looks like there has been a recent rise in non-English models.

  1. Do you plan to make the dataset public? If not, would it be possible to publish at least a small portion of it, so others can see how a similar dataset could be built for other languages?

  2. Could you provide some details on the training procedure: the hyperparameters, your hardware setup, and the total time it took to finish training?

Thanks!

I would also love to see the data made public, as I would like to reproduce the training with different models. Thanks

VAGO solutions org

Hey @ all.
We are already planning to publish part of the dataset that we used for training. This data was augmented entirely from an existing top English dataset.
I think the dataset should make our approach clearer to the open source community.

Best Regards,
David

DaryoushV changed discussion status to closed
