This is a pretrained-from-scratch BART base model (140M parameters).
Training was performed on a clean 50GB Romanian text corpus for 3M steps with these scripts. The model was trained with a maximum sequence length of 1024.
!! IMPORTANT !! This model was pretrained only on the text corruption (denoising) task, so it is not usable for any downstream task without fine-tuning first!
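To illustrate what a text corruption objective looks like, here is a minimal sketch of BART-style text infilling, where random spans of the input are each replaced by a single mask token and the model learns to reconstruct the original. This is a toy illustration, not the actual pretraining code: the function name `corrupt`, the `<mask>` string, the whitespace tokenization, and the default ratios are all assumptions made for the example.

```python
import random

MASK = "<mask>"  # placeholder; the real tokenizer's mask token may differ


def corrupt(tokens, mask_ratio=0.3, max_span=3, seed=0):
    """Replace random spans of tokens with one mask token each (toy text infilling)."""
    rng = random.Random(seed)
    budget = int(len(tokens) * mask_ratio)  # maximum number of tokens to hide
    out = []
    i = 0
    while i < len(tokens):
        if budget > 0 and rng.random() < mask_ratio:
            # Hide a span of 1..max_span tokens, limited by the remaining budget.
            span = min(rng.randint(1, max_span), budget, len(tokens) - i)
            out.append(MASK)
            i += span
            budget -= span
        else:
            out.append(tokens[i])
            i += 1
    return out


# Hypothetical example sentence, split on whitespace for simplicity.
source = "Ana are mere și pere în coș".split()
corrupted = corrupt(source)
print("input :", " ".join(corrupted))  # what the encoder would see
print("target:", " ".join(source))     # what the decoder is trained to produce
```

Because pretraining only teaches the model to map corrupted text back to the original, a task such as summarization or translation needs its own fine-tuning data with task-specific input and target pairs.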