This is a finetuned version of RuRoBERTa-large for the task of linguistic acceptability classification on the RuCoLA benchmark.
The hyperparameters used for finetuning are as follows:
- Up to 5 training epochs, with early stopping based on the validation Matthews correlation coefficient (MCC)
- Peak learning rate: 1e-5, linear warmup for 10% of total training time
- Weight decay: 1e-4
- Batch size: 32
- Random seed: 5
- Optimizer: torch.optim.AdamW
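
The warmup setting above can be illustrated with a small sketch of the learning rate as a function of the training step. This is not the original training code. The function name `learning_rate` and its parameters are hypothetical, and the linear decay to zero after warmup is an assumption: the list above only specifies the peak value and the 10% linear warmup.

```python
def learning_rate(step: int, total_steps: int,
                  peak_lr: float = 1e-5,
                  warmup_fraction: float = 0.1) -> float:
    """Return the learning rate at a given optimizer step.

    Linear warmup from 0 to peak_lr over the first warmup_fraction of
    training (as listed above), followed by a linear decay to 0
    (ASSUMPTION: the decay shape is not stated in the model card).
    """
    warmup_steps = max(1, int(total_steps * warmup_fraction))
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to the peak.
        return peak_lr * step / warmup_steps
    # Decay phase (assumed): scale linearly from the peak down to 0.
    decay_steps = max(1, total_steps - warmup_steps)
    return peak_lr * max(0.0, (total_steps - step) / decay_steps)


if __name__ == "__main__":
    total = 1000  # hypothetical number of optimizer steps
    for s in (0, 50, 100, 550, 1000):
        print(s, learning_rate(s, total))
```

With `total_steps=1000`, the rate reaches the peak of 1e-5 at step 100 and, under the assumed decay, returns to 0 at step 1000.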