bert-tiny-ita is an Italian foundational model (based on bert-tiny) pretrained from scratch on 20k Italian Wikipedia articles and on a wide collection of Italian words and dictionary definitions. It uses a 512-token context window.
The project is still a work in progress; new versions will come with time.
Use it as a foundational model to be fine-tuned for specific Italian tasks, as in the sketch below.
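A minimal usage sketch, assuming the model is published on the Hugging Face Hub under the id `bert-tiny-ita` (substitute the actual repo path if it differs):

```python
from transformers import pipeline

# Fill-mask inference with the pretrained checkpoint; the repo id below is assumed.
fill_mask = pipeline("fill-mask", model="bert-tiny-ita")

# [MASK] is BERT's standard mask token.
for pred in fill_mask("La capitale d'Italia è [MASK]."):
    print(f"{pred['token_str']:>12}  {pred['score']:.3f}")
```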
Training
- epochs: 250
- lr: 1e-5
- optim: AdamW
- weight_decay: 1e-4
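For reference, here are the settings above expressed via the `transformers` `TrainingArguments` API; this is a sketch, not the project's actual training script, and the output directory is a placeholder:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="bert-tiny-ita-ckpt",  # hypothetical checkpoint directory
    num_train_epochs=250,             # epochs: 250
    learning_rate=1e-5,               # lr: 1e-5
    optim="adamw_torch",              # AdamW (PyTorch implementation)
    weight_decay=1e-4,                # weight_decay: 1e-4
)
```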
Eval
- perplexity: 45 (it's a 12 MB model!)
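The card does not state the evaluation protocol; one common way to estimate masked-LM perplexity is to take exp of the mean cross-entropy over randomly masked tokens. A minimal sketch along those lines, with placeholder evaluation sentences and an assumed repo id:

```python
import math
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("bert-tiny-ita")  # assumed repo id
model = AutoModelForMaskedLM.from_pretrained("bert-tiny-ita").eval()

# Randomly mask 15% of tokens, the standard BERT pretraining setting.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
texts = ["Roma è la capitale d'Italia.", "Il gatto dorme sul divano."]  # placeholder eval set

batch = collator([tokenizer(t, truncation=True, max_length=512) for t in texts])
with torch.no_grad():
    loss = model(**batch).loss  # mean cross-entropy over masked positions

print(f"perplexity: {math.exp(loss.item()):.1f}")
```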