Edit model card

roberta-urdu-small

License: MIT

Overview

Language model: roberta-urdu-small Model size: 125M Language: Urdu Training data: News data from urdu news resources in Pakistan

About roberta-urdu-small

roberta-urdu-small is a language model for urdu language.

from transformers import pipeline
fill_mask = pipeline("fill-mask", model="urduhack/roberta-urdu-small", tokenizer="urduhack/roberta-urdu-small")

Training procedure

roberta-urdu-small was trained on urdu news corpus. Training data was normalized using normalization module from urduhack to eliminate characters from other languages like arabic.

About Urduhack

Urduhack is a Natural Language Processing (NLP) library for urdu language. Github: https://github.com/urduhack/urduhack

Downloads last month
1,890
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Model tree for urduhack/roberta-urdu-small

Finetunes
6 models

Space using urduhack/roberta-urdu-small 1