# T5 for Belarusian language
This model is based on T5-small with a sequence length of 128 tokens. It was trained from scratch on a single RTX 3090 (24 GB).
Supported tasks:
- translation BE to RU: `<extra_id_1>`
- translation BE to EN: `<extra_id_2>`
- translation RU to BE: `<extra_id_3>`
- translation RU to EN: `<extra_id_5>`
- translation EN to BE: `<extra_id_6>`
- translation EN to RU: `<extra_id_7>`
## How to Get Started with the Model
```python
from transformers import T5TokenizerFast, T5ForConditionalGeneration

tokenizer = T5TokenizerFast.from_pretrained("WelfCrozzo/T5-L128-belarusian")
model = T5ForConditionalGeneration.from_pretrained("WelfCrozzo/T5-L128-belarusian")

# Prefix the input with the task token (<extra_id_1> = BE to RU)
x = tokenizer.encode('<extra_id_1>да зорак праз цяжкасці', return_tensors='pt')
result = model.generate(x, return_dict_in_generate=True, output_scores=True, max_length=128)
print(tokenizer.decode(result["sequences"][0]))
```