romaneng2nep_v2
This model is a fine-tuned version of google/mt5-small on the syubraj/roman2nepali-transliteration dataset. It achieves the following results on the evaluation set:
- Loss: 2.9652
- Gen Len: 5.1538
Model Usage
!pip install transformers
from transformers import AutoTokenizer, MT5ForConditionalGeneration
checkpoint = "syubraj/romaneng2nep_v3"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = MT5ForConditionalGeneration.from_pretrained(checkpoint)
# Set max sequence length
max_seq_len = 20
def translate(text):
    # Tokenize the input text, truncating to max_seq_len tokens
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=max_seq_len)
    # Generate the transliteration
    translated = model.generate(**inputs)
    # Decode the generated tokens back to text
    translated_text = tokenizer.decode(translated[0], skip_special_tokens=True)
    return translated_text
# Example usage
source_text = "muskuraudai" # Example Romanized Nepali text
translated_text = translate(source_text)
print(f"Translated Text: {translated_text}")
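The snippet above handles one word per call. When many words need converting, batching them through the tokenizer is usually faster. The following is a minimal sketch, not part of the original model card: the helper names `batched` and `transliterate_words`, the default `batch_size`, and the use of `max_new_tokens` are all assumptions made for illustration. It expects the `tokenizer` and `model` objects loaded in the usage snippet above.

```python
from typing import Iterable, List


def batched(items: List[str], batch_size: int) -> Iterable[List[str]]:
    """Yield consecutive slices of `items`, each with at most `batch_size` entries."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def transliterate_words(words, tokenizer, model, batch_size=8, max_seq_len=20):
    """Transliterate a list of Romanized words in batches (hypothetical helper)."""
    outputs = []
    for batch in batched(words, batch_size):
        # padding=True pads every word in the batch to the same token length
        inputs = tokenizer(
            batch,
            return_tensors="pt",
            padding=True,
            truncation=True,
            max_length=max_seq_len,
        )
        # Cap the generated length; reusing max_seq_len here is an assumption
        generated = model.generate(**inputs, max_new_tokens=max_seq_len)
        outputs.extend(tokenizer.batch_decode(generated, skip_special_tokens=True))
    return outputs


# Example call, using the tokenizer and model loaded earlier:
# print(transliterate_words(["muskuraudai"], tokenizer, model))
```

Passing the tokenizer and model in as arguments keeps the helper independent of any particular checkpoint.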
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 24
- eval_batch_size: 24
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 4
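To illustrate what the `linear` scheduler type implies for the learning rate, here is a small pure-Python sketch of a linear decay from the base rate of 2e-05 down to zero. The function name `linear_lr`, the `warmup_steps` parameter (no warmup is listed above, so it defaults to 0), and the `total_steps` value in the example are assumptions; the model card does not state the total number of optimizer steps.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 2e-05,
              warmup_steps: int = 0) -> float:
    """Learning rate at `step` under a linear warmup/decay schedule."""
    if warmup_steps > 0 and step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup
        return base_lr * step / warmup_steps
    # Linear decay from base_lr to 0 over the remaining steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical total step count, chosen only for illustration
total_steps = 32000
for step in (0, 16000, 32000):
    print(step, linear_lr(step, total_steps))
```

Under these assumptions the rate is halved at the midpoint of training and reaches zero at the final step.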
Training results
Step | Training Loss | Validation Loss | Gen Len |
---|---|---|---|
1000 | 15.0703 | 5.6154 | 2.3840 |
2000 | 6.0460 | 4.4449 | 4.6281 |
3000 | 5.2580 | 3.9632 | 4.7790 |
4000 | 4.8563 | 3.6188 | 5.0053 |
5000 | 4.5602 | 3.3491 | 5.3085 |
6000 | 4.3146 | 3.1572 | 5.2562 |
7000 | 4.1228 | 3.0084 | 5.2197 |
8000 | 3.9695 | 2.8727 | 5.2140 |
9000 | 3.8342 | 2.7651 | 5.1834 |
10000 | 3.7319 | 2.6661 | 5.1977 |
11000 | 3.6485 | 2.5864 | 5.1536 |
12000 | 3.5541 | 2.5080 | 5.1990 |
13000 | 3.4959 | 2.4464 | 5.1775 |
14000 | 3.4315 | 2.3931 | 5.1747 |
15000 | 3.3663 | 2.3401 | 5.1625 |
16000 | 3.3204 | 2.3034 | 5.1481 |
17000 | 3.2417 | 2.2593 | 5.1663 |
18000 | 3.2186 | 2.2283 | 5.1351 |
19000 | 3.1822 | 2.1946 | 5.1573 |
20000 | 3.1449 | 2.1690 | 5.1649 |
21000 | 3.1067 | 2.1402 | 5.1624 |
22000 | 3.0844 | 2.1258 | 5.1479 |
23000 | 3.0574 | 2.1066 | 5.1518 |
24000 | 3.0357 | 2.0887 | 5.1446 |
25000 | 3.0136 | 2.0746 | 5.1559 |
26000 | 2.9957 | 2.0609 | 5.1658 |
27000 | 2.9865 | 2.0510 | 5.1791 |
28000 | 2.9765 | 2.0456 | 5.1574 |
29000 | 2.9675 | 2.0386 | 5.1620 |
30000 | 2.9678 | 2.0344 | 5.1601 |
31000 | 2.9652 | 2.0320 | 5.1538 |
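Loss values like those in the table can be easier to compare when converted to perplexity. The sketch below assumes the reported losses are mean per-token cross-entropy in nats, which the model card does not state explicitly; the function name `perplexity` is chosen for illustration, and the three rows are copied from the table above.

```python
import math


def perplexity(cross_entropy_loss: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(cross_entropy_loss)


# (step, validation loss) pairs taken from the table above
rows = [(1000, 5.6154), (16000, 2.3034), (31000, 2.0320)]
for step, val_loss in rows:
    print(f"step {step}: validation perplexity {perplexity(val_loss):.2f}")
```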
Framework versions
- Transformers 4.45.1
- Pytorch 2.4.0
- Datasets 3.0.1
- Tokenizers 0.20.0
Citation
If you find this model useful, please cite the work.
@misc {yubraj_sigdel_2024,
author = { {Yubraj Sigdel} },
title = { romaneng2nep_v3 (Revision dca017e) },
year = 2024,
url = { https://huggingface.co/syubraj/romaneng2nep_v3 },
doi = { 10.57967/hf/3252 },
publisher = { Hugging Face }
}