---
base_model: google/mt5-small
datasets:
- syubraj/roman2nepali-transliteration
language:
- ne
- en
library_name: transformers
license: apache-2.0
metrics:
- bleu
tags:
- generated_from_trainer
model-index:
- name: romaneng2nep_v2
  results: []
---

# romaneng2nep_v2

This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on the [syubraj/roman2nepali-transliteration](https://huggingface.co/datasets/syubraj/roman2nepali-transliteration) dataset.
It achieves the following results on the evaluation set:
- Loss: 2.9652
- Gen Len: 5.1538

## Model Usage

```python
!pip install transformers
```

```python
from transformers import AutoTokenizer, MT5ForConditionalGeneration

checkpoint = "syubraj/romaneng2nep_v3"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = MT5ForConditionalGeneration.from_pretrained(checkpoint)

# Set max sequence length
max_seq_len = 20

def translate(text):
    # Tokenize the input text with a max length of 20
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=max_seq_len)

    # Generate translation
    translated = model.generate(**inputs)

    # Decode the translated tokens back to text
    translated_text = tokenizer.decode(translated[0], skip_special_tokens=True)
    return translated_text

# Example usage
source_text = "muskuraudai"  # Example Romanized Nepali text
translated_text = translate(source_text)
print(f"Translated Text: {translated_text}")
```

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 24
- eval_batch_size: 24
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 4

### Training results

| Step  | Training Loss | Validation Loss | Gen Len |
|-------|---------------|-----------------|---------|
| 1000  | 15.0703       | 5.6154          | 2.3840  |
| 2000  | 6.0460        | 4.4449          | 4.6281  |
| 3000  | 5.2580        | 3.9632          | 4.7790  |
| 4000  | 4.8563        | 3.6188          | 5.0053  |
| 5000  | 4.5602        | 3.3491          | 5.3085  |
| 6000  | 4.3146        | 3.1572          | 5.2562  |
| 7000  | 4.1228        | 3.0084          | 5.2197  |
| 8000  | 3.9695        | 2.8727          | 5.2140  |
| 9000  | 3.8342        | 2.7651          | 5.1834  |
| 10000 | 3.7319        | 2.6661          | 5.1977  |
| 11000 | 3.6485        | 2.5864          | 5.1536  |
| 12000 | 3.5541        | 2.5080          | 5.1990  |
| 13000 | 3.4959        | 2.4464          | 5.1775  |
| 14000 | 3.4315        | 2.3931          | 5.1747  |
| 15000 | 3.3663        | 2.3401          | 5.1625  |
| 16000 | 3.3204        | 2.3034          | 5.1481  |
| 17000 | 3.2417        | 2.2593          | 5.1663  |
| 18000 | 3.2186        | 2.2283          | 5.1351  |
| 19000 | 3.1822        | 2.1946          | 5.1573  |
| 20000 | 3.1449        | 2.1690          | 5.1649  |
| 21000 | 3.1067        | 2.1402          | 5.1624  |
| 22000 | 3.0844        | 2.1258          | 5.1479  |
| 23000 | 3.0574        | 2.1066          | 5.1518  |
| 24000 | 3.0357        | 2.0887          | 5.1446  |
| 25000 | 3.0136        | 2.0746          | 5.1559  |
| 26000 | 2.9957        | 2.0609          | 5.1658  |
| 27000 | 2.9865        | 2.0510          | 5.1791  |
| 28000 | 2.9765        | 2.0456          | 5.1574  |
| 29000 | 2.9675        | 2.0386          | 5.1620  |
| 30000 | 2.9678        | 2.0344          | 5.1601  |
| 31000 | 2.9652        | 2.0320          | 5.1538  |

### Framework versions

- Transformers 4.45.1
- Pytorch 2.4.0
- Datasets 3.0.1
- Tokenizers 0.20.0

### Citation

If you find this model useful, please cite the work.

```
@misc {yubraj_sigdel_2024,
  author    = { {Yubraj Sigdel} },
  title     = { romaneng2nep_v3 (Revision dca017e) },
  year      = 2024,
  url       = { https://huggingface.co/syubraj/romaneng2nep_v3 },
  doi       = { 10.57967/hf/3252 },
  publisher = { Hugging Face }
}
```
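
### Training setup sketch

The hyperparameters listed under *Training hyperparameters* correspond roughly to the `Seq2SeqTrainer` setup sketched below. This is a minimal, hedged reconstruction rather than the exact training script used for this model: the dataset column names (`source`/`target`), the split names, and the preprocessing details are assumptions.

```python
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    MT5ForConditionalGeneration,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

base_checkpoint = "google/mt5-small"
tokenizer = AutoTokenizer.from_pretrained(base_checkpoint)
model = MT5ForConditionalGeneration.from_pretrained(base_checkpoint)

dataset = load_dataset("syubraj/roman2nepali-transliteration")
max_seq_len = 20  # matches the inference-time limit used above

def preprocess(batch):
    # Column names are assumptions; adjust to the actual dataset schema.
    model_inputs = tokenizer(batch["source"], max_length=max_seq_len, truncation=True)
    labels = tokenizer(text_target=batch["target"], max_length=max_seq_len, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(preprocess, batched=True)

# Values below mirror the "Training hyperparameters" section of this card.
args = Seq2SeqTrainingArguments(
    output_dir="romaneng2nep_v2",
    learning_rate=2e-5,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    num_train_epochs=4,
    lr_scheduler_type="linear",
    seed=42,
    predict_with_generate=True,  # needed to report Gen Len during evaluation
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],  # split name is an assumption
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```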