---
library_name: transformers
tags: []
---

# Model Card for mt5TranslatorLT

This model translates text from English into Lithuanian.
It was trained on the following datasets:
* [ted_talks_iwslt](https://huggingface.co/datasets/IWSLT/ted_talks_iwslt)
* [ayymen/Pontoon-Translations](https://huggingface.co/datasets/ayymen/Pontoon-Translations)

Note: This model is still under development and currently supports only the English-to-Lithuanian direction.
The reverse direction and other languages are planned for the future.


## Model Usage
```python
import torch
from transformers import T5Tokenizer, MT5ForConditionalGeneration

# Run on the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# The tokenizer comes from the google/mt5-small checkpoint; the weights come from this repository.
tokenizer = T5Tokenizer.from_pretrained("google/mt5-small")
model = MT5ForConditionalGeneration.from_pretrained("werent4/mt5TranslatorLT")
model.to(device)

def translate(text, model, tokenizer, device):
    # Prepend the translation task prefix to the input text.
    input_text = f"translate English to Lithuanian: {text}"
    encoded_input = tokenizer(
        input_text,
        return_tensors="pt",
        padding=True,
        truncation=True,  # inputs longer than 128 tokens are cut off
        max_length=128,
    ).to(device)
    # Gradients are not needed for inference.
    with torch.no_grad():
        output_tokens = model.generate(
            **encoded_input,
            max_length=128,
            num_beams=5,
            no_repeat_ngram_size=2,
            early_stopping=True,
        )
    # Decode the best beam and drop special tokens such as padding and end-of-sequence.
    return tokenizer.decode(output_tokens[0], skip_special_tokens=True)

text = "women"
print(translate(text, model, tokenizer, device))
# moteris

text = "How are you?"
print(translate(text, model, tokenizer, device))
# Kaip esate?

text = "I live in Kaunas"
print(translate(text, model, tokenizer, device))
# Aš gyvenu Kaunas
```
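The usage example truncates its input at `max_length=128` tokens, so a longer passage would be cut off before translation. One possible workaround is to translate sentence by sentence. The sketch below is not part of the model's API: `split_sentences` and `translate_long` are hypothetical helpers, and it assumes a naive regex split after `.`, `!` or `?` is good enough for your text.

```python
import re
from typing import Callable, List

# Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
# This naive rule will mis-split abbreviations such as "e.g. this".
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text: str) -> List[str]:
    """Split text into sentences, dropping empty pieces."""
    return [piece for piece in _SENTENCE_END.split(text.strip()) if piece]


def translate_long(text: str, translate_fn: Callable[[str], str]) -> str:
    """Translate each sentence separately and join the results with spaces.

    translate_fn is any callable that maps one source sentence to its translation.
    """
    return " ".join(translate_fn(sentence) for sentence in split_sentences(text))
```

With the `translate` function from the usage example, this could be called as `translate_long(paragraph, lambda s: translate(s, model, tokenizer, device))`. Translating sentences in isolation discards cross-sentence context, so quality may differ from translating a short passage in one call.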



## Model Card Authors

* [werent4](https://huggingface.co/werent4)
* [Mykhailo Shtopko](https://huggingface.co/BioMike)

## Model Card Contact

[More Information Needed]