---
language:
- kk
- tr
- ru
- en
language_details: eng_Latn, kaz_Cyrl, rus_Cyrl, tur_Latn
metrics:
- bleu
- chrf
pipeline_tag: translation
inference: false
datasets:
- facebook/flores
- issai/kazparc
---

# Tilmash

Tilmash is a fine-tuned version of Facebook's NLLB model for machine translation between four languages: Kazakh, Russian, English, and Turkish. Below are the BLEU and chrF scores (in that order) for Tilmash on the FLoRes and KazParC test sets.

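| Pair | FLoRes BLEU | FLoRes chrF | KazParC BLEU | KazParC chrF |
|:-----|:-----------:|:-----------:|:------------:|:------------:|
| EN↔KK | 0.20 | 0.60 | 0.21 | 0.60 |
| EN↔RU | 0.28 | 0.60 | 0.38 | 0.68 |
| EN↔TR | 0.27 | 0.65 | 0.25 | 0.64 |
| KK↔EN | 0.32 | 0.63 | 0.32 | 0.62 |
| KK↔RU | 0.18 | 0.52 | 0.29 | 0.63 |
| KK↔TR | 0.14 | 0.54 | 0.16 | 0.55 |
| RU↔EN | 0.32 | 0.63 | 0.42 | 0.70 |
| RU↔KK | 0.13 | 0.54 | 0.22 | 0.62 |
| RU↔TR | 0.14 | 0.54 | 0.18 | 0.57 |
| TR↔EN | 0.36 | 0.66 | 0.38 | 0.66 |
| TR↔KK | 0.13 | 0.54 | 0.16 | 0.55 |
| TR↔RU | 0.19 | 0.53 | 0.24 | 0.57 |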
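
The scores above are on a 0 to 1 scale. As a rough illustration of how comparable numbers could be computed, here is a minimal sketch that assumes the `sacrebleu` package with default settings; the model card does not state which tool or configuration produced the reported results, so treat this as an approximation rather than the authors' evaluation script. `sacrebleu` reports scores on a 0 to 100 scale, so the sketch rescales them. The function names `to_unit_scale` and `score_translations` are hypothetical.

```python
def to_unit_scale(score_0_100: float) -> float:
    """Convert a 0-100 score to the 0-1 scale used in the table (2 decimals)."""
    return round(score_0_100 / 100.0, 2)


def score_translations(hypotheses, references):
    """Return (BLEU, chrF) on a 0-1 scale for two parallel lists of strings.

    Assumption: sacrebleu with default settings. The model card does not
    say which tool or configuration produced the reported numbers.
    """
    # Imported lazily so the rescaling helper works without sacrebleu installed.
    from sacrebleu.metrics import BLEU, CHRF

    # corpus_score expects a list of reference streams, hence [references].
    bleu = BLEU().corpus_score(hypotheses, [references]).score
    chrf = CHRF().corpus_score(hypotheses, [references]).score
    return to_unit_scale(bleu), to_unit_scale(chrf)
```

Calling `score_translations(model_outputs, gold_translations)` on a test set for one language pair would yield one BLEU and chrF pair in the same format as a table row.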

## Model Sources

- **Repository:** https://github.com/IS2AI/KazParC
- **Paper:** KazParC: Kazakh Parallel Corpus for Machine Translation
- **Demo:** Tilmash Demo

## How to Get Started with the Model

You can use this model with the Transformers pipeline for translation.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, TranslationPipeline

model = AutoModelForSeq2SeqLM.from_pretrained("issai/tilmash")
tokenizer = AutoTokenizer.from_pretrained("issai/tilmash")

# for src_lang and tgt_lang choose from kaz_Cyrl (Kazakh), rus_Cyrl (Russian), eng_Latn (English), tur_Latn (Turkish)
tilmash = TranslationPipeline(
    model=model,
    tokenizer=tokenizer,
    src_lang="kaz_Cyrl",
    tgt_lang="eng_Latn",
    max_length=1000,
)

print(tilmash("Қазақстан — Шығыс Еуропа мен Орталық Азияда орналасқан мемлекет."))
# [{'translation_text': 'Kazakhstan is a country located in Eastern Europe and Central Asia.'}]
```
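
To translate in other directions, only `src_lang` and `tgt_lang` need to change. The sketch below wraps that choice in a small helper. The short codes, the `LANG_CODES` mapping, and the functions `resolve_pair` and `build_translator` are hypothetical conveniences, not part of the model or of Transformers; only the four language codes and the pipeline arguments come from the example above.

```python
from itertools import permutations

# Short codes (an assumption for convenience) mapped to the language
# codes that the model card lists for src_lang and tgt_lang.
LANG_CODES = {
    "en": "eng_Latn",
    "kk": "kaz_Cyrl",
    "ru": "rus_Cyrl",
    "tr": "tur_Latn",
}

# All 12 directed language pairs, matching the 12 rows of the results table.
ALL_PAIRS = list(permutations(LANG_CODES, 2))


def resolve_pair(src: str, tgt: str) -> tuple:
    """Validate two short codes and return (src_lang, tgt_lang)."""
    for code in (src, tgt):
        if code not in LANG_CODES:
            raise ValueError(f"Unsupported language {code!r}; choose from {sorted(LANG_CODES)}")
    if src == tgt:
        raise ValueError("Source and target languages must differ")
    return LANG_CODES[src], LANG_CODES[tgt]


def build_translator(src: str, tgt: str, model_name: str = "issai/tilmash", max_length: int = 1000):
    """Build a TranslationPipeline for one direction (downloads the model on first use)."""
    # Imported lazily so resolve_pair can be used without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, TranslationPipeline

    src_lang, tgt_lang = resolve_pair(src, tgt)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    return TranslationPipeline(
        model=model,
        tokenizer=tokenizer,
        src_lang=src_lang,
        tgt_lang=tgt_lang,
        max_length=max_length,
    )
```

For example, `build_translator("ru", "kk")` would return a pipeline for Russian to Kazakh. Each call reloads the model, so when translating in several directions it may be preferable to load the model and tokenizer once and construct the pipelines from them.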