---
license: mit
library_name: transformers
---

# Swahili-English Translation Model

## Model Details

- **Pre-trained Model:** Rogendo/sw-en
- **Architecture:** Transformer
- **Training Data:** 210,000 Swahili-English sentence pairs
- **Base Model:** Helsinki-NLP/opus-mt-en-swc
- **Training Method:** Fine-tuned with an emphasis on bidirectional translation between Swahili and English.

### Model Description

This model translates between English and Swahili, one of the most widely spoken languages in Africa. It uses the Transformer architecture and was fine-tuned on parallel data from several OPUS corpora.

- **Developed by:** Peter Rogendo, Frederick Kioko
- **Model Type:** Transformer
- **Languages:** Swahili, English
- **License:** MIT

### Training Data

The model was fine-tuned on the following datasets:

- **WikiMatrix:**
  - **Package:** WikiMatrix.en-sw in Moses format
  - **License:** [CC-BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/legalcode)
  - **Citation:** Holger Schwenk et al., WikiMatrix: Mining 135M Parallel Sentences in 1620 Language Pairs from Wikipedia, arXiv, July 2019.
- **ParaCrawl:**
  - **Package:** ParaCrawl.en-sw in Moses format
  - **License:** [CC0](http://paracrawl.eu/download.html)
  - **Acknowledgement:** Please acknowledge the ParaCrawl project at [ParaCrawl](http://paracrawl.eu).
- **TICO-19:**
  - **Package:** tico-19.en-sw in Moses format
  - **License:** [CC0](https://tico-19.github.io/LICENSE.md)
  - **Citation:** J. Tiedemann, 2012, Parallel Data, Tools, and Interfaces in OPUS.

## Usage

### Using a Pipeline as a High-Level Helper

```python
from transformers import pipeline

# Initialize the translation pipeline (downloads the model on first use)
translator = pipeline("translation", model="Bildad/Swahili-English_Translation")

# Translate text; the pipeline returns a list with one dict per input
translation = translator("Habari yako?")[0]
translated_text = translation["translation_text"]

print(translated_text)