---
license: mit
library_name: transformers
---
# Swahili-English Translation Model

## Model Details
- Pre-trained Model: Rogendo/sw-en
- Architecture: Transformer
- Training Data: 210,000 Swahili-English parallel sentence pairs
- Base Model: Helsinki-NLP/opus-mt-en-swc
- Training Method: Fine-tuned with an emphasis on bidirectional translation between Swahili and English.
## Model Description

This model translates between Swahili, one of Africa's most widely spoken languages, and English. It was trained on a diverse set of parallel corpora sourced from OPUS, using the Transformer architecture.
- Developed by: Peter Rogendo, Frederick Kioko
- Model Type: Transformer
- Languages: Swahili, English
- License: Distributed under the MIT License
## Training Data
The model was fine-tuned on the following datasets:
WikiMatrix:
- Package: WikiMatrix.en-sw in Moses format
- License: CC-BY-SA 4.0
- Citation: Holger Schwenk et al., WikiMatrix: Mining 135M Parallel Sentences in 1620 Language Pairs from Wikipedia, arXiv, July 2019.
ParaCrawl:
TICO-19:
- Package: tico-19.en-sw in Moses format
- License: CC0
- Citation: J. Tiedemann, 2012, Parallel Data, Tools, and Interfaces in OPUS.
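The Moses format referenced above is plain text: each corpus package ships as two line-aligned files, one per language, where line *i* in each file forms a translation pair. The sketch below shows one way to pair them up; the file names are assumptions for illustration, taken from the usual naming of the extracted package.

```python
# Read a Moses-format parallel corpus: two plain-text files,
# line-aligned so that line i in each file is a translation pair.
# The file names below are hypothetical placeholders.
from itertools import islice

def read_moses_pairs(en_path, sw_path, limit=None):
    with open(en_path, encoding="utf-8") as en_f, open(sw_path, encoding="utf-8") as sw_f:
        for en_line, sw_line in islice(zip(en_f, sw_f), limit):
            yield en_line.strip(), sw_line.strip()

# Preview the first few English-Swahili pairs
for en, sw in read_moses_pairs("WikiMatrix.en-sw.en", "WikiMatrix.en-sw.sw", limit=3):
    print(f"{en}  |  {sw}")
```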
## Usage

### Using a Pipeline as a High-Level Helper
```python
from transformers import pipeline

# Initialize the translation pipeline
translator = pipeline("translation", model="Bildad/Swahili-English_Translation")

# Translate text
translation = translator("Habari yako?")[0]
translated_text = translation["translation_text"]
print(translated_text)
```
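### Loading the Model Directly

For finer control over generation (for example, beam search or maximum output length), the checkpoint can also be loaded with the Auto classes. This is a minimal sketch assuming the same model id as above; the generation parameters shown are illustrative, not values recommended by the authors.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the tokenizer and seq2seq model from the Hub
model_id = "Bildad/Swahili-English_Translation"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Tokenize the input, generate a translation, and decode it
inputs = tokenizer("Habari yako?", return_tensors="pt")
outputs = model.generate(**inputs, max_length=128, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```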