---
license: mit
library_name: transformers
---
# Swahili-English Translation Model
## Model Details
- **Pre-trained Model**: Rogendo/sw-en
- **Architecture**: Transformer
- **Training Data**: 210,000 Swahili-English sentence pairs
- **Base Model**: Helsinki-NLP/opus-mt-en-swc
- **Training Method**: Fine-tuned with an emphasis on bidirectional translation between Swahili and English.
### Model Description
This model translates between English and Swahili, one of the most widely spoken languages in Africa. It uses the Transformer architecture and was fine-tuned on parallel corpora from several domains, all sourced from OPUS.
- **Developed by:** Peter Rogendo, Frederick Kioko
- **Model Type:** Transformer
- **Languages:** Swahili, English
- **License:** Distributed under the MIT License
### Training Data
The model was fine-tuned on the following datasets:
- **WikiMatrix:**
  - **Package**: WikiMatrix.en-sw in Moses format
  - **License**: [CC-BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/legalcode)
  - **Citation**: Holger Schwenk et al., WikiMatrix: Mining 135M Parallel Sentences in 1620 Language Pairs from Wikipedia, arXiv, July 2019.
- **ParaCrawl:**
  - **Package**: ParaCrawl.en-sw in Moses format
  - **License**: [CC0](http://paracrawl.eu/download.html)
  - **Acknowledgement**: Please acknowledge the ParaCrawl project at [ParaCrawl](http://paracrawl.eu).
- **TICO-19:**
  - **Package**: tico-19.en-sw in Moses format
  - **License**: [CC0](https://tico-19.github.io/LICENSE.md)
  - **Citation**: J. Tiedemann, 2012, Parallel Data, Tools, and Interfaces in OPUS.
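The packages above are distributed in Moses format: a pair of plain-text files, one per language, in which line *n* of one file is the translation of line *n* of the other. The following is a minimal sketch of reading such a pair into sentence tuples. The function name `read_moses_pairs` and its `limit` parameter are hypothetical, and this is not necessarily how the authors prepared their training data.

```python
from itertools import islice
from pathlib import Path


def read_moses_pairs(en_path, sw_path, limit=None):
    """Return (english, swahili) tuples from two line-aligned text files.

    Hypothetical helper for illustration only. `limit` caps the number of
    aligned lines read (None reads everything); pairs where either side is
    empty are skipped.
    """
    pairs = []
    with Path(en_path).open(encoding="utf-8") as en_file, \
            Path(sw_path).open(encoding="utf-8") as sw_file:
        # zip() walks both files in lockstep, preserving the line alignment.
        for en_line, sw_line in islice(zip(en_file, sw_file), limit):
            en_text, sw_text = en_line.strip(), sw_line.strip()
            if en_text and sw_text:
                pairs.append((en_text, sw_text))
    return pairs
```

With an unpacked OPUS download, a call might look like `read_moses_pairs("WikiMatrix.en-sw.en", "WikiMatrix.en-sw.sw", limit=1000)`. These file names are assumed from the package name and may differ in the actual archive.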
## Usage
### Using a Pipeline as a High-Level Helper
```python
from transformers import pipeline

# Initialize the translation pipeline
translator = pipeline("translation", model="Bildad/Swahili-English_Translation")

# Translate text; the pipeline returns a list with one dict per input
translation = translator("Habari yako?")[0]
translated_text = translation["translation_text"]
print(translated_text)