---
license: mit
library_name: transformers
---

# Swahili-English Translation Model

## Model Details

- **Pre-trained Model**: Rogendo/sw-en
- **Architecture**: Transformer
- **Training Data**: 210,000 Swahili-English parallel sentence pairs
- **Base Model**: Helsinki-NLP/opus-mt-en-swc
- **Training Method**: Fine-tuned with an emphasis on bidirectional translation between Swahili and English.

### Model Description

This model translates between Swahili, one of Africa's most widely spoken languages, and English. It uses the Transformer architecture and was fine-tuned on parallel data from several OPUS corpora.

- **Developed by:** Peter Rogendo, Frederick Kioko
- **Model Type:** Transformer
- **Languages:** Swahili, English
- **License:** Distributed under the MIT License

### Training Data

The model was fine-tuned on the following datasets:

- **WikiMatrix:**
  - **Package**: WikiMatrix.en-sw in Moses format
  - **License**: [CC-BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/legalcode)
  - **Citation**: Holger Schwenk et al., WikiMatrix: Mining 135M Parallel Sentences in 1620 Language Pairs from Wikipedia, arXiv, July 2019.

- **ParaCrawl:**
  - **Package**: ParaCrawl.en-sw in Moses format
  - **License**: [CC0](http://paracrawl.eu/download.html)
  - **Acknowledgement**: Please acknowledge the ParaCrawl project at [ParaCrawl](http://paracrawl.eu).

- **TICO-19:**
  - **Package**: tico-19.en-sw in Moses format
  - **License**: [CC0](https://tico-19.github.io/LICENSE.md)
  - **Citation**: J. Tiedemann, 2012, Parallel Data, Tools, and Interfaces in OPUS.
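
The packages listed above are in Moses format, which stores a parallel corpus as two plain-text files with one segment per line, aligned by line number. The sketch below shows one way such a pair of files could be loaded into (Swahili, English) tuples. The `read_moses_pairs` helper, the file names, and the sample sentences are illustrative assumptions, not the preprocessing actually used to train this model.

```python
import tempfile
from pathlib import Path


def read_moses_pairs(src_path, tgt_path):
    """Read two line-aligned text files into a list of (source, target) pairs.

    Hypothetical helper for illustration only.
    """
    # newline="\n" splits lines on "\n" only, so other separator characters
    # inside a segment cannot break the line alignment between the two files.
    with open(src_path, encoding="utf-8", newline="\n") as f_src:
        src_lines = f_src.readlines()
    with open(tgt_path, encoding="utf-8", newline="\n") as f_tgt:
        tgt_lines = f_tgt.readlines()

    if len(src_lines) != len(tgt_lines):
        raise ValueError(
            f"Files are not line-aligned: {len(src_lines)} vs {len(tgt_lines)} lines"
        )

    pairs = []
    for src, tgt in zip(src_lines, tgt_lines):
        src, tgt = src.strip(), tgt.strip()
        if src and tgt:  # drop pairs where either side is empty
            pairs.append((src, tgt))
    return pairs


if __name__ == "__main__":
    # Write a tiny made-up corpus to a temporary directory and read it back.
    with tempfile.TemporaryDirectory() as tmp:
        sw_file = Path(tmp) / "sample.en-sw.sw"  # assumed file name
        en_file = Path(tmp) / "sample.en-sw.en"  # assumed file name
        sw_file.write_text("Habari yako?\nAsante\n", encoding="utf-8")
        en_file.write_text("How are you?\nThank you\n", encoding="utf-8")
        for sw, en in read_moses_pairs(sw_file, en_file):
            print(f"{sw}\t{en}")
```

Checking the line counts before pairing catches truncated or mismatched downloads early, since a single missing line would otherwise shift every subsequent pair.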

## Usage

### Using a Pipeline as a High-Level Helper

```python
from transformers import pipeline

# Initialize the translation pipeline
translator = pipeline("translation", model="Bildad/Swahili-English_Translation")

# Translate text
translation = translator("Habari yako?")[0]
translated_text = translation["translation_text"]

print(translated_text)