Swahili-English Translation Model

Model Details

  • Pre-trained Model: Rogendo/sw-en
  • Architecture: Transformer
  • Training Data: Trained on 210,000 Swahili-English parallel sentence pairs
  • Base Model: Helsinki-NLP/opus-mt-en-swc
  • Training Method: Fine-tuned with an emphasis on bidirectional translation between Swahili and English.

Model Description

This model translates between Swahili, one of Africa's most widely spoken languages, and English. It is a Transformer model fine-tuned on parallel corpora sourced from OPUS.

  • Developed by: Peter Rogendo, Frederick Kioko
  • Model Type: Transformer
  • Languages: Swahili, English
  • License: Distributed under the MIT License

Training Data

The model was fine-tuned on the following datasets:

  • WikiMatrix:
    • Package: WikiMatrix.en-sw in Moses format
    • License: CC-BY-SA 4.0
    • Citation: Holger Schwenk et al., WikiMatrix: Mining 135M Parallel Sentences in 1620 Language Pairs from Wikipedia, arXiv, July 2019.
  • ParaCrawl:
    • Package: ParaCrawl.en-sw in Moses format
    • License: CC0
    • Acknowledgement: Please acknowledge the ParaCrawl project at ParaCrawl.
  • TICO-19:
    • Package: tico-19.en-sw in Moses format
    • License: CC0
    • Citation: J. Tiedemann, 2012, Parallel Data, Tools, and Interfaces in OPUS.
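
The packages above are distributed in Moses format, which is typically a pair of plain-text files with one sentence per line, aligned by line number. The following is a minimal sketch of how such files could be paired up before fine-tuning. The file names (`WikiMatrix.en-sw.sw`, `WikiMatrix.en-sw.en`) and the helper names are assumptions for illustration, not part of this model's published training code.

```python
from itertools import islice
from pathlib import Path
from typing import Iterable, Iterator, Tuple


def pair_sentences(
    src_lines: Iterable[str], tgt_lines: Iterable[str]
) -> Iterator[Tuple[str, str]]:
    """Yield (source, target) pairs from two line-aligned sequences.

    Pairs where either side is empty after stripping are skipped.
    zip() stops at the shorter input, so trailing unmatched lines are dropped.
    """
    for src, tgt in zip(src_lines, tgt_lines):
        src, tgt = src.strip(), tgt.strip()
        if src and tgt:
            yield src, tgt


def read_moses_pairs(src_path: str, tgt_path: str) -> Iterator[Tuple[str, str]]:
    """Stream sentence pairs from two Moses-format text files."""
    with open(src_path, encoding="utf-8") as src_file, \
            open(tgt_path, encoding="utf-8") as tgt_file:
        yield from pair_sentences(src_file, tgt_file)


if __name__ == "__main__":
    # Hypothetical local paths; adjust to wherever the OPUS files were unpacked.
    sw_path, en_path = "WikiMatrix.en-sw.sw", "WikiMatrix.en-sw.en"
    if Path(sw_path).exists() and Path(en_path).exists():
        # Print the first three aligned pairs as a sanity check.
        for sw, en in islice(read_moses_pairs(sw_path, en_path), 3):
            print(sw, "->", en)
```

Streaming the files line by line keeps memory use flat, which matters little at 210,000 pairs but makes the same helper usable on larger OPUS corpora.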

Usage

Using a Pipeline as a High-Level Helper

from transformers import pipeline

# Initialize the translation pipeline
translator = pipeline("translation", model="Bildad/Swahili-English_Translation")

# Translate text
translation = translator("Habari yako?")[0]
translated_text = translation["translation_text"]

print(translated_text)
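
Loading the Model Directly

For translating many sentences at once, it can be convenient to load the tokenizer and model directly and run them in batches. The sketch below assumes the checkpoint loads with `AutoTokenizer` and `AutoModelForSeq2SeqLM`, as sequence-to-sequence translation checkpoints on the Hub usually do. The `batched` and `translate` helpers, the batch size, and the `max_new_tokens` value are illustrative choices, not settings documented for this model.

```python
from typing import Iterator, List, Sequence

# Model id taken from the pipeline example above.
MODEL_ID = "Bildad/Swahili-English_Translation"


def batched(texts: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Split a sequence of texts into consecutive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield list(texts[start:start + batch_size])


def translate(
    texts: Sequence[str], batch_size: int = 8, max_new_tokens: int = 128
) -> List[str]:
    """Translate texts in batches and return the decoded outputs in input order."""
    # Imported here so the batching helper stays usable without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    outputs: List[str] = []
    for batch in batched(texts, batch_size):
        # Pad to the longest sentence in the batch; truncate overly long inputs.
        inputs = tokenizer(batch, return_tensors="pt", padding=True, truncation=True)
        generated = model.generate(**inputs, max_new_tokens=max_new_tokens)
        outputs.extend(tokenizer.batch_decode(generated, skip_special_tokens=True))
    return outputs
```

Calling `translate(["Habari yako?", "Asante sana."])` downloads the checkpoint on first use and returns one translated string per input sentence.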

Model size: 74.4M params (Safetensors, tensor type F32)