language:

  • om
  • am
  • rw
  • rn
  • ha
  • ig
  • pcm
  • so
  • sw
  • ti
  • yo
  • multilingual

tags:

  • T5

afriteva_large

Model description

AfriTeVa large is a sequence-to-sequence model pretrained on 10 African languages.

Languages

Afaan Oromoo (orm), Amharic (amh), Gahuza (gah), Hausa (hau), Igbo (igb), Nigerian Pidgin (pcm), Somali (som), Swahili (swa), Tigrinya (tig), Yoruba (yor)

More information on the model and dataset:

The model

  • 745M-parameter encoder-decoder architecture (T5-like)
  • 12 layers, 12 attention heads, and a 512-token sequence length

The dataset

  • Multilingual: the 10 African languages listed above
  • 143 million tokens (1 GB of text data)
  • Tokenizer vocabulary size: 70,000 tokens

Intended uses & limitations

afriteva_large is a pre-trained model, primarily intended to be fine-tuned on multilingual sequence-to-sequence tasks.

>>> from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

>>> tokenizer = AutoTokenizer.from_pretrained("castorini/afriteva_large")
>>> model = AutoModelForSeq2SeqLM.from_pretrained("castorini/afriteva_large")

>>> src_text = "Ó hùn ọ́ láti di ara wa bí?"
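>>> tgt_text = "Would you like to be?"

>>> # Tokenize the source text as model inputs
>>> model_inputs = tokenizer(src_text, return_tensors="pt")
>>> # Tokenize the target text as labels; text_target replaces the deprecated as_target_tokenizer()
>>> labels = tokenizer(text_target=tgt_text, return_tensors="pt").input_ids

>>> model(**model_inputs, labels=labels)  # forward pass; the output includes the training loss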

Training Procedure

For information on training procedures, please refer to the AfriTeVa paper or repository.

BibTeX entry and citation info

coming soon ...
