MMS speech recognition for Ugandan languages

This is a fine-tuned version of facebook/mms-1b-all for Ugandan languages, trained with the SALT dataset. The languages supported are:

Code  Language
lug   Luganda
ach   Acholi
lgg   Lugbara
teo   Ateso
nyn   Runyankole
eng   English (Ugandan)

For each language there are two adapters: one optimised for speech that is entirely in that language, and another for speech in which code-switching with English is expected.
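
The adapter names listed in the usage example appear to follow a simple pattern: the bare language code for monolingual speech, and the code followed by '+eng' for code-switched speech. A minimal sketch of a helper that builds the name is shown here. The function adapter_name and the LANGUAGES tuple are hypothetical and not part of the model or of transformers; 'eng' is left out because no English adapter appears in that list.

```python
# Language codes for which the usage example lists adapters.
# 'eng' is omitted because no 'eng' adapter appears in that list.
LANGUAGES = ('lug', 'ach', 'lgg', 'nyn', 'teo')


def adapter_name(language: str, code_switching: bool = False) -> str:
    """Return the adapter name for a language code (hypothetical helper).

    With code_switching=True, the '+eng' variant is returned.
    """
    if language not in LANGUAGES:
        raise ValueError(
            f'Unknown language {language!r}; expected one of {LANGUAGES}')
    return f'{language}+eng' if code_switching else language


print(adapter_name('lug'))                       # lug
print(adapter_name('ach', code_switching=True))  # ach+eng
```

The returned string can be passed both to model.load_adapter and to processor.tokenizer.set_target_lang, so that the model and tokenizer stay in sync.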

Usage

Usage is the same as for the base model, but with a different set of adapters available.

import torch
import transformers
import datasets

# Available adapters:
# ['lug', 'lug+eng', 'ach', 'ach+eng', 'lgg', 'lgg+eng',
#  'nyn', 'nyn+eng', 'teo', 'teo+eng']
language = 'lug'

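# Load the fine-tuned checkpoint, then swap in the adapter weights
# for the chosen language
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = transformers.Wav2Vec2ForCTC.from_pretrained(
    'Sunbird/asr-mms-salt').to(device)
model.load_adapter(language)

# The tokenizer vocabulary must match the adapter that was loaded
processor = transformers.Wav2Vec2Processor.from_pretrained(
    'Sunbird/asr-mms-salt')
processor.tokenizer.set_target_lang(language)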

# Get some test audio
ds = datasets.load_dataset('Sunbird/salt', 'multispeaker-lug', split='test')
audio = ds[0]['audio']
sample_rate = ds[0]['sample_rate']

# Apply the model
inputs = processor(audio, sampling_rate=sample_rate, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs.to(device)).logits

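# Greedy decoding: take the most likely token for each audio frame,
# then let the processor turn the token IDs into text
ids = torch.argmax(outputs, dim=-1)[0]
transcription = processor.decode(ids)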

print(transcription)
# ekikola ky'akasooli kyakyenvu wabula langi yakyo etera okuba eyaakitaka wansi

The output of this model is unpunctuated and lower case. For applications requiring formatted text, an alternative model is Sunbird/asr-whisper-large-v2-salt.
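
To illustrate the decoding step in the usage example: the model emits one token ID per audio frame, and CTC decoding collapses consecutive repeats and drops the blank token before joining characters. The following is a minimal pure-Python sketch of that idea, not the processor's actual implementation. The toy vocabulary, the blank ID of 0 and the function name ctc_greedy_collapse are assumptions made for illustration.

```python
from itertools import groupby


def ctc_greedy_collapse(ids, blank_id=0):
    """Collapse consecutive repeated IDs, then drop the blank token."""
    collapsed = [token for token, _ in groupby(ids)]
    return [token for token in collapsed if token != blank_id]


# Hypothetical toy vocabulary, for illustration only.
# '|' stands for the word delimiter and '<pad>' for the CTC blank.
vocab = {0: '<pad>', 1: '|', 2: 'a', 3: 'b'}

# One predicted ID per frame, as torch.argmax would produce
frame_ids = [2, 2, 0, 2, 3, 3, 1, 1, 0, 3, 2]

tokens = ctc_greedy_collapse(frame_ids)
text = ''.join(vocab[t] for t in tokens).replace('|', ' ')
print(text)  # aab ba
```

Note that the blank between the first two runs of 'a' is what allows a doubled letter to survive the collapse.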
