Whisper Small Mn - Erkhembayar Gantulga

This model is a fine-tuned version of openai/whisper-small, trained on the Mongolian (mn) subsets of the Common Voice 17.0 and Google FLEURS datasets. It achieves the following results on the evaluation set:

Loss: 0.1561
Wer: 19.4492
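
For readers unfamiliar with the metric, the sketch below shows how a word error rate (WER) can be computed for a single reference/hypothesis pair. It is a minimal illustration, not the evaluation code used for this model. It assumes the reported figure is a percentage, and the function name `word_error_rate` is hypothetical. Evaluation scripts normally aggregate edits over the whole test set rather than averaging per-utterance scores.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the WER in percent for one reference/hypothesis pair.

    WER = (substitutions + deletions + insertions) / words in reference.
    Hypothetical helper for illustration only.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # Word-level Levenshtein distance, keeping only one previous row.
    prev_row = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr_row = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr_row.append(
                min(
                    prev_row[j] + 1,                      # deletion
                    curr_row[j - 1] + 1,                  # insertion
                    prev_row[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev_row = curr_row

    return 100.0 * prev_row[-1] / len(ref_words)


# One substitution and one deletion against a four-word reference.
example_wer = word_error_rate("a b c d", "a x c")
```

Under this definition, a WER of 19.4492 would correspond to roughly one word-level error for every five reference words.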

Training and evaluation data

The training and test sets each combine the Common Voice 17.0 and Google FLEURS datasets for Mongolian. They were built as follows:

from datasets import Audio, DatasetDict, concatenate_datasets, load_dataset

# Common Voice 17.0, Mongolian ("mn"). The dataset is gated, so an access token is needed.
# `token=True` replaces the deprecated `use_auth_token=True`.
common_voice = DatasetDict()

common_voice["train"] = load_dataset("mozilla-foundation/common_voice_17_0", "mn", split="train+validation+validated", token=True)
common_voice["test"] = load_dataset("mozilla-foundation/common_voice_17_0", "mn", split="test", token=True)

# Resample the audio to 16 kHz, the sampling rate Whisper expects.
common_voice = common_voice.cast_column("audio", Audio(sampling_rate=16000))

# Keep only the "audio" and "sentence" columns.
common_voice = common_voice.remove_columns(
    ["accent", "age", "client_id", "down_votes", "gender", "locale", "path", "segment", "up_votes", "variant"]
)

# Google FLEURS, Mongolian ("mn_mn").
google_fleurs = DatasetDict()

google_fleurs["train"] = load_dataset("google/fleurs", "mn_mn", split="train+validation", token=True)
google_fleurs["test"] = load_dataset("google/fleurs", "mn_mn", split="test", token=True)

# Keep only "audio" and "transcription", then rename the text column to match Common Voice.
google_fleurs = google_fleurs.remove_columns(
    ["id", "num_samples", "path", "raw_transcription", "gender", "lang_id", "language", "lang_group_id"]
)
google_fleurs = google_fleurs.rename_column("transcription", "sentence")

# Concatenate the two sources into the final train and test splits.
dataset = DatasetDict()
dataset["train"] = concatenate_datasets([common_voice["train"], google_fleurs["train"]])
dataset["test"] = concatenate_datasets([common_voice["test"], google_fleurs["test"]])

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 16
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 100
training_steps: 4000
mixed_precision_training: Native AMP
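
To make the learning-rate settings above concrete, the sketch below evaluates a linear schedule with warmup at a given step. It assumes the usual behaviour of a linear scheduler: the rate rises linearly from 0 to the peak over the warmup steps, then decays linearly to 0 at the final step. The function name `linear_lr` is hypothetical, and this is an illustration rather than the training code itself.

```python
def linear_lr(
    step: int,
    peak_lr: float = 1e-5,     # learning_rate
    warmup_steps: int = 100,   # lr_scheduler_warmup_steps
    total_steps: int = 4000,   # training_steps
) -> float:
    """Learning rate at `step` for a linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay: fall linearly from peak_lr at the end of warmup to 0 at total_steps.
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Learning rate at each evaluation checkpoint listed in the training results.
checkpoint_lrs = {step: linear_lr(step) for step in range(500, 4001, 500)}
```

With these settings the peak of 1e-05 is reached at step 100, and the rate is back to 0 by step 4000.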

Training results

Training Loss	Epoch	Step	Validation Loss	Wer
0.4118	0.4912	500	0.4810	50.3500
0.283	0.9823	1000	0.3347	38.6233
0.1778	1.4735	1500	0.2738	33.5240
0.1412	1.9646	2000	0.2216	27.8363
0.0676	2.4558	2500	0.1967	24.3823
0.0602	2.9470	3000	0.1711	21.7428
0.0363	3.4381	3500	0.1624	20.4108
0.0332	3.9293	4000	0.1561	19.4492
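
The Epoch and Step columns above imply how many optimizer steps make up one epoch, which in turn gives a rough size for the training set. The sketch below works through that arithmetic, along with the relative WER improvement between the first and last checkpoints. The training-set estimate assumes a single device and no gradient accumulation, neither of which is stated in this card, so treat it as an approximation.

```python
# (step, epoch) pairs taken from the training results table.
checkpoints = [(500, 0.4912), (1000, 0.9823), (4000, 3.9293)]
train_batch_size = 16

# Steps per epoch, averaged over several rows to smooth out rounding in the Epoch column.
steps_per_epoch = round(sum(step / epoch for step, epoch in checkpoints) / len(checkpoints))

# Approximate training examples per epoch.
# Assumption: one device and no gradient accumulation (not stated in the card).
approx_train_examples = steps_per_epoch * train_batch_size

# Relative WER reduction from the first checkpoint (step 500) to the last (step 4000).
first_wer, final_wer = 50.3500, 19.4492
relative_wer_reduction = 100.0 * (first_wer - final_wer) / first_wer

print(f"steps per epoch: {steps_per_epoch}")
print(f"approx. training examples: {approx_train_examples}")
print(f"relative WER reduction: {relative_wer_reduction:.1f}%")
```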

Framework versions

Transformers 4.44.0
Pytorch 2.3.1+cu118
Datasets 2.20.0
Tokenizers 0.19.1
