---
license: apache-2.0
base_model: openai/whisper-small
tags:
- generated_from_trainer
metrics:
- wer
model-index:
- name: whisper-small-myanmar
  results: []
datasets:
- chuuhtetnaing/myanmar-speech-dataset-openslr-80
language:
- my
pipeline_tag: automatic-speech-recognition
library_name: transformers
---

# whisper-small-myanmar

This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) on the [chuuhtetnaing/myanmar-speech-dataset-openslr-80](https://huggingface.co/datasets/chuuhtetnaing/myanmar-speech-dataset-openslr-80) dataset.
At the end of training (epoch 30, step 1080), it achieves the following results on the evaluation set:
- Loss: 0.1904
- WER: 49.0650
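
For readers unfamiliar with the metric, the sketch below shows one common way word error rate is computed: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. It assumes whitespace-separated words and a result scaled to percent; the card does not state how WER was computed for this model, so treat both as assumptions. The function name `word_error_rate` and the placeholder tokens are illustrative only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: 100 * (S + D + I) / N over whitespace-split words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed reference prefix and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Placeholder tokens (hypothetical data, not taken from the dataset).
reference = "a b c d"
print(word_error_rate(reference, "a b c d"))       # 0.0
print(word_error_rate(reference, "a x c"))         # 50.0
print(word_error_rate(reference, "a b c d " * 3))  # 200.0
```

Because insertions count as errors, a hypothesis much longer than the reference pushes WER above 100, which is one plausible explanation for the values above 100 in the first epochs of the training log.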

## Usage

```python
from datasets import Audio, load_dataset
from transformers import pipeline

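# Load the test split and resample the audio to 16 kHz, the rate Whisper expects
test_dataset = load_dataset("chuuhtetnaing/myanmar-speech-dataset-openslr-80", split="test")
test_dataset = test_dataset.cast_column("audio", Audio(sampling_rate=16000))
input_speech = test_dataset[42]['audio']

# Build a speech-recognition pipeline from the fine-tuned checkpoint
pipe = pipeline("automatic-speech-recognition", model='chuuhtetnaing/whisper-small-myanmar')

# Set the language and task explicitly instead of relying on auto-detection
output = pipe(input_speech, generate_kwargs={"language": "myanmar", "task": "transcribe"})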
print(output['text']) # ကျွန်မ ပြည်ပ မှာ ပညာသင် တော့ စာမေးပွဲ ကို တပတ်တခါ စစ်တယ်
```

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 200
- num_epochs: 30
- mixed_precision_training: Native AMP
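
To make the schedule concrete, here is a minimal sketch of a linear schedule with warmup using the values listed above (peak learning rate 0.0003, 200 warmup steps). The total of 1080 steps is taken from the last row of the training log; the function name `linear_warmup_lr` is hypothetical, and the exact step indexing used by the Trainer is an assumption.

```python
def linear_warmup_lr(step: int,
                     base_lr: float = 3e-4,
                     warmup_steps: int = 200,
                     total_steps: int = 1080) -> float:
    """Learning rate at a given optimizer step for linear warmup then linear decay."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the rate at a few points: start, mid-warmup, peak, mid-decay, end.
for step in (0, 100, 200, 640, 1080):
    print(step, linear_warmup_lr(step))
```

With 36 steps per epoch, the warmup under these assumptions ends partway through epoch 6, so the first five epochs in the log run at a rising learning rate.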

### Training results

| Training Loss | Epoch | Step | Validation Loss | WER      |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 1.2566        | 1.0   | 36   | 0.8893          | 215.0045 |
| 0.8862        | 2.0   | 72   | 0.6243          | 388.6465 |
| 0.3546        | 3.0   | 108  | 0.2046          | 316.8744 |
| 0.1839        | 4.0   | 144  | 0.1695          | 81.3001  |
| 0.1198        | 5.0   | 180  | 0.1385          | 63.8914  |
| 0.0969        | 6.0   | 216  | 0.1583          | 66.0285  |
| 0.084         | 7.0   | 252  | 0.1539          | 70.6589  |
| 0.0628        | 8.0   | 288  | 0.1603          | 61.3090  |
| 0.0565        | 9.0   | 324  | 0.1424          | 60.3295  |
| 0.0355        | 10.0  | 360  | 0.1457          | 58.1478  |
| 0.0299        | 11.0  | 396  | 0.1547          | 57.7916  |
| 0.0183        | 12.0  | 432  | 0.1543          | 54.3633  |
| 0.0131        | 13.0  | 468  | 0.1532          | 54.1407  |
| 0.011         | 14.0  | 504  | 0.1604          | 53.8736  |
| 0.0083        | 15.0  | 540  | 0.1630          | 54.0516  |
| 0.0042        | 16.0  | 576  | 0.1711          | 52.1371  |
| 0.0034        | 17.0  | 612  | 0.1670          | 52.5824  |
| 0.0022        | 18.0  | 648  | 0.1649          | 52.5378  |
| 0.0013        | 19.0  | 684  | 0.1802          | 52.1817  |
| 0.0014        | 20.0  | 720  | 0.1820          | 53.1612  |
| 0.002         | 21.0  | 756  | 0.1792          | 52.7159  |
| 0.0016        | 22.0  | 792  | 0.1796          | 50.7124  |
| 0.0004        | 23.0  | 828  | 0.1803          | 50.4007  |
| 0.0003        | 24.0  | 864  | 0.1804          | 49.4657  |
| 0.0001        | 25.0  | 900  | 0.1819          | 49.2431  |
| 0.0           | 26.0  | 936  | 0.1857          | 49.0205  |
| 0.0           | 27.0  | 972  | 0.1879          | 49.1541  |
| 0.0           | 28.0  | 1008 | 0.1893          | 49.1095  |
| 0.0           | 29.0  | 1044 | 0.1901          | 49.1095  |
| 0.0           | 30.0  | 1080 | 0.1904          | 49.0650  |
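
The log can also be used to compare checkpoints. The sketch below copies a subset of rows from the table and picks the epoch with the lowest WER and the epoch with the lowest validation loss. The variable names are illustrative, and whether intermediate checkpoints were saved or published is not stated in this card.

```python
# (epoch, validation_loss, wer) for a subset of rows copied from the table above.
rows = [
    (5, 0.1385, 63.8914),
    (10, 0.1457, 58.1478),
    (16, 0.1711, 52.1371),
    (25, 0.1819, 49.2431),
    (26, 0.1857, 49.0205),
    (30, 0.1904, 49.0650),
]

# Lower is better for both metrics.
best_by_wer = min(rows, key=lambda row: row[2])
best_by_loss = min(rows, key=lambda row: row[1])

print(f"lowest WER: epoch {best_by_wer[0]} ({best_by_wer[2]})")
print(f"lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[1]})")
```

The two criteria disagree: validation loss is lowest at epoch 5, while WER keeps improving until epoch 26 and ends slightly higher at epoch 30. When retraining, the `load_best_model_at_end` and `metric_for_best_model` options of `TrainingArguments` are one way to keep the checkpoint with the best WER rather than the last one.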


### Framework versions

- Transformers 4.35.2
- Pytorch 2.1.1+cu121
- Datasets 2.14.5
- Tokenizers 0.15.1