---
license: apache-2.0
base_model: openai/whisper-large-v3
tags:
- generated_from_trainer
metrics:
- wer
model-index:
- name: whisper-large-v3-myanmar
  results: []
datasets:
- chuuhtetnaing/myanmar-speech-dataset-openslr-80
language:
- my
pipeline_tag: automatic-speech-recognition
library_name: transformers
---


# whisper-large-v3-myanmar

This model is a fine-tuned version of [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) on the [chuuhtetnaing/myanmar-speech-dataset-openslr-80](https://huggingface.co/datasets/chuuhtetnaing/myanmar-speech-dataset-openslr-80) dataset.
After 30 epochs of training, it achieves the following results on the evaluation set:
- Loss: 0.1752
- WER: 54.8976

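The card does not state how the WER figure was computed. As a minimal sketch, assuming the usual definition (word-level edit distance divided by the number of reference words, reported as a percentage) and assuming words are separated by whitespace as in the sample transcript in this card, the metric can be reproduced in plain Python. The function name `word_error_rate` is illustrative and is not part of any library.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word error rate in percent (word-level Levenshtein distance)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # Reference taken from the sample transcript in this card; the hypothesis
    # is a made-up example that drops the final word.
    reference = "ကျမ ပြည်ပ မှာ ပညာသင် တော့ စာမေးပွဲ ကို တပတ်တခါ စစ်တယ်"
    hypothesis = " ".join(reference.split()[:-1])
    print(f"WER: {word_error_rate(reference, hypothesis):.2f}")
```

Because Burmese script does not require spaces between words, the score depends heavily on how the reference text is segmented, so WER values are only comparable when the same segmentation is used.
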
## Usage

```python
from datasets import Audio, load_dataset
from transformers import pipeline

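# Load the dataset and resample its audio to 16 kHz, the rate Whisper expects
dataset = load_dataset("chuuhtetnaing/myanmar-speech-dataset-openslr-80")
dataset = dataset.cast_column("audio", Audio(sampling_rate=16000))
test_dataset = dataset['test']
# Each audio entry is a dict holding the waveform array and its sampling rate
input_speech = test_dataset[42]['audio']

# Build a speech-recognition pipeline from the fine-tuned checkpoint
pipe = pipeline("automatic-speech-recognition", model='chuuhtetnaing/whisper-large-v3-myanmar')

# Set the language and task explicitly instead of relying on auto-detection
output = pipe(input_speech, generate_kwargs={"language": "myanmar", "task": "transcribe"})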
print(output['text']) # ကျမ ပြည်ပ မှာ ပညာသင် တော့ စာမေးပွဲ ကို တပတ်တခါ စစ်တယ်
```

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 20
- eval_batch_size: 20
- seed: 42
- gradient_accumulation_steps: 3
- total_train_batch_size: 60
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 200
- num_epochs: 30
- mixed_precision_training: Native AMP

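To make the schedule and batch-size figures above concrete, the sketch below recomputes the effective batch size and the learning rate at a given optimizer step under a linear schedule with warmup. It assumes a single training device and takes the total number of optimizer steps to be 1140, the last step listed in the training results table; both are assumptions, and the helper `linear_lr` is illustrative rather than a library function.

```python
PEAK_LR = 3e-4       # learning_rate
WARMUP_STEPS = 200   # lr_scheduler_warmup_steps
TOTAL_STEPS = 1140   # assumption: last step in the training results table

PER_DEVICE_BATCH = 20  # train_batch_size
GRAD_ACCUM_STEPS = 3   # gradient_accumulation_steps
NUM_DEVICES = 1        # assumption: single-device training


def linear_lr(step: int,
              peak_lr: float = PEAK_LR,
              warmup_steps: int = WARMUP_STEPS,
              total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Samples contributing to each optimizer update.
effective_batch_size = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_DEVICES

if __name__ == "__main__":
    print("effective batch size:", effective_batch_size)
    for step in (0, 100, 200, 670, 1140):
        print(f"step {step:4d}: lr = {linear_lr(step):.6f}")
```

Under these assumptions the learning rate reaches its peak at step 200 and falls to zero at the final step, and the effective batch size matches the listed `total_train_batch_size` of 60.
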
### Training results

| Training Loss | Epoch | Step | Validation Loss | WER (%) |
|:-------------:|:-----:|:----:|:---------------:|:-------:|
| 0.9771        | 1.0   | 42   | 0.7598          | 100.0   |
| 0.3477        | 2.0   | 84   | 0.2140          | 89.8931 |
| 0.2244        | 3.0   | 126  | 0.1816          | 79.0294 |
| 0.1287        | 4.0   | 168  | 0.1510          | 71.9947 |
| 0.1029        | 5.0   | 210  | 0.1575          | 77.8718 |
| 0.0797        | 6.0   | 252  | 0.1315          | 70.5254 |
| 0.0511        | 7.0   | 294  | 0.1143          | 70.5699 |
| 0.03          | 8.0   | 336  | 0.1154          | 68.1656 |
| 0.0211        | 9.0   | 378  | 0.1289          | 69.1897 |
| 0.0151        | 10.0  | 420  | 0.1318          | 66.7854 |
| 0.0113        | 11.0  | 462  | 0.1478          | 69.1451 |
| 0.0079        | 12.0  | 504  | 0.1484          | 66.2066 |
| 0.0053        | 13.0  | 546  | 0.1389          | 65.0935 |
| 0.0031        | 14.0  | 588  | 0.1479          | 64.3811 |
| 0.0014        | 15.0  | 630  | 0.1611          | 64.8264 |
| 0.001         | 16.0  | 672  | 0.1627          | 63.3571 |
| 0.0012        | 17.0  | 714  | 0.1546          | 65.0045 |
| 0.0006        | 18.0  | 756  | 0.1566          | 64.5147 |
| 0.0006        | 20.0  | 760  | 0.1581          | 64.6928 |
| 0.0002        | 21.0  | 798  | 0.1621          | 63.9804 |
| 0.0003        | 22.0  | 836  | 0.1664          | 60.8638 |
| 0.0002        | 23.0  | 874  | 0.1663          | 58.5040 |
| 0.0           | 24.0  | 912  | 0.1699          | 55.8326 |
| 0.0           | 25.0  | 950  | 0.1715          | 55.0312 |
| 0.0           | 26.0  | 988  | 0.1730          | 54.9866 |
| 0.0           | 27.0  | 1026 | 0.1740          | 54.8976 |
| 0.0           | 28.0  | 1064 | 0.1747          | 54.8976 |
| 0.0           | 29.0  | 1102 | 0.1751          | 54.8976 |
| 0.0           | 30.0  | 1140 | 0.1752          | 54.8976 |

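In the table above, validation loss is lowest at epoch 7 while WER keeps falling until epoch 27, so the two criteria point to different checkpoints. The sketch below illustrates this on a subset of rows copied from the table. The card does not say which criterion, if any, was used to select the published weights; the reported evaluation result matches the epoch 30 row.

```python
# (epoch, validation_loss, wer) for selected rows copied from the table above.
rows = [
    (1.0, 0.7598, 100.0),
    (4.0, 0.1510, 71.9947),
    (7.0, 0.1143, 70.5699),
    (8.0, 0.1154, 68.1656),
    (16.0, 0.1627, 63.3571),
    (24.0, 0.1699, 55.8326),
    (27.0, 0.1740, 54.8976),
    (30.0, 0.1752, 54.8976),
]

# min() returns the first row that attains the minimum, so ties go to the
# earliest epoch.
best_by_loss = min(rows, key=lambda row: row[1])
best_by_wer = min(rows, key=lambda row: row[2])

if __name__ == "__main__":
    print("lowest validation loss at epoch", best_by_loss[0])
    print("lowest WER at epoch", best_by_wer[0])
```

When training with the Transformers `Trainer`, selecting by WER can be requested with `metric_for_best_model="wer"`, `greater_is_better=False`, and `load_best_model_at_end=True` in the training arguments.
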

### Framework versions

- Transformers 4.35.2
- PyTorch 2.1.1+cu121
- Datasets 2.14.5
- Tokenizers 0.15.1