metadata

library_name: transformers
language:
  - zh
license: mit
base_model: openai/whisper-large-v3-turbo
tags:
  - wft
  - whisper
  - automatic-speech-recognition
  - audio
  - speech
  - generated_from_trainer
datasets:
  - JacobLinCool/common_voice_16_1_zh_TW_clean_preprocessed
metrics:
  - wer
model-index:
  - name: whisper-large-v3-turbo-zh-TW-clean-1
    results:
      - task:
          type: automatic-speech-recognition
          name: Automatic Speech Recognition
        dataset:
          name: JacobLinCool/common_voice_16_1_zh_TW_clean_preprocessed
          type: JacobLinCool/common_voice_16_1_zh_TW_clean_preprocessed
        metrics:
          - type: wer
            value: 40.07234726688103
            name: Wer

whisper-large-v3-turbo-zh-TW-clean-1

This model is a fine-tuned version of openai/whisper-large-v3-turbo on the JacobLinCool/common_voice_16_1_zh_TW_clean_preprocessed dataset. At the final checkpoint (step 3770), it achieves the following results on the evaluation set:

Loss: 0.2641
Wer: 40.0723
Cer: 11.4336
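For readers unfamiliar with the metrics, here is a minimal sketch of how character and word error rates are commonly defined: Levenshtein edit distance divided by reference length, reported as a percentage. The evaluation script behind the numbers above is not shown in this card, so the function names and the whitespace tokenization below are assumptions for illustration, not the author's implementation.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (each edit costs 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref_tokens, hyp_tokens):
    """Edit distance over reference length, as a percentage."""
    if not ref_tokens:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


def cer(reference, hypothesis):
    # Character error rate: compare character by character.
    return error_rate(list(reference), list(hypothesis))


def wer(reference, hypothesis):
    # Word error rate: assumes whitespace-delimited words (an assumption;
    # the tokenization used for this model's evaluation is not documented).
    return error_rate(reference.split(), hypothesis.split())


# One wrong character out of six.
print(round(cer("今天天氣很好", "今天天汽很好"), 2))
# No spaces, so the whole utterance counts as a single word.
print(wer("今天天氣很好", "今天天汽很好"))
```

On text written without spaces, whitespace-based WER treats each utterance as one word, which is one possible reason WER can sit well above CER. Whether that applies to the figures above depends on the evaluation script.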

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0005
train_batch_size: 4
eval_batch_size: 4
seed: 42
gradient_accumulation_steps: 8
total_train_batch_size: 32
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 10
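The relationship between these values can be checked with a short sketch. It assumes a single training device and no warmup steps (neither is listed above), and it takes the total step count of 3770 from the final row of the training results table. The `linear_lr` helper is illustrative and mirrors the usual linear-decay-to-zero rule; it is not code from the training run.

```python
LEARNING_RATE = 5e-4
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 8
NUM_DEVICES = 1     # assumption: not stated in the card
TOTAL_STEPS = 3770  # final step in the training results table
WARMUP_STEPS = 0    # assumption: no warmup is listed


def effective_batch_size(per_device, accum_steps, devices=1):
    """Number of examples contributing to each optimizer update."""
    return per_device * accum_steps * devices


def linear_lr(step, base_lr=LEARNING_RATE, total=TOTAL_STEPS, warmup=WARMUP_STEPS):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


total_batch = effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES)
print(total_batch)  # 32, matching total_train_batch_size

# Learning rate at the start, midpoint, and end of training.
for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate starts at 0.0005, is halved at step 1885, and reaches zero at step 3770.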

Training results

Training Loss	Epoch	Step	Cer	Validation Loss	Wer
No log	0	0	22.9952	2.8297	83.7420
2.0577	0.9987	377	14.2907	0.2666	47.9904
1.9482	2.0	755	14.4991	0.2770	47.9703
1.1107	2.9987	1132	15.0615	0.2886	48.4124
0.7225	4.0	1510	13.4020	0.2736	46.2420
0.5901	4.9987	1887	13.7309	0.2759	45.2572
0.4879	6.0	2265	12.9777	0.2740	44.9759
0.1874	6.9987	2642	12.7316	0.2663	44.2524
0.0544	8.0	3020	12.2295	0.2712	42.6648
0.0128	8.9987	3397	11.6068	0.2669	40.8963
0.004	9.9868	3770	11.4336	0.2641	40.0723
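The table can be summarized programmatically. The sketch below copies the rows above into tuples, picks the checkpoint with the lowest WER, and computes the relative error reduction against the step-0 baseline. The helper names and column constants are illustrative only.

```python
# Columns: training loss, epoch, step, CER, validation loss, WER.
TRAIN_LOSS, EPOCH, STEP, CER, VAL_LOSS, WER = range(6)

ROWS = [
    (None,   0.0,    0,    22.9952, 2.8297, 83.7420),  # "No log" at step 0
    (2.0577, 0.9987, 377,  14.2907, 0.2666, 47.9904),
    (1.9482, 2.0,    755,  14.4991, 0.2770, 47.9703),
    (1.1107, 2.9987, 1132, 15.0615, 0.2886, 48.4124),
    (0.7225, 4.0,    1510, 13.4020, 0.2736, 46.2420),
    (0.5901, 4.9987, 1887, 13.7309, 0.2759, 45.2572),
    (0.4879, 6.0,    2265, 12.9777, 0.2740, 44.9759),
    (0.1874, 6.9987, 2642, 12.7316, 0.2663, 44.2524),
    (0.0544, 8.0,    3020, 12.2295, 0.2712, 42.6648),
    (0.0128, 8.9987, 3397, 11.6068, 0.2669, 40.8963),
    (0.004,  9.9868, 3770, 11.4336, 0.2641, 40.0723),
]


def best_by(rows, column):
    """Row with the smallest value in the given column."""
    return min(rows, key=lambda row: row[column])


def relative_reduction(start, end):
    """Percentage drop from start to end."""
    return 100.0 * (start - end) / start


baseline = ROWS[0]
best_wer = best_by(ROWS, WER)
wer_reduction = relative_reduction(baseline[WER], best_wer[WER])
cer_reduction = relative_reduction(baseline[CER], best_wer[CER])

print(f"best WER at step {best_wer[STEP]}: {best_wer[WER]}")
print(f"relative WER reduction vs. step 0: {wer_reduction:.1f}%")
print(f"relative CER reduction vs. step 0: {cer_reduction:.1f}%")
```

The final checkpoint (step 3770) has the lowest WER, CER, and validation loss in the table. Both error rates roughly halve relative to the model before fine-tuning.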

Framework versions

PEFT 0.13.2
Transformers 4.46.0
PyTorch 2.4.0
Datasets 3.0.2
Tokenizers 0.20.1