---
library_name: transformers
language:
  - dv
license: apache-2.0
base_model: openai/whisper-small
tags:
  - generated_from_trainer
datasets:
  - mozilla-foundation/common_voice_13_0
metrics:
  - wer
model-index:
  - name: Whisper Small Dv - Raj Vardhan
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: Common Voice 13
          type: mozilla-foundation/common_voice_13_0
          config: dv
          split: test
          args: dv
        metrics:
          - name: Wer
            type: wer
            value: 11.039051361407658
---

# Whisper Small Dv - Raj Vardhan

This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) on the Common Voice 13 dataset. It achieves the following results on the evaluation set:

- Loss: 0.2938
- Wer Ortho: 57.8522
- Wer: 11.0391
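The checkpoint can be loaded through the 🤗 Transformers `pipeline` API. A minimal inference sketch — the repository id `rajvs20/whisper-small-dv` is inferred from this card's location and may differ:

```python
from transformers import pipeline

MODEL_ID = "rajvs20/whisper-small-dv"  # assumed repo id, inferred from this card


def transcribe(audio_path: str) -> str:
    """Transcribe a Dhivehi audio file with the fine-tuned checkpoint."""
    asr = pipeline(
        "automatic-speech-recognition",
        model=MODEL_ID,
        # Pin the decoder to Dhivehi transcription rather than auto-detection
        generate_kwargs={"language": "dv", "task": "transcribe"},
    )
    return asr(audio_path)["text"]
```

Usage: `print(transcribe("sample.wav"))` for any audio file readable by the pipeline.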

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: constant_with_warmup
- lr_scheduler_warmup_steps: 50
- training_steps: 4000
- mixed_precision_training: Native AMP
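As a rough sketch, these hyperparameters map onto `Seq2SeqTrainingArguments` roughly as below. The output directory and the evaluation cadence are assumptions (the results table logs every 500 steps), not values stated on this card:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-small-dv",   # assumed; not stated in this card
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    learning_rate=1e-5,
    lr_scheduler_type="constant_with_warmup",
    warmup_steps=50,
    max_steps=4000,
    seed=42,
    fp16=True,                         # "Native AMP" mixed precision
    eval_strategy="steps",             # assumed from the 500-step eval cadence above
    eval_steps=500,
    predict_with_generate=True,        # required to compute WER on generated text
)
```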

### Training results

| Training Loss | Epoch   | Step | Validation Loss | Wer Ortho | Wer     |
|:-------------:|:-------:|:----:|:---------------:|:---------:|:-------:|
| 0.0596        | 1.6287  | 500  | 0.1694          | 61.1045   | 11.9414 |
| 0.021         | 3.2573  | 1000 | 0.1984          | 57.7547   | 11.6598 |
| 0.0168        | 4.8860  | 1500 | 0.2246          | 58.1378   | 11.6546 |
| 0.0079        | 6.5147  | 2000 | 0.2520          | 57.3508   | 11.2216 |
| 0.0056        | 8.1433  | 2500 | 0.2669          | 57.9079   | 11.2042 |
| 0.0049        | 9.7720  | 3000 | 0.2751          | 57.1419   | 11.1573 |
| 0.0047        | 11.4007 | 3500 | 0.2803          | 56.9956   | 11.0269 |
| 0.0044        | 13.0293 | 4000 | 0.2938          | 57.8522   | 11.0391 |
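The table reports two error rates: "Wer Ortho" is computed on the raw orthographic text, while "Wer" is computed after text normalization (for Whisper fine-tunes this is typically the tokenizer's basic text normalizer, though the exact normalizer used here is an assumption). A self-contained sketch of word error rate, and of why normalization lowers it:

```python
import string


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words, as a percentage."""
    ref, hyp = reference.split(), hypothesis.split()
    # Standard Levenshtein dynamic program over words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution / match
    return 100.0 * d[len(ref)][len(hyp)] / len(ref)


def normalize(text: str) -> str:
    # Crude stand-in for Whisper's text normalizer: lowercase and strip punctuation
    return text.lower().translate(str.maketrans("", "", string.punctuation))


ref, hyp = "Hello, world.", "hello world"
print(wer(ref, hyp))                        # orthographic WER: 100.0
print(wer(normalize(ref), normalize(hyp)))  # normalized WER: 0.0
```

The same casing and punctuation mismatches that drive the ~58% orthographic WER above largely disappear after normalization, which is why the normalized WER is around 11%.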

### Framework versions

- Transformers 4.45.2
- Pytorch 2.5.1+cu118
- Datasets 3.1.0
- Tokenizers 0.20.1