Whisper_impediment

This model is a fine-tuned version of openai/whisper-base on the speech_impediment_audio dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3906
  • CER: 14.8734
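
For quick testing, the model can be loaded with the transformers `pipeline` API as sketched below. The repository id and the audio file name are illustrative assumptions; this card does not state them.

```python
# Minimal inference sketch. The repo id and audio path below are
# illustrative assumptions, not confirmed by this card.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="yoona-J/Whisper_impediment",  # assumed repository id
)

result = asr("sample.wav")  # hypothetical audio file; Whisper expects 16 kHz mono
print(result["text"])
```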

Model description

The checkpoint contains about 72.6M parameters stored as F32 safetensors; no further description has been provided.

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 2000
  • mixed_precision_training: Native AMP
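
As a rough illustration, these settings map onto transformers' `Seq2SeqTrainingArguments` as sketched below. The evaluation/save cadence of 200 steps is inferred from the results table, and `output_dir` is a placeholder; the card does not include the actual training script.

```python
# Hedged sketch of the listed hyperparameters expressed as
# Seq2SeqTrainingArguments. Values not listed in this card
# (output_dir, eval/save cadence) are assumptions.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper_impediment",  # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,                     # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=2000,
    fp16=True,                          # "Native AMP" mixed precision
    eval_strategy="steps",              # inferred: the table evaluates every 200 steps
    eval_steps=200,
    save_steps=200,                     # assumption
)
```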

Training results

| Training Loss | Epoch | Step | Validation Loss | CER     |
|:-------------:|:-----:|:----:|:---------------:|:-------:|
| 0.0426        | 20.0  | 200  | 0.2896          | 15.5063 |
| 0.0016        | 40.0  | 400  | 0.3225          | 14.5570 |
| 0.0006        | 60.0  | 600  | 0.3447          | 13.9241 |
| 0.0003        | 80.0  | 800  | 0.3588          | 14.5570 |
| 0.0002        | 100.0 | 1000 | 0.3686          | 14.5570 |
| 0.0002        | 120.0 | 1200 | 0.3765          | 14.8734 |
| 0.0002        | 140.0 | 1400 | 0.3827          | 14.8734 |
| 0.0001        | 160.0 | 1600 | 0.3869          | 14.8734 |
| 0.0001        | 180.0 | 1800 | 0.3896          | 14.8734 |
| 0.0001        | 200.0 | 2000 | 0.3906          | 14.8734 |
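
The CER values above appear to be on a percent (0-100) scale. A common way to compute this metric is with the `evaluate` library, as sketched below; whether this exact code was used here is an assumption.

```python
# Sketch of CER computation with the `evaluate` library (requires jiwer).
# The card does not state how CER was actually computed.
import evaluate

cer_metric = evaluate.load("cer")

predictions = ["hello world"]  # hypothetical model transcriptions
references = ["hello word"]    # hypothetical reference transcripts

cer = 100 * cer_metric.compute(predictions=predictions, references=references)
print(f"CER: {cer:.4f}")  # scaled to percent to match the table above
```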

Framework versions

  • Transformers 4.45.0.dev0
  • Pytorch 2.3.1+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1