
Whisper-VAD-squeezeformer

This model is a fine-tuned version of openai/whisper-small on the Voice_Data_Collection_second_edition dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3883
  • Cer: 22.8316 (character error rate, %)
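
For a quick transcription check, the checkpoint can in principle be loaded with the stock Transformers Whisper classes. The sketch below is illustrative only: the repository id `jun-han/Whisper-VAD-squeezeformer`, the silent placeholder audio, and the assumption that no custom modeling code is needed are not confirmed by this card (if the Squeezeformer variant requires author code, loading may need `trust_remote_code=True`).

```python
# Minimal inference sketch, assuming the checkpoint loads with the stock Whisper classes.
import numpy as np
import torch
from transformers import WhisperForConditionalGeneration, WhisperProcessor

model_id = "jun-han/Whisper-VAD-squeezeformer"  # assumed repository id
processor = WhisperProcessor.from_pretrained(model_id)
model = WhisperForConditionalGeneration.from_pretrained(model_id)

# Whisper expects 16 kHz mono audio; this placeholder is one second of silence.
audio = np.zeros(16000, dtype=np.float32)
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    predicted_ids = model.generate(inputs.input_features)

print(processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])
```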

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 20
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 2500
  • training_steps: 40000
  • mixed_precision_training: Native AMP
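
These settings map onto `Seq2SeqTrainingArguments` roughly as follows. This is a sketch, not the author's training script: `output_dir`, the evaluation cadence, and `predict_with_generate` are assumptions inferred from the 2500-step evaluation interval visible in the results table.

```python
# Sketch of the reported hyperparameters as Seq2SeqTrainingArguments
# (output_dir and the evaluation cadence are assumptions, not taken from the original run).
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-vad-squeezeformer",  # assumed
    learning_rate=1e-5,
    per_device_train_batch_size=20,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=2500,
    max_steps=40000,
    fp16=True,                    # "Native AMP" mixed-precision training
    eval_strategy="steps",        # the results table reports metrics every 2500 steps
    eval_steps=2500,
    predict_with_generate=True,   # CER is computed on generated transcripts
)
```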

Training results

| Training Loss | Epoch   | Step  | Validation Loss | Cer      |
|:-------------:|:-------:|:-----:|:---------------:|:--------:|
| 3.0316        | 0.7697  | 2500  | 2.9813          | 115.4486 |
| 1.6038        | 1.5394  | 5000  | 1.5812          | 80.6874  |
| 0.7245        | 2.3091  | 7500  | 0.7872          | 46.9425  |
| 0.4629        | 3.0788  | 10000 | 0.6003          | 36.1561  |
| 0.4269        | 3.8485  | 12500 | 0.5316          | 32.9094  |
| 0.3028        | 4.6182  | 15000 | 0.4871          | 29.6888  |
| 0.2258        | 5.3879  | 17500 | 0.4676          | 28.8440  |
| 0.1778        | 6.1576  | 20000 | 0.4583          | 28.2770  |
| 0.5123        | 6.9273  | 22500 | 0.4495          | 26.4774  |
| 0.3597        | 7.6970  | 25000 | 0.4196          | 25.0974  |
| 0.2481        | 8.4667  | 27500 | 0.4026          | 23.7473  |
| 0.1943        | 9.2365  | 30000 | 0.3942          | 23.6876  |
| 0.1547        | 10.0062 | 32500 | 0.3870          | 22.8782  |
| 0.1365        | 10.7759 | 35000 | 0.3849          | 22.8111  |
| 0.1263        | 11.5456 | 37500 | 0.3890          | 22.8204  |
| 0.0929        | 12.3153 | 40000 | 0.3883          | 22.8316  |
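
The Cer column is the character error rate expressed as a percentage. For reference, it can be computed with the `evaluate` library; the strings below are placeholders, not data from this run.

```python
# Sketch: computing CER as a percentage with the `evaluate` library (backed by jiwer).
import evaluate

cer_metric = evaluate.load("cer")

predictions = ["model transcript goes here"]      # placeholder hypotheses
references = ["reference transcript goes here"]   # placeholder gold transcripts

cer = 100 * cer_metric.compute(predictions=predictions, references=references)
print(f"CER: {cer:.4f}")
```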

Framework versions

  • Transformers 4.45.2
  • PyTorch 2.4.0
  • Datasets 3.0.1
  • Tokenizers 0.20.0
