
Whisper-Small-squeezeformer-architecture

This model is a fine-tuned version of openai/whisper-small on the Voice_Data_Collection_second_edition dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5061
  • CER: 28.1162
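The CER figure above is the character error rate in percent: the character-level edit distance between the model's transcription and the reference, divided by the reference length. A minimal sketch of the computation (the actual evaluation presumably uses a library implementation such as `jiwer` or `evaluate`, but the definition is the same):

```python
def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate in percent, on the same scale as the table (e.g. 28.1162)."""
    r, h = list(reference), list(hypothesis)
    # Standard dynamic-programming Levenshtein distance over characters.
    prev = list(range(len(h) + 1))
    for i, rc in enumerate(r, 1):
        curr = [i]
        for j, hc in enumerate(h, 1):
            cost = 0 if rc == hc else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return 100.0 * prev[-1] / len(r)

print(round(cer("hello world", "helo world"), 2))  # one deletion over 11 chars -> 9.09
```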

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 3750
  • training_steps: 120000
  • mixed_precision_training: Native AMP
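With the linear scheduler and the warmup/step counts above, the learning rate rises linearly from 0 to the 1e-05 peak over the first 3750 steps, then decays linearly back to 0 at step 120000. A sketch of that rule (a re-derivation of the behavior of `transformers`' linear schedule with warmup, not the library code itself):

```python
PEAK_LR = 1e-5
WARMUP_STEPS = 3750
TOTAL_STEPS = 120_000

def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup + linear decay."""
    if step < WARMUP_STEPS:
        # Linear warmup: 0 -> PEAK_LR over the first WARMUP_STEPS steps.
        return PEAK_LR * step / WARMUP_STEPS
    # Linear decay: PEAK_LR at the end of warmup -> 0 at TOTAL_STEPS.
    return PEAK_LR * max(0.0, (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS))

print(lr_at(0))        # 0.0
print(lr_at(3750))     # peak: 1e-05
print(lr_at(120_000))  # 0.0
```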

Training results

| Training Loss | Epoch | Step   | CER      | Validation Loss |
|---------------|-------|--------|----------|-----------------|
| 2.9532        | 1.0   | 3750   | 103.2182 | 2.9601          |
| 1.6561        | 2.0   | 7500   | 85.2058  | 1.6430          |
| 0.6747        | 3.0   | 11250  | 43.9073  | 0.7233          |
| 0.4521        | 4.0   | 15000  | 33.8818  | 0.5573          |
| 0.3412        | 5.0   | 18750  | 29.7393  | 0.4957          |
| 0.2109        | 6.0   | 22500  | 27.6988  | 0.4640          |
| 0.1365        | 7.0   | 26250  | 27.5348  | 0.4580          |
| 0.105         | 8.0   | 30000  | 27.2348  | 0.4571          |
| 0.4959        | 9.0   | 33750  | 24.2346  | 0.4091          |
| 0.344         | 10.0  | 37500  | 22.3133  | 0.3801          |
| 0.2431        | 11.0  | 41250  | 21.3667  | 0.3668          |
| 0.1569        | 12.0  | 45000  | 21.1207  | 0.3665          |
| 0.112         | 13.0  | 48750  | 21.1170  | 0.3702          |
| 0.0716        | 14.0  | 52500  | 21.1263  | 0.3761          |
| 0.052         | 15.0  | 56250  | 21.1822  | 0.3802          |
| 0.038         | 16.0  | 60000  | 21.0778  | 0.3833          |
| 0.2982        | 17.0  | 63750  | 24.5216  | 0.4189          |
| 0.1896        | 18.0  | 67500  | 24.6688  | 0.4229          |
| 0.155         | 19.0  | 71250  | 25.9154  | 0.4375          |
| 0.1105        | 20.0  | 75000  | 26.1372  | 0.4476          |
| 0.0727        | 21.0  | 78750  | 26.9087  | 0.4637          |
| 0.0511        | 22.0  | 82500  | 26.7894  | 0.4706          |
| 0.033         | 23.0  | 86250  | 27.2180  | 0.4808          |
| 0.0246        | 24.0  | 90000  | 27.2870  | 0.4840          |
| 0.2775        | 25.0  | 93750  | 26.0310  | 0.4465          |
| 0.1631        | 26.0  | 97500  | 26.6068  | 0.4500          |
| 0.1428        | 27.0  | 101250 | 26.9869  | 0.4609          |
| 0.0955        | 28.0  | 105000 | 27.1919  | 0.4799          |
| 0.0756        | 29.0  | 108750 | 27.6261  | 0.4870          |
| 0.0584        | 30.0  | 112500 | 27.9634  | 0.4959          |
| 0.0386        | 31.0  | 116250 | 28.1907  | 0.5041          |
| 0.0367        | 32.0  | 120000 | 28.1162  | 0.5061          |

Framework versions

  • Transformers 4.44.2
  • Pytorch 2.4.0
  • Datasets 2.21.0
  • Tokenizers 0.19.1
Model size: 323M parameters (F32, Safetensors)

Model tree for jun-han/Whisper-Small-architecture-change
