
Whisper-Small-squeezeformer-architecture

This model is a fine-tuned version of openai/whisper-small on the Voice_Data_Collection_second_edition dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5061
  • CER: 28.1162
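The CER figure above is the character error rate in percent: the character-level edit distance between the model's transcription and the reference, divided by the reference length. A minimal sketch of the computation (the actual evaluation presumably uses a library implementation such as `jiwer` or `evaluate`, but the definition is the same):

```python
def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate in percent, on the same scale as the table (e.g. 28.1162)."""
    r, h = list(reference), list(hypothesis)
    # Standard dynamic-programming Levenshtein distance over characters.
    prev = list(range(len(h) + 1))
    for i, rc in enumerate(r, 1):
        curr = [i]
        for j, hc in enumerate(h, 1):
            cost = 0 if rc == hc else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return 100.0 * prev[-1] / len(r)

print(round(cer("hello world", "helo world"), 2))  # one deletion over 11 chars -> 9.09
```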

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 3750
  • training_steps: 120000
  • mixed_precision_training: Native AMP
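With the linear scheduler and the warmup/step counts above, the learning rate rises linearly from 0 to the 1e-05 peak over the first 3750 steps, then decays linearly back to 0 at step 120000. A sketch of that rule (a re-derivation of the behavior of `transformers`' linear schedule with warmup, not the library code itself):

```python
PEAK_LR = 1e-5
WARMUP_STEPS = 3750
TOTAL_STEPS = 120_000

def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup + linear decay."""
    if step < WARMUP_STEPS:
        # Linear warmup: 0 -> PEAK_LR over the first WARMUP_STEPS steps.
        return PEAK_LR * step / WARMUP_STEPS
    # Linear decay: PEAK_LR at the end of warmup -> 0 at TOTAL_STEPS.
    return PEAK_LR * max(0.0, (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS))

print(lr_at(0))        # 0.0
print(lr_at(3750))     # peak: 1e-05
print(lr_at(120_000))  # 0.0
```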

Training results

| Training Loss | Epoch | Step   | CER      | Validation Loss |
|---------------|-------|--------|----------|-----------------|
| 2.9532        | 1.0   | 3750   | 103.2182 | 2.9601          |
| 1.6561        | 2.0   | 7500   | 85.2058  | 1.6430          |
| 0.6747        | 3.0   | 11250  | 43.9073  | 0.7233          |
| 0.4521        | 4.0   | 15000  | 33.8818  | 0.5573          |
| 0.3412        | 5.0   | 18750  | 29.7393  | 0.4957          |
| 0.2109        | 6.0   | 22500  | 27.6988  | 0.4640          |
| 0.1365        | 7.0   | 26250  | 27.5348  | 0.4580          |
| 0.105         | 8.0   | 30000  | 27.2348  | 0.4571          |
| 0.4959        | 9.0   | 33750  | 24.2346  | 0.4091          |
| 0.344         | 10.0  | 37500  | 22.3133  | 0.3801          |
| 0.2431        | 11.0  | 41250  | 21.3667  | 0.3668          |
| 0.1569        | 12.0  | 45000  | 21.1207  | 0.3665          |
| 0.112         | 13.0  | 48750  | 21.1170  | 0.3702          |
| 0.0716        | 14.0  | 52500  | 21.1263  | 0.3761          |
| 0.052         | 15.0  | 56250  | 21.1822  | 0.3802          |
| 0.038         | 16.0  | 60000  | 21.0778  | 0.3833          |
| 0.2982        | 17.0  | 63750  | 24.5216  | 0.4189          |
| 0.1896        | 18.0  | 67500  | 24.6688  | 0.4229          |
| 0.155         | 19.0  | 71250  | 25.9154  | 0.4375          |
| 0.1105        | 20.0  | 75000  | 26.1372  | 0.4476          |
| 0.0727        | 21.0  | 78750  | 26.9087  | 0.4637          |
| 0.0511        | 22.0  | 82500  | 26.7894  | 0.4706          |
| 0.033         | 23.0  | 86250  | 27.2180  | 0.4808          |
| 0.0246        | 24.0  | 90000  | 27.2870  | 0.4840          |
| 0.2775        | 25.0  | 93750  | 26.0310  | 0.4465          |
| 0.1631        | 26.0  | 97500  | 26.6068  | 0.4500          |
| 0.1428        | 27.0  | 101250 | 26.9869  | 0.4609          |
| 0.0955        | 28.0  | 105000 | 27.1919  | 0.4799          |
| 0.0756        | 29.0  | 108750 | 27.6261  | 0.4870          |
| 0.0584        | 30.0  | 112500 | 27.9634  | 0.4959          |
| 0.0386        | 31.0  | 116250 | 28.1907  | 0.5041          |
| 0.0367        | 32.0  | 120000 | 28.1162  | 0.5061          |

Framework versions

  • Transformers 4.44.2
  • Pytorch 2.4.0
  • Datasets 2.21.0
  • Tokenizers 0.19.1
Model size: 323M parameters (F32, Safetensors)

Model tree for jun-han/Whisper-Small-architecture-change
