libri-alpha-0.85-Temp-1-processor-change

This model is a distilled version of Wav2vec2, trained on 30% of the Librispeech-clean.100 dataset. It achieves the following results on the evaluation set:

Loss: 78.4467
Wer: 0.1153
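
Wer is reported as a fraction, so 0.1153 corresponds to a word error rate of 11.53%. The card does not say which tool computed it; the following is a minimal pure-Python sketch of the standard definition (word-level edit distance divided by the number of reference words). The function name word_error_rate is hypothetical and not part of the training code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


if __name__ == "__main__":
    # One dropped word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

For a whole evaluation set, WER is usually computed by summing the edit distances and the reference word counts over all utterances before dividing, rather than by averaging per-utterance values.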

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

The student model, which has 6 attention layers, was trained by knowledge distillation from the Wav2vec2-base-960h teacher model.
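
The card does not give the exact loss formula. As a rough illustration only, the sketch below shows one common formulation of a distillation objective: a temperature-softened KL divergence between teacher and student outputs, mixed with a hard-label loss. It assumes that alpha weights the distillation term and (1 - alpha) weights the hard loss, that the KL term is scaled by temperature squared, and that the hard loss (for a Wav2vec2 student this would typically be a CTC loss) is computed elsewhere and passed in as a number. All function names are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over one frame of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(student_logits, teacher_logits, hard_loss,
                      alpha=0.75, temperature=1.0):
    """Mix a soft (teacher-matching) loss with a hard-label loss.

    student_logits / teacher_logits: lists of frames, each a list of logits.
    hard_loss: precomputed hard-label loss (assumed; e.g. a CTC loss value).
    The alpha weighting and the temperature**2 scaling are assumptions.
    """
    soft = 0.0
    for s_frame, t_frame in zip(student_logits, teacher_logits):
        teacher_probs = softmax(t_frame, temperature)
        student_probs = softmax(s_frame, temperature)
        soft += kl_divergence(teacher_probs, student_probs)
    soft *= temperature ** 2
    return alpha * soft + (1 - alpha) * hard_loss


if __name__ == "__main__":
    student = [[0.2, 1.5, -0.3], [1.0, 0.1, 0.4]]
    teacher = [[0.1, 2.0, -0.5], [1.2, 0.0, 0.3]]
    print(distillation_loss(student, teacher, hard_loss=4.0))
```

With temperature 1, as listed in the hyperparameters, the softened distributions are the ordinary softmax outputs and the temperature-squared factor has no effect.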

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 32
eval_batch_size: 32
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 64
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
num_epochs: 30
mixed_precision_training: Native AMP
alpha: 0.75 (the repository name says 0.85; ignore it)
temperature: 1
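
To make the learning-rate settings above concrete, here is a minimal sketch of a linear schedule with warmup: the rate rises linearly from 0 to learning_rate over the warmup steps, then decays linearly to 0 at the end of training. The total of 4020 steps is an estimate, not a value stated in the card; it is inferred from the results table, where step 4000 falls at epoch 29.85 (about 134 steps per epoch over 30 epochs). The function name is hypothetical.

```python
def linear_schedule_lr(step, base_lr=2e-5, warmup_steps=500, total_steps=4020):
    """Learning rate at a given optimizer step for linear warmup and decay.

    total_steps=4020 is an assumed value estimated from the results table.
    """
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay: ramp linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 250, 500, 2000, 4000):
        print(step, linear_schedule_lr(step))
```

Each step here is one optimizer update, which with train_batch_size 32 and gradient_accumulation_steps 2 covers 64 examples, matching total_train_batch_size.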

Training results

Training Loss	Epoch	Step	Validation Loss	Wer
493.9213	0.75	100	145.7981	0.1515
410.8468	1.49	200	119.1579	0.1498
368.5187	2.24	300	109.7572	0.1505
329.7762	2.99	400	99.2350	0.1439
323.7352	3.73	500	92.1173	0.1356
305.1129	4.48	600	89.3685	0.1314
294.2529	5.22	700	88.3937	0.1287
284.5355	5.97	800	87.0589	0.1292
284.2181	6.72	900	86.4474	0.1298
273.915	7.46	1000	84.6149	0.1265
267.7668	8.21	1100	84.1840	0.1264
262.1592	8.96	1200	83.8678	0.1253
262.5562	9.7	1300	83.2756	0.1207
262.9982	10.45	1400	81.8095	0.1218
256.2891	11.19	1500	82.1241	0.1204
251.4134	11.94	1600	80.8432	0.1207
250.0854	12.69	1700	81.1467	0.1203
250.0077	13.43	1800	80.9370	0.1196
239.0915	14.18	1900	80.5060	0.1201
240.9192	14.93	2000	80.4557	0.1190
241.1668	15.67	2100	80.6453	0.1203
244.9744	16.42	2200	80.0101	0.1192
232.4748	17.16	2300	79.4798	0.1170
237.3503	17.91	2400	79.5743	0.1175
237.9698	18.66	2500	79.3368	0.1178
235.8808	19.4	2600	79.5519	0.1174
230.8314	20.15	2700	79.0367	0.1166
229.5856	20.9	2800	79.1809	0.1172
233.1034	21.64	2900	78.9896	0.1167
231.6986	22.39	3000	78.7184	0.1154
222.0106	23.13	3100	78.7308	0.1160
225.1484	23.88	3200	78.6649	0.1159
232.4254	24.63	3300	78.5096	0.1154
230.9492	25.37	3400	78.4873	0.1153
228.3062	26.12	3500	78.5155	0.1147
225.5572	26.87	3600	78.5693	0.1148
227.7358	27.61	3700	78.5487	0.1149
221.2486	28.36	3800	78.4307	0.1151
231.5915	29.1	3900	78.4270	0.1153
231.7214	29.85	4000	78.4467	0.1153

Framework versions

Transformers 4.25.1
Pytorch 1.12.1
Datasets 2.7.1
Tokenizers 0.11.0

Repository: rohitp1/libri-alpha-0.85-Temp-1-processor-change