metadata

language:
  - 'no'
license: apache-2.0
tags:
  - audio
  - asr
  - automatic-speech-recognition
  - hf-asr-leaderboard
model-index:
  - name: scream_sextusdecimus_virtual_tsfix_small
    results: []

scream_sextusdecimus_virtual_tsfix_small

This model is a fine-tuned version of openai/whisper-small, trained on the NbAiLab/ncc_speech dataset. At the final checkpoint (step 19999) it achieves the following results on the evaluation set:

step: 19999
eval_loss: 0.2913
train_loss: 0.6610
eval_wer: 8.7151
eval_cer: 3.8962
eval_exact_wer: 8.7151
eval_exact_cer: 3.8962
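
The card does not say how eval_wer and eval_cer were computed. The sketch below shows the conventional definition, assuming both are edit-distance error rates expressed in percent: the number of substitutions, insertions and deletions needed to turn the hypothesis into the reference, divided by the reference length. The function names and the example sentences are illustrative and are not taken from the training code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences of tokens."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # drop a reference token
                            curr[j - 1] + 1,      # extra hypothesis token
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def error_rate(reference, hypothesis, unit="word"):
    """Error rate in percent; unit is "word" (WER) or "char" (CER)."""
    if unit == "word":
        ref, hyp = reference.split(), hypothesis.split()
    else:
        ref, hyp = list(reference), list(hypothesis)
    if not ref:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


# One substituted word out of three
print(round(error_rate("det er fint", "det var fint"), 2))   # 33.33
# One missing character out of four
print(error_rate("fint", "fin", unit="char"))                # 25.0
# Four inserted words against a two-word reference
print(error_rate("ja takk", "ja ja ja ja takk takk"))        # 200.0
```

Because insertions count as errors, the rate is not capped at 100. Under this definition, a value such as the 196.6092 reported at step 0 in the training results is possible when the output is much longer than the reference.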

Model description

More information needed

Intended uses & limitations

More information needed
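
Until this section is filled in, the following is a minimal inference sketch, assuming the checkpoint is published on the Hugging Face Hub in a format the Transformers speech recognition pipeline can load. The repository id is not stated in this card, so model_id is left as a parameter. The helper names are hypothetical. The beam size of 5 mirrors num_beams in the hyperparameters, although the card does not say whether that value is also recommended for inference.

```python
def build_asr_kwargs(model_id, num_beams=5, chunk_length_s=30):
    """Collect pipeline arguments; model_id must be supplied by the caller."""
    return {
        "task": "automatic-speech-recognition",
        "model": model_id,
        # Whisper processes 30-second windows; chunking handles longer audio.
        "chunk_length_s": chunk_length_s,
        # Assumption: reuse the beam size listed under the hyperparameters.
        "generate_kwargs": {"num_beams": num_beams},
    }


def transcribe(audio_path, model_id):
    """Transcribe one audio file and return the recognised text."""
    # Imported lazily so the helper above can be used without transformers.
    from transformers import pipeline

    asr = pipeline(**build_asr_kwargs(model_id))
    return asr(audio_path)["text"]


# Example call (placeholder repository id and file name):
# text = transcribe("sample.wav", "<namespace>/scream_sextusdecimus_virtual_tsfix_small")
```

Passing a file path to the pipeline requires ffmpeg to be installed for audio decoding.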

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
lr_scheduler_type: linear
per_device_train_batch_size: 32
total_train_batch_size_per_node: 128
total_train_batch_size: 1024
total_optimization_steps: 20,000
starting_optimization_step: None
finishing_optimization_step: 20,000
num_train_dataset_workers: 32
num_hosts: 8
total_num_training_examples: 20,480,000
steps_per_epoch: 11,920
num_beams: 5
dropout: True
bpe_dropout_probability: 0.1
activation_dropout_probability: 0.1
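
The batch-size and step figures above are consistent with one another, which the short check below makes explicit. The derived quantities (devices per host, epochs covered) are computed from the listed values and are not stated in the card. The linear_lr helper is a sketch of a linear schedule that assumes decay from the peak rate to zero with no warmup; the card gives only the scheduler type, so the actual schedule may differ.

```python
# Values copied from the hyperparameter list
per_device_train_batch_size = 32
total_train_batch_size_per_node = 128
total_train_batch_size = 1024
total_optimization_steps = 20_000
num_hosts = 8
steps_per_epoch = 11_920
learning_rate = 5e-05

# Derived quantities (not stated in the card)
devices_per_host = total_train_batch_size_per_node // per_device_train_batch_size
global_batch = total_train_batch_size_per_node * num_hosts
examples_seen = total_train_batch_size * total_optimization_steps
examples_per_epoch = steps_per_epoch * total_train_batch_size
epochs_covered = total_optimization_steps / steps_per_epoch


def linear_lr(step, peak=learning_rate, total=total_optimization_steps):
    """Linear decay from peak to 0. ASSUMPTION: no warmup phase."""
    return peak * max(0.0, 1.0 - step / total)


print(devices_per_host)          # 4
print(global_batch)              # 1024
print(examples_seen)             # 20480000
print(examples_per_epoch)        # 12206080
print(round(epochs_covered, 2))  # 1.68
```

The run therefore covers roughly 1.68 passes over the training data, and examples_seen matches the listed total_num_training_examples.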

Training results

step	eval_loss	train_loss	eval_wer	eval_cer	eval_exact_wer	eval_exact_cer
0	1.2807	3.0725	196.6092	157.4275	196.6092	157.4275
1000	0.5902	1.0592	15.1695	4.8382	15.1695	4.8382
2000	0.4240	0.8640	11.3623	3.9308	11.3623	3.9308
3000	0.4213	0.7930	9.4587	3.3537	9.4587	3.3537
4000	0.4353	0.7986	9.3694	3.5263	9.3694	3.5263
5000	0.4697	0.7580	9.7858	4.1478	9.7858	4.1478
6000	0.4535	0.7003	10.0238	4.2119	10.0238	4.2119
7000	0.4608	0.7296	8.8638	3.4228	8.8638	3.4228
8000	0.3902	0.7053	8.9233	3.6003	8.9233	3.6003
9000	0.3575	0.7124	9.3992	3.9702	9.3992	3.9702
10000	0.3648	0.6858	8.8043	3.4326	8.8043	3.4326
11000	0.3033	0.6916	9.1315	3.7236	9.1315	3.7236
12000	0.3021	0.7028	8.9827	3.6052	8.9827	3.6052
13000	0.2959	0.6567	8.6556	3.4918	8.6556	3.4918
14000	0.3055	0.6828	8.9827	3.6496	8.9827	3.6496
15000	0.2930	0.6707	8.8043	3.7976	8.8043	3.7976
16000	0.2822	0.6523	8.5068	3.5806	8.5068	3.5806
17000	0.2809	0.6581	8.6853	3.7828	8.6853	3.7828
18000	0.2927	0.6455	9.1315	4.2513	9.1315	4.2513
19000	0.2922	0.6369	9.1017	4.1034	9.1017	4.1034
19999	0.2913	0.6610	8.7151	3.8962	8.7151	3.8962
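
The table can be scanned for the checkpoint that minimises each metric. The sketch below copies the step, eval_loss, eval_wer and eval_cer columns from the table above; the best_step helper is illustrative and not part of the training code.

```python
# (step, eval_loss, eval_wer, eval_cer), copied from the table above
ROWS = [
    (0, 1.2807, 196.6092, 157.4275), (1000, 0.5902, 15.1695, 4.8382),
    (2000, 0.4240, 11.3623, 3.9308), (3000, 0.4213, 9.4587, 3.3537),
    (4000, 0.4353, 9.3694, 3.5263), (5000, 0.4697, 9.7858, 4.1478),
    (6000, 0.4535, 10.0238, 4.2119), (7000, 0.4608, 8.8638, 3.4228),
    (8000, 0.3902, 8.9233, 3.6003), (9000, 0.3575, 9.3992, 3.9702),
    (10000, 0.3648, 8.8043, 3.4326), (11000, 0.3033, 9.1315, 3.7236),
    (12000, 0.3021, 8.9827, 3.6052), (13000, 0.2959, 8.6556, 3.4918),
    (14000, 0.3055, 8.9827, 3.6496), (15000, 0.2930, 8.8043, 3.7976),
    (16000, 0.2822, 8.5068, 3.5806), (17000, 0.2809, 8.6853, 3.7828),
    (18000, 0.2927, 9.1315, 4.2513), (19000, 0.2922, 9.1017, 4.1034),
    (19999, 0.2913, 8.7151, 3.8962),
]
COLUMNS = {"eval_loss": 1, "eval_wer": 2, "eval_cer": 3}


def best_step(metric):
    """Return (step, value) for the row with the lowest value of metric."""
    idx = COLUMNS[metric]
    row = min(ROWS, key=lambda r: r[idx])
    return row[0], row[idx]


for name in COLUMNS:
    print(name, best_step(name))
# eval_loss (17000, 0.2809)
# eval_wer (16000, 8.5068)
# eval_cer (3000, 3.3537)
```

The three metrics reach their minimum at different steps, and none of them is lowest at the final step 19999, so the preferred checkpoint depends on which metric matters for the application.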

Framework versions

Transformers 4.30.0.dev0
Datasets 2.12.1.dev0
Tokenizers 0.13.3