metadata

language:
  - 'no'
license: apache-2.0
tags:
  - audio
  - asr
  - automatic-speech-recognition
  - hf-asr-leaderboard
model-index:
  - name: scream_non_large_1e06_verbosity6
    results: []

scream_non_large_1e06_verbosity6

This model is a fine-tuned version of NbAiLab/scream_non_large_1e06_beams5_constantlr_long on the NbAiLab/NCC_speech_all_v5 dataset.

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-06
lr_scheduler_type: linear
per_device_train_batch_size: 5
total_train_batch_size_per_node: 20
total_train_batch_size: 160
total_optimization_steps: 25,000
starting_optimization_step: None
finishing_optimization_step: 25,000
num_train_dataset_workers: 32
num_hosts: 8
total_num_training_examples: 4,000,000
steps_per_epoch: 3948
num_beams: 5
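
The batch-size and step figures above are mutually consistent, and a few lines of arithmetic can confirm this. The sketch below is illustrative only. It assumes that the per-node batch is the per-device batch multiplied by the number of devices per node (the device count is inferred here, not stated in the card) and that each optimization step consumes exactly one global batch.

```python
# Values copied from the hyperparameter list above.
per_device_train_batch_size = 5
total_train_batch_size_per_node = 20
total_train_batch_size = 160
num_hosts = 8
total_optimization_steps = 25_000
steps_per_epoch = 3948

# Inferred, not stated in the card: number of devices on each node.
devices_per_node = total_train_batch_size_per_node // per_device_train_batch_size

# Global batch = per-node batch x number of hosts.
global_batch = total_train_batch_size_per_node * num_hosts

# Examples consumed over the full run, assuming one global batch per step.
examples_seen = total_optimization_steps * global_batch

# Approximate examples per epoch and number of passes over the data.
examples_per_epoch = steps_per_epoch * global_batch
epochs = total_optimization_steps / steps_per_epoch

print(devices_per_node, global_batch, examples_seen)
print(examples_per_epoch, round(epochs, 2))
```

Under these assumptions the run consumes 4,000,000 examples, matching total_num_training_examples, and corresponds to roughly 6.33 passes over about 631,680 examples per epoch.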

Training results

step	eval_loss	train_loss	eval_wer	eval_cer
0	1.1990	0.1742	10.4750	5.6387
2500	1.2217	0.1679	10.5359	5.7294
5000	1.2039	0.1717	10.5968	5.7949
7500	1.2121	0.1653	10.6577	5.8050
10000	1.2018	0.1494	10.6577	5.8453
12500	1.2066	0.1707	10.3228	5.7496
15000	1.2399	0.1671	10.8100	5.8201
17500	1.2301	0.1737	10.9013	5.9209
20000	1.2481	0.1520	11.1145	6.0015
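
To compare checkpoints, the table above can be parsed and the rows with the lowest error rates picked out. This is a minimal sketch: the rows are copied from the table, and the variable names (`RESULTS`, `rows`, `best_wer`, `best_cer`) are hypothetical helpers introduced for illustration, not part of any training script.

```python
# Rows copied from the results table above (whitespace-separated).
RESULTS = """
step eval_loss train_loss eval_wer eval_cer
0 1.1990 0.1742 10.4750 5.6387
2500 1.2217 0.1679 10.5359 5.7294
5000 1.2039 0.1717 10.5968 5.7949
7500 1.2121 0.1653 10.6577 5.8050
10000 1.2018 0.1494 10.6577 5.8453
12500 1.2066 0.1707 10.3228 5.7496
15000 1.2399 0.1671 10.8100 5.8201
17500 1.2301 0.1737 10.9013 5.9209
20000 1.2481 0.1520 11.1145 6.0015
"""

lines = RESULTS.strip().splitlines()
header = lines[0].split()

# One dict per evaluation step, with every column parsed as a float.
rows = [dict(zip(header, map(float, line.split()))) for line in lines[1:]]

# Checkpoints with the lowest word and character error rates.
best_wer = min(rows, key=lambda r: r["eval_wer"])
best_cer = min(rows, key=lambda r: r["eval_cer"])

# Change in WER between the first and last reported evaluation.
wer_change = rows[-1]["eval_wer"] - rows[0]["eval_wer"]

print(int(best_wer["step"]), best_wer["eval_wer"])
print(int(best_cer["step"]), best_cer["eval_cer"])
print(round(wer_change, 4))
```

On the reported numbers, the lowest eval_wer (10.3228) occurs at step 12500 and the lowest eval_cer (5.6387) at step 0. The eval_wer at step 20000 is 0.6395 higher than at step 0. This is a descriptive reading of the table only and makes no claim about statistical significance.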

Framework versions

Transformers 4.28.1
Datasets 2.11.0
Tokenizers 0.13.3