NbAiLab/salmon-whisper-large-smj-lr5e-5-s30k

Automatic Speech Recognition

hf-asr-leaderboard


salmon-whisper-large-smj-lr5e-5

This model is a fine-tuned version of openai/whisper-large-v2 on the NbAiLab/salmon-asr-smj dataset.
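
Since the card gives no usage instructions, the following is a minimal sketch of how a Whisper fine-tune like this one is typically loaded through the Transformers `pipeline` API. The repository ID is taken from this page; the helper names `build_transcriber` and `transcribe`, the 30-second chunking, and the default CPU device are assumptions for illustration, not settings documented by the authors.

```python
MODEL_ID = "NbAiLab/salmon-whisper-large-smj-lr5e-5-s30k"


def build_transcriber(model_id: str = MODEL_ID, device: int = -1):
    """Create an ASR pipeline for the checkpoint (device=-1 means CPU)."""
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import pipeline

    return pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,  # assumption: chunk long audio into 30 s windows
        device=device,
    )


def transcribe(audio_path: str, transcriber=None) -> str:
    """Return the transcript for one audio file."""
    if transcriber is None:
        transcriber = build_transcriber()
    # The ASR pipeline returns a dict with the transcript under "text".
    return transcriber(audio_path)["text"]
```

Calling `transcribe("sample.wav")` (a hypothetical file name) downloads the weights on first use, which is several gigabytes for a 1.54B-parameter F32 checkpoint.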

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
lr_scheduler_type: linear
per_device_train_batch_size: 6
total_train_batch_size_per_node: 48
total_train_batch_size: 48
total_optimization_steps: 100,000
starting_optimization_step: None
finishing_optimization_step: 100,000
num_train_dataset_workers: 32
num_hosts: 1
total_num_training_examples: 4,800,000
steps_per_epoch: 385
num_beams: None
weight_decay: 0.01
adam_beta1: 0.9
adam_beta2: 0.98
adam_epsilon: 1e-06
dropout: True
bpe_dropout_probability: 0.2
activation_dropout_probability: 0.1
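
The values above can be cross-checked with a little arithmetic. The sketch below uses only numbers from the list; the device count is inferred under the assumption that no gradient accumulation was used, and the `linear_lr` helper assumes a plain linear decay to zero with no warmup, since the card states neither.

```python
# Values copied from the hyperparameter list.
per_device_train_batch_size = 6
total_train_batch_size = 48
total_optimization_steps = 100_000
total_num_training_examples = 4_800_000
steps_per_epoch = 385
learning_rate = 5e-5

# Assumption: no gradient accumulation, so 48 / 6 gives the device count.
implied_devices = total_train_batch_size // per_device_train_batch_size

# Steps times batch size reproduces the reported example count.
examples_seen = total_optimization_steps * total_train_batch_size
assert examples_seen == total_num_training_examples

# Approximate dataset size and the number of epochs covered by 30,000 steps.
examples_per_epoch = steps_per_epoch * total_train_batch_size
epochs_at_30k = 30_000 / steps_per_epoch


def linear_lr(step, peak=learning_rate, total_steps=total_optimization_steps,
              warmup_steps=0):
    """Linear schedule; warmup_steps=0 is an assumption, not from the card."""
    if warmup_steps and step < warmup_steps:
        return peak * step / warmup_steps
    remaining = max(0, total_steps - step)
    return peak * remaining / max(1, total_steps - warmup_steps)


print(implied_devices, examples_per_epoch, round(epochs_at_30k, 1))
print(linear_lr(30_000))
```

Under these assumptions, the schedule is configured for 100,000 steps, so a checkpoint taken at step 30,000 would still be at roughly 70% of the peak learning rate.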

Training results

step	validation_loss	train_loss	validation_wer	validation_cer	validation_exact_wer	validation_exact_cer
0	4.2254	4.6413	112.7660	59.8700	108.1117	62.0594
10000	0.8720	0.3747	18.2181	5.2803	21.4096	5.6762
20000	1.1365	0.2741	15.2926	4.6304	18.0851	5.0588
30000	1.2561	0.2111	14.6277	4.0617	17.9521	4.5011
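
For readers unfamiliar with the metrics in the table, here is a minimal sketch of how word error rate (WER) and character error rate (CER) are conventionally computed, as edit distance divided by reference length and expressed in percent. It is not the evaluation code used for this model, and the card does not define how the "exact" variants differ. The function names are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitute/insert/delete)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent, relative to the reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate in percent, relative to the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


print(wer("a b c", "a x c"))            # one substitution out of three words
print(wer("hello", "hello there you"))  # two insertions against one word
```

Because insertions count as errors while the denominator is the reference length, the rate can exceed 100%, which is how a value such as the step-0 WER of 112.7660 can arise.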

Framework versions

Transformers 4.35.0
Datasets 2.14.6
Tokenizers 0.14.1


Safetensors

Model size: 1.54B params
Tensor type: F32


