README.md · chmanoj/xls-r-300m-te at a57098ad5532105112ac239dcfc90bbe61e5f59c

metadata

license: apache-2.0
tags:
  - automatic-speech-recognition
  - openslr_SLR66
  - generated_from_trainer
model-index:
  - name: ''
    results: []

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the OPENSLR_SLR66 - NA dataset. It achieves the following results on the evaluation set:

Loss: 0.2719
Wer: 0.3419

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 7.5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
gradient_accumulation_steps: 8
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 2000
num_epochs: 100.0
mixed_precision_training: Native AMP

Training results

Training Loss	Epoch	Step	Validation Loss	Wer
3.0304	4.81	500	1.5676	1.0554
1.5263	9.61	1000	0.4693	0.8023
1.5299	14.42	1500	0.4368	0.7311
1.5063	19.23	2000	0.4360	0.7302
1.455	24.04	2500	0.4213	0.6692
1.4755	28.84	3000	0.4329	0.5943
1.352	33.65	3500	0.4074	0.5765
1.3122	38.46	4000	0.3866	0.5630
1.2799	43.27	4500	0.3860	0.5480
1.212	48.08	5000	0.3590	0.5317
1.1645	52.88	5500	0.3283	0.4757
1.0854	57.69	6000	0.3162	0.4687
1.0292	62.5	6500	0.3126	0.4416
0.9607	67.31	7000	0.2990	0.4066
0.9156	72.12	7500	0.2870	0.4009
0.8329	76.92	8000	0.2791	0.3909
0.7979	81.73	8500	0.2770	0.3670
0.7144	86.54	9000	0.2841	0.3661
0.6997	91.35	9500	0.2721	0.3485
0.6568	96.15	10000	0.2681	0.3437

Framework versions

Transformers 4.16.0.dev0
Pytorch 1.10.0+cu113
Datasets 1.18.1.dev0
Tokenizers 0.10.3