metadata

license: apache-2.0
base_model: facebook/wav2vec2-large-xlsr-53
tags:
  - generated_from_trainer
datasets:
  - common_voice_17_0
metrics:
  - wer
model-index:
  - name: xls-r-300m-hbs-phoneme-unfrozen-batch16
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: common_voice_17_0
          type: common_voice_17_0
          config: hsb
          split: test
          args: hsb
        metrics:
          - name: Wer
            type: wer
            value: 0.5337394564198688

xls-r-300m-hbs-phoneme-unfrozen-batch16

This model is a fine-tuned version of facebook/wav2vec2-large-xlsr-53 on the hsb (Upper Sorbian) configuration of the common_voice_17_0 dataset. After the final training step (3100), it achieves the following results on the evaluation set:

Loss: 0.9205
Wer: 0.5337
Cer: 0.1244
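
For readers unfamiliar with the metrics, here is a minimal sketch of how WER and CER are conventionally defined: the Levenshtein (edit) distance between reference and prediction, counted over words or characters, divided by the reference length. This is an illustration only. The helper names are hypothetical, the sketch counts spaces as characters, and the scores on this card were presumably produced by a metrics library rather than by this code. Corpus-level scores are also usually computed from total errors over total reference length, not by averaging per-utterance values.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of tokens)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference item
                curr[j - 1] + 1,     # insert a hypothesis item
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate for one utterance; assumes a non-empty reference."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate for one utterance; spaces count as characters here."""
    return edit_distance(reference, hypothesis) / len(reference)


# One substitution plus one deletion against a four-word reference.
print(wer("a b c d", "a x c"))  # 0.5
```

Because both rates divide by the reference length only, a prediction with many insertions can score above 1.0.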

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 16
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
num_epochs: 100
mixed_precision_training: Native AMP
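
Two of these values interact and can be sketched in a few lines. The total train batch size is the per-device batch size multiplied by the gradient accumulation steps (and by the device count, assumed here to be 1, which is consistent with 16 × 2 = 32). The linear scheduler ramps the learning rate from 0 to its peak over the warmup steps, then decays it linearly to 0 at the last step. The sketch below assumes 3100 total optimizer steps, taken from the final row of the training results table. The function names are hypothetical and this is not the trainer's own code.

```python
LEARNING_RATE = 3e-4
WARMUP_STEPS = 500
TOTAL_STEPS = 3100      # assumption: final step reported in the training results
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 2
NUM_DEVICES = 1         # assumption: single-device training


def effective_batch_size(per_device, accum_steps, devices=1):
    """Number of samples contributing to each optimizer step."""
    return per_device * accum_steps * devices


def linear_lr(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup from 0 to base_lr, then linear decay to 0 at `total`."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES))  # 32
for step in (0, 250, 500, 1800, 3100):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at step 500, so the first few evaluation points in the training results are logged while the rate is still warming up.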

Training results

Training Loss	Epoch	Step	Validation Loss	Wer	Cer
4.0877	3.2258	100	3.7799	1.0	1.0
3.2643	6.4516	200	3.2338	1.0	1.0
3.2182	9.6774	300	3.1963	1.0	1.0
0.8009	12.9032	400	0.9289	0.8240	0.2193
0.2664	16.1290	500	0.8523	0.7381	0.1855
0.1359	19.3548	600	0.8465	0.6757	0.1676
0.1022	22.5806	700	0.8537	0.6603	0.1656
0.0641	25.8065	800	0.8821	0.6664	0.1620
0.0565	29.0323	900	0.9185	0.6610	0.1608
0.068	32.2581	1000	0.8839	0.6286	0.1513
0.0556	35.4839	1100	0.8898	0.6125	0.1479
0.0457	38.7097	1200	0.8840	0.6204	0.1448
0.0439	41.9355	1300	0.9207	0.6249	0.1490
0.0296	45.1613	1400	0.9572	0.6246	0.1510
0.0461	48.3871	1500	0.8875	0.5918	0.1395
0.0419	51.6129	1600	0.8967	0.5846	0.1384
0.0333	54.8387	1700	0.9827	0.5951	0.1420
0.0318	58.0645	1800	0.9055	0.5733	0.1364
0.0238	61.2903	1900	0.9497	0.5696	0.1363
0.0257	64.5161	2000	0.9268	0.5590	0.1330
0.0266	67.7419	2100	0.9374	0.5703	0.1351
0.0292	70.9677	2200	0.9304	0.5754	0.1352
0.0288	74.1935	2300	0.9419	0.5649	0.1334
0.0125	77.4194	2400	0.9625	0.5581	0.1335
0.0241	80.6452	2500	0.9449	0.5569	0.1313
0.0217	83.8710	2600	0.9315	0.5504	0.1292
0.0136	87.0968	2700	0.9079	0.5373	0.1257
0.0203	90.3226	2800	0.8935	0.5373	0.1241
0.0166	93.5484	2900	0.9169	0.5354	0.1239
0.0114	96.7742	3000	0.9245	0.5323	0.1240
0.011	100.0	3100	0.9205	0.5337	0.1244
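
The final checkpoint is not the best row on every metric. The sketch below parses a handful of rows copied from the table above and picks the row that minimizes a chosen column. The `best_row` helper is hypothetical, and only four rows are included for brevity; paste in the full table to check every step.

```python
import csv
import io

# Header plus four rows copied from the training results table (tab-separated).
TABLE = (
    "Training Loss\tEpoch\tStep\tValidation Loss\tWer\tCer\n"
    "0.1359\t19.3548\t600\t0.8465\t0.6757\t0.1676\n"
    "0.0166\t93.5484\t2900\t0.9169\t0.5354\t0.1239\n"
    "0.0114\t96.7742\t3000\t0.9245\t0.5323\t0.1240\n"
    "0.011\t100.0\t3100\t0.9205\t0.5337\t0.1244\n"
)


def load_rows(text):
    """Parse the tab-separated table into a list of dicts of floats."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [{key: float(value) for key, value in row.items()} for row in reader]


def best_row(rows, metric):
    """Return the row with the lowest value for `metric` (lower is better)."""
    return min(rows, key=lambda row: row[metric])


rows = load_rows(TABLE)
for metric in ("Validation Loss", "Wer", "Cer"):
    row = best_row(rows, metric)
    print(f"{metric}: {row[metric]} at step {int(row['Step'])}")
```

Among these rows, validation loss is lowest at step 600, WER at step 3000, and CER at step 2900, so which checkpoint counts as best depends on the metric used for selection.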

Framework versions

Transformers 4.42.0.dev0
PyTorch 2.3.1+cu121
Datasets 2.19.2
Tokenizers 0.19.1