metadata

license: apache-2.0
base_model: facebook/wav2vec2-xls-r-300m
tags:
  - generated_from_trainer
datasets:
  - common_voice_17_0
metrics:
  - wer
model-index:
  - name: xls-r-300-cv17-bulgarian
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: common_voice_17_0
          type: common_voice_17_0
          config: bg
          split: validation
          args: bg
        metrics:
          - name: Wer
            type: wer
            value: 0.2967878948765596

xls-r-300-cv17-bulgarian

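This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the Bulgarian (bg) configuration of the common_voice_17_0 dataset. On the evaluation set (the validation split), it achieves the following results, which match the final logged evaluation at step 4500: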

Loss: 0.4329
Wer: 0.2968
Cer: 0.0726
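For readers unfamiliar with the two metrics, the following is a minimal sketch of how WER (word error rate) and CER (character error rate) are commonly defined: the edit distance between reference and hypothesis, divided by the reference length, counted over words or characters respectively. The function names and the Bulgarian example sentence are illustrative assumptions, not taken from the training script, and the evaluation code used for this model may normalize text or aggregate across utterances differently.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (works on lists or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate for one utterance (the reference must not be empty)."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate for one utterance (spaces count as characters)."""
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: one wrong character in the last word.
reference = "добро утро на всички"
hypothesis = "добро утро на всичко"
print(wer(reference, hypothesis))  # 1 wrong word out of 4
print(cer(reference, hypothesis))  # 1 wrong character out of 20
```

A single wrong character makes the whole word count as an error, which is why the WER reported above (0.2968) is several times larger than the CER (0.0726).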

Model description

A facebook/wav2vec2-xls-r-300m checkpoint fine-tuned for Bulgarian automatic speech recognition. More information needed.

Intended uses & limitations

More information needed

Training and evaluation data

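The model was evaluated on the validation split of the bg (Bulgarian) configuration of common_voice_17_0. More information needed on the training split and preprocessing.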

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 16
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
num_epochs: 30
mixed_precision_training: Native AMP
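As a rough illustration of how these values fit together, the sketch below recomputes the total batch size and the learning rate produced by a linear schedule with warmup. Two values are assumptions rather than documented settings: a single training device (consistent with 16 × 2 = 32), and 152 optimizer steps per epoch, inferred from the training log, where step 3800 falls at epoch 25.0. The helper names are hypothetical.

```python
LEARNING_RATE = 3e-4
WARMUP_STEPS = 500
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 2
NUM_EPOCHS = 30

NUM_DEVICES = 1        # assumption: not stated in the card
STEPS_PER_EPOCH = 152  # assumption: inferred from the log (3800 steps / 25.0 epochs)
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def effective_batch_size() -> int:
    """Samples consumed per optimizer step."""
    return TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step.

    Ramps linearly from 0 to LEARNING_RATE over WARMUP_STEPS,
    then decays linearly to 0 at TOTAL_STEPS.
    """
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


print(effective_batch_size())  # 32, matching total_train_batch_size
for step in (0, 250, 500, 2530, TOTAL_STEPS):
    print(step, linear_lr(step))
```

Under these assumptions the peak learning rate is reached at step 500, a little over three epochs in, which coincides with the point in the training log where WER first drops substantially below 1.0.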

Training results

Training Loss	Epoch	Step	Validation Loss	Wer	Cer
4.0388	0.6579	100	4.1422	1.0	1.0
3.047	1.3158	200	3.0730	1.0	1.0
2.7349	1.9737	300	2.7601	0.9939	0.9946
0.6047	2.6316	400	0.6984	0.7954	0.1942
0.3868	3.2895	500	0.5550	0.5994	0.1519
0.3423	3.9474	600	0.4548	0.4804	0.1195
0.1942	4.6053	700	0.3973	0.4277	0.1034
0.1754	5.2632	800	0.4166	0.4391	0.1055
0.1734	5.9211	900	0.4146	0.4195	0.1018
0.1089	6.5789	1000	0.3859	0.3867	0.0937
0.233	7.2368	1100	0.4183	0.4054	0.1005
0.1519	7.8947	1200	0.4459	0.4151	0.1030
0.1176	8.5526	1300	0.4026	0.3845	0.0937
0.0997	9.2105	1400	0.3849	0.3590	0.0869
0.1266	9.8684	1500	0.4281	0.3781	0.0947
0.0945	10.5263	1600	0.4471	0.3983	0.0979
0.0575	11.1842	1700	0.4290	0.3660	0.0897
0.0854	11.8421	1800	0.4258	0.3749	0.0938
0.0558	12.5	1900	0.4242	0.3644	0.0907
0.0774	13.1579	2000	0.4339	0.3616	0.0888
0.0397	13.8158	2100	0.4155	0.3581	0.0882
0.0603	14.4737	2200	0.4681	0.3737	0.0943
0.0723	15.1316	2300	0.4446	0.3560	0.0875
0.0746	15.7895	2400	0.4430	0.3573	0.0889
0.0727	16.4474	2500	0.4549	0.3470	0.0870
0.0458	17.1053	2600	0.4581	0.3520	0.0873
0.0694	17.7632	2700	0.4414	0.3575	0.0896
0.0462	18.4211	2800	0.4235	0.3261	0.0802
0.0539	19.0789	2900	0.4496	0.3329	0.0810
0.0368	19.7368	3000	0.4043	0.3406	0.0846
0.0347	20.3947	3100	0.4367	0.3225	0.0789
0.019	21.0526	3200	0.4487	0.3272	0.0801
0.0361	21.7105	3300	0.4272	0.3241	0.0785
0.0475	22.3684	3400	0.4324	0.3191	0.0781
0.0341	23.0263	3500	0.4564	0.3398	0.0847
0.0454	23.6842	3600	0.4415	0.3188	0.0789
0.0346	24.3421	3700	0.4187	0.3072	0.0751
0.1315	25.0	3800	0.4480	0.3124	0.0765
0.0663	25.6579	3900	0.4488	0.3151	0.0779
0.0225	26.3158	4000	0.4372	0.3006	0.0739
0.0382	26.9737	4100	0.4164	0.2987	0.0730
0.0194	27.6316	4200	0.4190	0.2942	0.0718
0.0101	28.2895	4300	0.4328	0.2960	0.0726
0.0224	28.9474	4400	0.4302	0.2944	0.0720
0.0174	29.6053	4500	0.4329	0.2968	0.0726
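The log can also be used to compare checkpoints. The sketch below copies a handful of rows from the table above (not the full log) and picks the best one by validation loss and by WER. The tuple layout and variable names are illustrative assumptions; the card does not state which checkpoint the published weights correspond to, only that the headline metrics match the step 4500 row.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    val_loss: float
    wer: float
    cer: float


# Subset of rows copied from the training results table.
rows = [
    EvalRow(1000, 0.3859, 0.3867, 0.0937),
    EvalRow(1400, 0.3849, 0.3590, 0.0869),
    EvalRow(4100, 0.4164, 0.2987, 0.0730),
    EvalRow(4200, 0.4190, 0.2942, 0.0718),
    EvalRow(4300, 0.4328, 0.2960, 0.0726),
    EvalRow(4400, 0.4302, 0.2944, 0.0720),
    EvalRow(4500, 0.4329, 0.2968, 0.0726),
]

best_by_loss = min(rows, key=lambda r: r.val_loss)
best_by_wer = min(rows, key=lambda r: r.wer)

print("lowest validation loss:", best_by_loss)
print("lowest WER:", best_by_wer)
```

In this log the two criteria disagree: validation loss is lowest at step 1400, while WER and CER are lowest at step 4200 (0.2942 and 0.0718), slightly better than the final step 4500 figures. Validation loss stays roughly flat after the early epochs while the error rates keep improving, so WER is the more informative selection criterion here.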

Framework versions

Transformers 4.42.0.dev0
PyTorch 2.3.1+cu121
Datasets 2.19.2
Tokenizers 0.19.1