metadata

license: apache-2.0
base_model: facebook/wav2vec2-xls-r-300m
tags:
  - generated_from_trainer
datasets:
  - common_voice_17_0
metrics:
  - wer
model-index:
  - name: xls-r-300-cv17-bulgarian-adap-ru
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: common_voice_17_0
          type: common_voice_17_0
          config: bg
          split: validation
          args: bg
        metrics:
          - name: Wer
            type: wer
            value: 0.3023246994576965

xls-r-300-cv17-bulgarian-adap-ru

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the Bulgarian (bg) configuration of the common_voice_17_0 dataset. It achieves the following results on the evaluation set (the validation split):

Loss: 0.3977
Wer: 0.3023
Cer: 0.0722
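
For readers unfamiliar with the two error rates above, here is a minimal sketch of how WER and CER are conventionally defined: the edit distance between reference and hypothesis, divided by the reference length, at word and character level respectively. It is an illustration only and not the evaluation code used for this model. The function names and example sentences are made up for the sketch, and details such as text normalization or whether spaces count as characters are assumptions that may differ from the actual metric implementation.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings of characters)."""
    prev = list(range(len(hyp) + 1))  # distance from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def wer(references, hypotheses):
    """Word error rate: word-level edits divided by the number of reference words."""
    errors = sum(edit_distance(r.split(), h.split()) for r, h in zip(references, hypotheses))
    total = sum(len(r.split()) for r in references)
    return errors / total


def cer(references, hypotheses):
    """Character error rate; this sketch counts spaces as characters (an assumption)."""
    errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total = sum(len(r) for r in references)
    return errors / total


# Hypothetical transcripts, not taken from the dataset.
refs = ["добър ден", "как си днес"]
hyps = ["добър дън", "как си"]
example_wer = wer(refs, hyps)  # 2 word errors over 5 reference words
example_cer = cer(refs, hyps)  # 6 character errors over 20 reference characters
```

Read this way, the reported Wer of 0.3023 corresponds to roughly 30 word-level errors per 100 reference words, and the Cer of 0.0722 to roughly 7 character-level errors per 100 reference characters.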

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 16
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
num_epochs: 30
mixed_precision_training: Native AMP
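
The hyperparameters above imply an effective batch size and a learning-rate curve that can be reproduced with simple arithmetic. The sketch below assumes a single training device (consistent with 16 × 2 = 32) and 152 optimizer steps per epoch, a figure inferred from the results table (step 100 falls at epoch 0.6579) rather than stated in the card. The function name `linear_warmup_lr` is made up for the sketch, and the formula follows the usual shape of a linear schedule with warmup. It is not the training script itself.

```python
BASE_LR = 3e-4            # learning_rate
WARMUP_STEPS = 500        # lr_scheduler_warmup_steps
NUM_EPOCHS = 30           # num_epochs
STEPS_PER_EPOCH = 152     # assumption: inferred from the results table, not stated in the card
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS

TRAIN_BATCH_SIZE = 16     # train_batch_size (per device)
GRAD_ACCUM_STEPS = 2      # gradient_accumulation_steps
NUM_DEVICES = 1           # assumption: a single device, so that 16 * 2 * 1 = 32

effective_batch = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES


def linear_warmup_lr(step):
    """Learning rate at a given optimizer step: linear ramp-up, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


# Print the learning rate at a few points along the run.
for step in (0, 250, 500, 2530, TOTAL_STEPS):
    print(f"step {step:>4}: lr = {linear_warmup_lr(step):.6f}")
```

Under these assumptions the learning rate peaks at 0.0003 at step 500, a little over three epochs in, and decays linearly for the rest of the run.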

Training results

Training Loss	Epoch	Step	Validation Loss	Wer	Cer
3.1617	0.6579	100	3.1554	1.0	1.0
1.0032	1.3158	200	1.0726	0.8684	0.2419
0.5552	1.9737	300	0.4924	0.5297	0.1303
0.2763	2.6316	400	0.3795	0.4442	0.1043
0.2273	3.2895	500	0.3769	0.4222	0.1014
0.3216	3.9474	600	0.3611	0.3993	0.0971
0.1553	4.6053	700	0.3566	0.3927	0.0936
0.1414	5.2632	800	0.3676	0.3869	0.0923
0.1774	5.9211	900	0.3680	0.3758	0.0901
0.1256	6.5789	1000	0.3637	0.3775	0.0916
0.2416	7.2368	1100	0.3893	0.3963	0.0951
0.1213	7.8947	1200	0.3677	0.3596	0.0864
0.0911	8.5526	1300	0.3850	0.3739	0.0891
0.0859	9.2105	1400	0.3962	0.3658	0.0883
0.0998	9.8684	1500	0.3608	0.3530	0.0846
0.108	10.5263	1600	0.3932	0.3908	0.0920
0.0824	11.1842	1700	0.4147	0.3591	0.0870
0.0888	11.8421	1800	0.4040	0.3660	0.0878
0.0609	12.5	1900	0.4097	0.3542	0.0857
0.0692	13.1579	2000	0.4127	0.3639	0.0874
0.0513	13.8158	2100	0.4118	0.3560	0.0870
0.0752	14.4737	2200	0.4044	0.3591	0.0888
0.0833	15.1316	2300	0.3956	0.3374	0.0812
0.0826	15.7895	2400	0.3953	0.3356	0.0811
0.0934	16.4474	2500	0.4053	0.3394	0.0819
0.0562	17.1053	2600	0.4243	0.3534	0.0843
0.0661	17.7632	2700	0.4021	0.3340	0.0791
0.0496	18.4211	2800	0.4052	0.3387	0.0818
0.0599	19.0789	2900	0.4101	0.3385	0.0806
0.0446	19.7368	3000	0.3990	0.3362	0.0810
0.0482	20.3947	3100	0.4077	0.3274	0.0781
0.0309	21.0526	3200	0.4343	0.3397	0.0817
0.0757	21.7105	3300	0.4154	0.3252	0.0781
0.0377	22.3684	3400	0.4273	0.3206	0.0770
0.0282	23.0263	3500	0.3998	0.3159	0.0751
0.0676	23.6842	3600	0.3960	0.3111	0.0745
0.0673	24.3421	3700	0.3997	0.3100	0.0741
0.1793	25.0	3800	0.4065	0.3106	0.0738
0.0572	25.6579	3900	0.3951	0.3098	0.0739
0.0208	26.3158	4000	0.4097	0.3106	0.0740
0.0562	26.9737	4100	0.4016	0.3081	0.0734
0.0314	27.6316	4200	0.3939	0.3008	0.0715
0.0235	28.2895	4300	0.4008	0.3023	0.0720
0.0443	28.9474	4400	0.3963	0.3033	0.0724
0.027	29.6053	4500	0.3977	0.3023	0.0722
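
The table above can also be scanned programmatically, for example to compare the final checkpoint with the best intermediate ones. The sketch below parses a tab-separated excerpt of the table (seven of the 45 rows, copied verbatim) and picks the rows with the lowest Wer and the lowest validation loss. The variable names are chosen for illustration only.

```python
import csv
import io

# Excerpt of the results table, tab-separated, values copied from the rows above.
TABLE_EXCERPT = (
    "Training Loss\tEpoch\tStep\tValidation Loss\tWer\tCer\n"
    "0.1553\t4.6053\t700\t0.3566\t0.3927\t0.0936\n"
    "0.0998\t9.8684\t1500\t0.3608\t0.3530\t0.0846\n"
    "0.0562\t26.9737\t4100\t0.4016\t0.3081\t0.0734\n"
    "0.0314\t27.6316\t4200\t0.3939\t0.3008\t0.0715\n"
    "0.0235\t28.2895\t4300\t0.4008\t0.3023\t0.0720\n"
    "0.0443\t28.9474\t4400\t0.3963\t0.3033\t0.0724\n"
    "0.027\t29.6053\t4500\t0.3977\t0.3023\t0.0722\n"
)

# Parse every cell as a float, keyed by the column header.
reader = csv.DictReader(io.StringIO(TABLE_EXCERPT), delimiter="\t")
rows = [{key: float(value) for key, value in row.items()} for row in reader]

best_wer_row = min(rows, key=lambda r: r["Wer"])
best_loss_row = min(rows, key=lambda r: r["Validation Loss"])
final_row = rows[-1]

print("lowest Wer:", int(best_wer_row["Step"]), best_wer_row["Wer"])
print("lowest validation loss:", int(best_loss_row["Step"]), best_loss_row["Validation Loss"])
print("final checkpoint:", int(final_row["Step"]), final_row["Wer"])
```

The same holds across the full table: the lowest Wer (0.3008) is logged at step 4200 and the lowest validation loss (0.3566) at step 700, while the headline figures reported for this model match the last logged row at step 4500.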

Framework versions

Transformers 4.42.0.dev0
Pytorch 2.3.1+cu121
Datasets 2.19.2
Tokenizers 0.19.1