metadata

library_name: transformers
license: apache-2.0
base_model: facebook/wav2vec2-base
tags:
  - generated_from_trainer
datasets:
  - common_voice_17_0
metrics:
  - wer
model-index:
  - name: wav2vec2-romanian-test
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: common_voice_17_0
          type: common_voice_17_0
          config: ro
          split: test
          args: ro
        metrics:
          - name: Wer
            type: wer
            value: 0.9989733059548255

wav2vec2-romanian-test

This model is a fine-tuned version of facebook/wav2vec2-base on the Romanian (ro) configuration of the common_voice_17_0 dataset. It achieves the following results on the evaluation set (the test split):

Loss: 0.3928
Wer: 0.9990

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

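The model was fine-tuned on common_voice_17_0 and evaluated on the test split of its Romanian (ro) configuration. More information needed on the training split and preprocessing.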

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 32
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 1000
num_epochs: 30
mixed_precision_training: Native AMP
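
As a rough illustration of how the linear scheduler and the 1000 warmup steps listed above interact, the sketch below computes the learning rate at a given optimizer step. It follows the usual shape of a linear schedule with warmup (ramp up linearly to learning_rate, then decay linearly to zero). The steps-per-epoch figure is not stated in this card; it is inferred from the step/epoch ratio in the training results (500 steps at epoch 1.7730), and the total step count assumes the run completed all 30 epochs. The names `linear_schedule_lr`, `STEPS_PER_EPOCH`, and `TOTAL_STEPS` are hypothetical and used only for this example.

```python
# Values taken from the hyperparameter list in this card.
BASE_LR = 1e-4
WARMUP_STEPS = 1000
NUM_EPOCHS = 30

# Assumption: inferred from the training log (500 steps at epoch 1.7730),
# not stated explicitly anywhere in the card.
STEPS_PER_EPOCH = round(500 / 1.7730)
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_schedule_lr(step: int) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay."""
    if step < WARMUP_STEPS:
        # Warmup phase: ramp linearly from 0 up to BASE_LR.
        return BASE_LR * step / max(1, WARMUP_STEPS)
    # Decay phase: fall linearly from BASE_LR to 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    for step in (0, 500, 1000, 4000, 8000, TOTAL_STEPS):
        print(f"step {step:>5}: lr = {linear_schedule_lr(step):.2e}")
```

Under these assumptions the learning rate peaks at step 1000, so the first two logged evaluations (steps 500 and 1000) fall inside the warmup phase.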

Training results

Training Loss	Epoch	Step	Validation Loss	Wer
4.4031	1.7730	500	1.7235	1.0000
0.8308	3.5461	1000	0.5378	0.9997
0.4317	5.3191	1500	0.4410	0.9995
0.3127	7.0922	2000	0.4157	0.9992
0.2468	8.8652	2500	0.4119	0.9987
0.2086	10.6383	3000	0.3922	0.9995
0.1787	12.4113	3500	0.3861	0.9990
0.1601	14.1844	4000	0.3829	0.9987
0.1459	15.9574	4500	0.3929	0.9990
0.1315	17.7305	5000	0.3983	0.9990
0.1218	19.5035	5500	0.4068	0.9987
0.1138	21.2766	6000	0.4139	0.9990
0.1070	23.0496	6500	0.3851	0.9990
0.0983	24.8227	7000	0.3820	0.9992
0.0937	26.5957	7500	0.3962	0.9990
0.0909	28.3688	8000	0.3928	0.9990
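
To help read the Wer column, here is a minimal, standalone sketch of how word error rate is commonly defined: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a hypothesis, divided by the number of reference words. This card does not say which implementation produced the reported values, so the function below (`word_error_rate`, a hypothetical name) is an illustration of the metric and not a reproduction of the evaluation code. It scores a single utterance, whereas metric libraries typically aggregate errors over the whole evaluation set.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (standard Levenshtein recurrence).
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i] + [0] * len(hyp_words)
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref_words)


if __name__ == "__main__":
    print(word_error_rate("buna ziua tuturor", "buna ziua tuturor"))
    print(word_error_rate("buna ziua tuturor", "buna seara tuturor"))
    print(word_error_rate("buna ziua", "Buna Ziua"))
```

Words are compared by exact string match in this sketch, so differences in casing or punctuation count as substitutions. A value of 0.9990, as in the last row of the table, means the number of word errors is almost equal to the number of reference words.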

Framework versions

Transformers 4.44.2
PyTorch 2.4.1+cu124
Datasets 2.21.0
Tokenizers 0.19.1