metadata
base_model: facebook/wav2vec2-xls-r-300m
datasets:
  - fleurs
license: apache-2.0
metrics:
  - wer
tags:
  - generated_from_trainer
model-index:
  - name: wav2vec2-xlsr-fula-google-fleurs-5-hours
    results:
      - task:
          type: automatic-speech-recognition
          name: Automatic Speech Recognition
        dataset:
          name: fleurs
          type: fleurs
          config: ff_sn
          split: None
          args: ff_sn
        metrics:
          - type: wer
            value: 0.646049896049896
            name: Wer

wav2vec2-xlsr-fula-google-fleurs-5-hours

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the ff_sn configuration of the fleurs dataset. It achieves the following results on the evaluation set (these match the last logged checkpoint, step 800):

  • Loss: 1.1949
  • Wer: 0.6460
  • Cer: 0.2359
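
The card does not say how Wer and Cer were computed. As a minimal sketch of the standard definitions (edit distance divided by reference length, over words for Wer and over characters for Cer), assuming no text normalization beyond whitespace splitting, the two metrics can be reproduced as follows. The function names and the toy sentences are illustrative and do not come from the training code or the dataset.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(references, hypotheses):
    """Word error rate: total word-level edits / total reference words."""
    errors = sum(edit_distance(r.split(), h.split())
                 for r, h in zip(references, hypotheses))
    total = sum(len(r.split()) for r in references)
    return errors / total


def cer(references, hypotheses):
    """Character error rate: total character-level edits / total reference characters."""
    errors = sum(edit_distance(list(r), list(h))
                 for r, h in zip(references, hypotheses))
    total = sum(len(r) for r in references)
    return errors / total


# Toy example (made-up English strings, not model output).
refs = ["the cat sat on the mat"]
hyps = ["the cat sit on mat"]
print(f"Wer: {wer(refs, hyps):.4f}  Cer: {cer(refs, hyps):.4f}")
```

Under these definitions a Wer of 0.6460 means roughly two word-level edits for every three reference words, while the lower Cer of 0.2359 suggests many of the word errors differ from the reference by only a few characters.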

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

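The model was trained and evaluated on the ff_sn configuration of the fleurs dataset. The evaluation split is not recorded (the metadata lists split: None). More information needed.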

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 50
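
To make the interaction of these settings concrete, here is a minimal sketch of the effective batch size and of a linear schedule with warmup, written in plain Python rather than with the Trainer's own scheduler. Two values are assumptions that the card does not state: a single training device, and total_steps=900, chosen only for illustration.

```python
# Values from the hyperparameter list above.
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 4
PEAK_LR = 3e-4
WARMUP_STEPS = 500

# Assumptions (not stated in the card).
NUM_DEVICES = 1
TOTAL_STEPS = 900

# Samples contributing to each optimizer step.
effective_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES


def linear_schedule_lr(step, peak_lr=PEAK_LR,
                       warmup_steps=WARMUP_STEPS, total_steps=TOTAL_STEPS):
    """Learning rate at a given optimizer step: linear ramp up, then linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


print("effective batch size:", effective_batch_size)
for s in (0, 250, 500, 700, 900):
    print(s, linear_schedule_lr(s))
```

With a single device, 16 × 4 reproduces the listed total_train_batch_size of 64. Note that if the run really is on the order of 900 optimizer steps, a 500-step warmup covers more than half of training.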

Training results

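| Training Loss | Epoch | Step | Validation Loss | Wer    | Cer    |
|---------------|-------|------|-----------------|--------|--------|
| 7.1138        | 10.96 | 200  | 2.9561          | 1.0    | 1.0    |
| 2.8708        | 21.92 | 400  | 2.0221          | 1.0    | 0.6369 |
| 1.0031        | 32.88 | 600  | 0.9750          | 0.6509 | 0.2222 |
| 0.4471        | 43.84 | 800  | 1.1949          | 0.6460 | 0.2359 |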
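
The logged rows can be read a little further with some arithmetic. The sketch below copies the rows above into a list, picks the best checkpoint by Wer and by validation loss, and estimates the number of micro-batches per epoch from the step/epoch ratio. That estimate assumes the Epoch column advances by one per full pass over the training batches and that each optimizer step consumes 4 micro-batches (gradient_accumulation_steps). It is a rough inference, not something the card states.

```python
GRAD_ACCUM_STEPS = 4  # from the hyperparameter list

# (training_loss, epoch, step, validation_loss, wer, cer), copied from the table above
rows = [
    (7.1138, 10.96, 200, 2.9561, 1.0, 1.0),
    (2.8708, 21.92, 400, 2.0221, 1.0, 0.6369),
    (1.0031, 32.88, 600, 0.9750, 0.6509, 0.2222),
    (0.4471, 43.84, 800, 1.1949, 0.6460, 0.2359),
]

# Checkpoint with the lowest Wer and with the lowest validation loss.
best_by_wer = min(rows, key=lambda r: r[4])
best_by_loss = min(rows, key=lambda r: r[3])

# Estimated micro-batches per epoch: optimizer steps * accumulation / epochs elapsed.
micro_batches_per_epoch = [round(step * GRAD_ACCUM_STEPS / epoch)
                           for _, epoch, step, *_ in rows]

print("best Wer at step", best_by_wer[2])
print("best validation loss at step", best_by_loss[2])
print("estimated micro-batches per epoch:", micro_batches_per_epoch)
```

Two things stand out. The step-800 checkpoint has the lowest Wer, but validation loss and Cer were both better at step 600, so which checkpoint is "best" depends on the metric. Separately, every row gives the same micro-batch estimate, which at a batch size of 16 would correspond to a training set on the order of a thousand utterances.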

Framework versions

  • Transformers 4.38.1
  • Pytorch 2.1.0+cu118
  • Datasets 2.17.0
  • Tokenizers 0.15.2