metadata
base_model: facebook/wav2vec2-large-xlsr-53
datasets:
  - fleurs
library_name: transformers
license: apache-2.0
metrics:
  - wer
tags:
  - generated_from_trainer
model-index:
  - name: wav2vec2-large-xlsr-oria-v0
    results:
      - task:
          type: automatic-speech-recognition
          name: Automatic Speech Recognition
        dataset:
          name: fleurs
          type: fleurs
          config: or_in
          split: None
          args: or_in
        metrics:
          - type: wer
            value: 0.4972150445018662
            name: Wer

wav2vec2-large-xlsr-oria-v0

This model is a fine-tuned version of facebook/wav2vec2-large-xlsr-53 on the or_in (Odia) configuration of the FLEURS dataset. It achieves the following results on the evaluation set at the last logged training step (step 1300):

  • Loss: 0.6050
  • Wer: 0.4972
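For readers unfamiliar with the metric, WER (word error rate) is the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The following is a minimal sketch of that definition, not the evaluation code used for this card; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words.

    Illustrative helper (hypothetical name); tokenizes on whitespace only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into an empty hypothesis costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substitution and one deletion against a four-word reference.
print(word_error_rate("a b c d", "a x c"))
```

Under this definition, a WER of 0.4972 means roughly one word-level error for every two reference words. A corpus-level score is typically computed by summing errors and reference words over all utterances rather than averaging per-utterance values.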

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 30
  • mixed_precision_training: Native AMP
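Two of the values above follow from the others, and a short sketch may make the relationship concrete. It assumes a single training device and a total of 1380 optimizer steps; neither figure is stated in this card, so both are illustrative assumptions. The schedule shown is the usual linear warmup followed by linear decay to zero, and the helper names are hypothetical.

```python
TRAIN_BATCH_SIZE = 16
GRADIENT_ACCUMULATION_STEPS = 2
NUM_DEVICES = 1  # assumption: the card does not state the device count


def effective_batch_size() -> int:
    """Examples contributing to one optimizer update."""
    return TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_DEVICES


def linear_warmup_lr(step: int,
                     base_lr: float = 3e-4,
                     warmup_steps: int = 500,
                     total_steps: int = 1380) -> float:
    """Learning rate at a given optimizer step.

    total_steps=1380 is an assumed value for illustration only.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(effective_batch_size())  # matches total_train_batch_size
for step in (0, 250, 500, 1000, 1380):
    print(step, linear_warmup_lr(step))
```

With these settings the peak learning rate of 0.0003 is reached only at step 500, so the first several hundred updates run at a reduced rate.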

Training results

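| Training Loss | Epoch   | Step | Validation Loss | Wer    |
|:-------------:|:-------:|:----:|:---------------:|:------:|
| 6.6722        | 2.1505  | 100  | 5.4667          | 1.0    |
| 3.4956        | 4.3011  | 200  | 3.4905          | 1.0    |
| 3.4526        | 6.4516  | 300  | 3.4624          | 1.0    |
| 3.4548        | 8.6022  | 400  | 3.4494          | 1.0    |
| 3.4352        | 10.7527 | 500  | 3.4267          | 1.0    |
| 3.0335        | 12.9032 | 600  | 2.8300          | 1.0    |
| 1.021         | 15.0538 | 700  | 0.9941          | 0.7938 |
| 0.6175        | 17.2043 | 800  | 0.7318          | 0.6385 |
| 0.5257        | 19.3548 | 900  | 0.6485          | 0.5820 |
| 0.4232        | 21.5054 | 1000 | 0.6105          | 0.5430 |
| 0.3202        | 23.6559 | 1100 | 0.5906          | 0.5192 |
| 0.2767        | 25.8065 | 1200 | 0.6025          | 0.5079 |
| 0.2679        | 27.9570 | 1300 | 0.6050          | 0.4972 |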
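The logged results can be inspected programmatically to see which checkpoint is best under each criterion. The sketch below copies the step, validation loss, and WER columns from the results above; the variable and function names are illustrative, and whether intermediate checkpoints were saved is not stated in this card.

```python
# (step, validation_loss, wer) copied from the training results.
LOG = [
    (100, 5.4667, 1.0),
    (200, 3.4905, 1.0),
    (300, 3.4624, 1.0),
    (400, 3.4494, 1.0),
    (500, 3.4267, 1.0),
    (600, 2.8300, 1.0),
    (700, 0.9941, 0.7938),
    (800, 0.7318, 0.6385),
    (900, 0.6485, 0.5820),
    (1000, 0.6105, 0.5430),
    (1100, 0.5906, 0.5192),
    (1200, 0.6025, 0.5079),
    (1300, 0.6050, 0.4972),
]


def best_by(column: int):
    """Return the log row with the smallest value in the given column."""
    return min(LOG, key=lambda row: row[column])


best_loss_row = best_by(1)  # lowest validation loss
best_wer_row = best_by(2)   # lowest WER

print("lowest validation loss:", best_loss_row)
print("lowest WER:", best_wer_row)
```

The two criteria disagree here: validation loss is lowest at step 1100 (0.5906), while WER keeps improving through step 1300 (0.4972), which is the result reported at the top of this card.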

Framework versions

  • Transformers 4.45.0.dev0
  • Pytorch 2.4.1+cu121
  • Datasets 2.21.0
  • Tokenizers 0.19.1