---

language: pt
datasets:
- CORAA
- common_voice 
- mls
- cetuc
- voxforge
metrics:
- wer
tags:
- audio
- speech
- wav2vec2
- pt
- portuguese-speech-corpus
- automatic-speech-recognition
- PyTorch
license: apache-2.0
model-index:
- name: Alef Iury XLSR Wav2Vec2 Large 53 Portuguese
  results:
  - task:
      name: Speech Recognition
      type: automatic-speech-recognition
    metrics:
    - name: Test CORAA WER
      type: wer
      value: 24.89%
---


# Wav2vec 2.0 trained with CORAA Portuguese Dataset and Open Portuguese Datasets

This is a demonstration of a Wav2vec 2.0 model fine-tuned for Portuguese on the following datasets:

- [CORAA dataset](https://github.com/nilc-nlp/CORAA)
- [CETUC](http://www02.smt.ufrj.br/~igor.quintanilha/alcaim.tar.gz)
- [Multilingual Librispeech (MLS)](http://www.openslr.org/94/)
- [VoxForge](http://www.voxforge.org/)
- [Common Voice 6.1](https://commonvoice.mozilla.org/pt)
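
The metadata for this model reports a word error rate (WER) of 24.89% on the CORAA test set. As a rough illustration of what that metric measures, the sketch below computes WER as the word-level edit distance (substitutions, insertions, and deletions) divided by the number of reference words. The function name `word_error_rate` and the example sentences are assumptions made for illustration; the evaluation code in the repository may normalize text differently (casing, punctuation, accents), so this is not necessarily how the reported figure was produced.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative helper only (hypothetical name); no text normalization
    is applied beyond whitespace splitting.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (one fewer than the current row) and hyp_words[:j].
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]  # deleting all i reference words yields an empty hypothesis
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref_words)


if __name__ == "__main__":
    # Made-up example: one substituted word ("está" vs. "esta") out of five.
    reference = "o gato está no telhado"
    hypothesis = "o gato esta no telhado"
    print(f"WER: {word_error_rate(reference, hypothesis):.2%}")
```

On this scale, the reported 24.89% corresponds to a value of about 0.2489, that is, roughly one word-level error for every four reference words.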

## Repository

The repository that implements model training and testing is available [here](https://github.com/alefiury/SE-R_2022_Challenge_Wav2vec2).