---

language: pt
datasets:
- CORAA
- common_voice
- mls
- cetuc
- voxforge
metrics:
- wer
tags:
- audio
- speech
- wav2vec2
- pt
- portuguese-speech-corpus
- automatic-speech-recognition
- PyTorch
license: apache-2.0
model-index:
- name: Alef Iury XLSR Wav2Vec2 Large 53 Portuguese
  results:
  - task: 
      name: Speech Recognition
      type: automatic-speech-recognition
    metrics:
       - name: Test CORAA WER
         type: wer
         value: 24.89%
---


# Wav2vec 2.0 trained with CORAA Portuguese Dataset and Open Portuguese Datasets

This is a demonstration of a Wav2vec 2.0 model fine-tuned for Portuguese on the following datasets:

- [CORAA dataset](https://github.com/nilc-nlp/CORAA)
- [CETUC](http://www02.smt.ufrj.br/~igor.quintanilha/alcaim.tar.gz)
- [Multilingual Librispeech (MLS)](http://www.openslr.org/94/)
- [VoxForge](http://www.voxforge.org/)
- [Common Voice 6.1](https://commonvoice.mozilla.org/pt)
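
A fine-tuned Wav2vec 2.0 checkpoint of this kind can usually be run with the `transformers` library. The following is a minimal sketch, not the author's own inference script. It assumes that the checkpoint ships a `Wav2Vec2Processor` alongside the CTC model, that `model_id` is this model's Hub identifier (not stated here, so it is left as a parameter), and that the audio is a mono waveform already resampled to 16 kHz. The function name `transcribe` is hypothetical.

```python
def transcribe(waveform, model_id, sampling_rate=16_000):
    """Transcribe a mono waveform with a fine-tuned Wav2vec 2.0 CTC model.

    waveform: 1-D sequence of floats (assumed to be sampled at 16 kHz).
    model_id: Hub identifier or local path of the checkpoint (an assumption:
              substitute the identifier of this model).
    """
    # Imported lazily so the heavy dependencies load only when the function runs.
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    model.eval()

    # Normalise the audio and convert it to PyTorch tensors.
    inputs = processor(
        waveform,
        sampling_rate=sampling_rate,
        return_tensors="pt",
        padding=True,
    )

    with torch.no_grad():
        logits = model(**inputs).logits

    # Greedy CTC decoding: most likely token per frame, then collapse.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

Greedy decoding is used here for brevity; a language-model-backed decoder may give different transcriptions.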

## Repository

The repository with the code used to train and test the model is available [here](https://github.com/alefiury/SE-R_2022_Challenge_Wav2vec2).
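
The metric reported in this card's metadata is the word error rate (WER) on the CORAA test set. As an illustration of what that number measures, the sketch below computes WER as the word-level edit distance divided by the number of reference words. It is a plain reference implementation, and the helper name `word_error_rate` is hypothetical; the repository's evaluation code may normalise text differently or rely on a library.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One deleted word out of four reference words.
print(word_error_rate("o gato preto dorme", "o gato dorme"))  # 0.25
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.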