---
language: es
tags:
- audio
- automatic-speech-recognition
- voxpopuli
license: cc-by-nc-4.0
---

# Wav2Vec2-Base-VoxPopuli-Finetuned

[Facebook's Wav2Vec2](https://ai.facebook.com/blog/wav2vec-20-learning-the-structure-of-speech-from-raw-audio/) base model, pretrained on the 10K unlabeled subset of the [VoxPopuli corpus](https://arxiv.org/abs/2101.00390) and fine-tuned on the transcribed Spanish (`es`) data. See Table 1 of the paper for more information.

**Paper**: *[VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation](https://arxiv.org/abs/2101.00390)*

**Authors**: *Changhan Wang, Morgane Riviere, Ann Lee, Anne Wu, Chaitanya Talnikar, Daniel Haziza, Mary Williamson, Juan Pino, Emmanuel Dupoux* from *Facebook AI*

For more information, see the [official VoxPopuli repository](https://github.com/facebookresearch/voxpopuli/).
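
Because the model described above is fine-tuned for Spanish speech recognition, a short inference sketch may help. The code below is a minimal, non-authoritative example: it assumes the checkpoint is published on the Hugging Face Hub under the id `facebook/wav2vec2-base-10k-voxpopuli-ft-es` (an assumption, substitute the id of the repository you are using), that `torch` and `transformers` are installed, and that the input is 16 kHz mono audio. The names `validate_audio` and `transcribe` are hypothetical helpers introduced only for illustration.

```python
# Wav2Vec2 models operate on 16 kHz mono audio.
EXPECTED_SAMPLING_RATE = 16_000

# Assumed Hub id for this checkpoint; replace it with the id of your repository.
MODEL_ID = "facebook/wav2vec2-base-10k-voxpopuli-ft-es"


def validate_audio(waveform, sampling_rate):
    """Reject input that does not match what the model expects (hypothetical helper)."""
    if sampling_rate != EXPECTED_SAMPLING_RATE:
        raise ValueError(
            f"expected {EXPECTED_SAMPLING_RATE} Hz audio, got {sampling_rate} Hz; "
            "resample the waveform first"
        )
    if len(waveform) == 0:
        raise ValueError("waveform is empty")
    return waveform


def transcribe(waveform, sampling_rate, model_id=MODEL_ID):
    """Greedy CTC transcription of one mono waveform (1-D sequence of floats)."""
    # Heavy dependencies are imported lazily so validate_audio works without them.
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    validate_audio(waveform, sampling_rate)

    # Downloads the processor and weights on first use.
    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    model.eval()

    # Normalize the raw samples and convert them to a PyTorch tensor.
    inputs = processor(waveform, sampling_rate=sampling_rate, return_tensors="pt")

    # Forward pass without gradient tracking, then pick the best token per frame.
    with torch.no_grad():
        logits = model(inputs.input_values).logits
    predicted_ids = torch.argmax(logits, dim=-1)

    # batch_decode collapses repeated tokens and removes CTC blanks.
    return processor.batch_decode(predicted_ids)[0]
```

A typical call would load an audio file with a library of your choice, resample it to 16 kHz if necessary, and pass the samples as `transcribe(samples, 16_000)`. Loading the model inside the function keeps the sketch short; for repeated calls, load the processor and model once and reuse them.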