patrickvonplaten committed commit 8844157 (parent: c12bf04): Update README.md
README.md CHANGED

```diff
@@ -2,7 +2,6 @@
 language:
 - en
 datasets:
-- common_voice
 tags:
 - speech
 license: apache-2.0
@@ -14,6 +13,12 @@ license: apache-2.0
 
 The large model pretrained on 16kHz sampled speech audio with utterance and speaker contrastive loss. When using the model, make sure that your speech input is also sampled at 16kHz. Note that this model should be fine-tuned on a downstream task, like Automatic Speech Recognition. Check out [this blog](https://huggingface.co/blog/fine-tune-wav2vec2-english) for an in-depth explanation of how to fine-tune the model.
 
+The model was pre-trained on:
+
+- 60,000 hours of [Libri-Light](https://arxiv.org/abs/1912.07875)
+- 10,000 hours of [GigaSpeech](https://arxiv.org/abs/2106.06909)
+- 24,000 hours of [VoxPopuli](https://arxiv.org/abs/2101.00390)
+
 [Paper: UNISPEECH-SAT: UNIVERSAL SPEECH REPRESENTATION LEARNING WITH SPEAKER
 AWARE PRE-TRAINING](https://arxiv.org/abs/2110.05752)
```
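The README above stresses that input audio must be sampled at 16 kHz. As a minimal sketch of that preprocessing step, the snippet below builds a `Wav2Vec2FeatureExtractor` by hand and runs one second of dummy audio through it; the extractor parameter values here are illustrative assumptions, since the real configuration ships with the checkpoint, and the checkpoint id in the trailing comment is a placeholder for this repo's id.

```python
import torch
from transformers import Wav2Vec2FeatureExtractor

# Hand-built feature extractor; the parameter values are assumptions for
# illustration -- in practice the config is loaded from the checkpoint.
feature_extractor = Wav2Vec2FeatureExtractor(
    feature_size=1,
    sampling_rate=16000,
    padding_value=0.0,
    do_normalize=True,
)

# One second of dummy audio, already sampled at 16 kHz as the model expects.
waveform = torch.randn(16000)

inputs = feature_extractor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
print(inputs.input_values.shape)  # torch.Size([1, 16000])

# The preprocessed tensor would then be fed to the pretrained model, e.g.:
# model = UniSpeechSatModel.from_pretrained("<this checkpoint>")
# hidden_states = model(**inputs).last_hidden_state
```

Passing `sampling_rate=16000` to the extractor lets it verify the input rate matches what the model was pretrained on; audio recorded at another rate should be resampled first.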