aioxlabs/tacotron-swahili

speech-synthesis

nairaxo committed on Oct 30, 2022

Commit 9e5e0b0 (1 parent: 822eb04)

Update README.md

Files changed (1)

README.md +6 -7

@@ -19,7 +19,7 @@ metrics:
 # Text-to-Speech (TTS) with Tacotron2 trained on LJSpeech
-This repository provides all the necessary tools for Text-to-Speech (TTS)  with SpeechBrain using a [Tacotron2](https://arxiv.org/abs/1712.05884) pretrained on [LJSpeech](https://keithito.com/LJ-Speech-Dataset/).
+This repository provides all the necessary tools for Text-to-Speech (TTS)  with SpeechBrain using a [Tacotron2](https://arxiv.org/abs/1712.05884) pretrained on [ALLFA Public](https://github.com/getalp/ALFFA_PUBLIC/tree/master/ASR/SWAHILI).
 The pre-trained model takes in input a short text and produces a spectrogram in output. One can get the final waveform by applying a vocoder (e.g., HiFIGAN) on top of the generated spectrogram.
@@ -30,7 +30,7 @@ The pre-trained model takes in input a short text and produces a spectrogram in
 pip install speechbrain
 ```
-Please notice that we encourage you to read our tutorials and learn more about
+Please notice that we encourage you to read the tutorials and learn more about
 [SpeechBrain](https://speechbrain.github.io).
 ### Perform Text-to-Speech (TTS)
@@ -41,7 +41,7 @@ from speechbrain.pretrained import Tacotron2
 from speechbrain.pretrained import HIFIGAN
 # Intialize TTS (tacotron2) and Vocoder (HiFIGAN)
-tacotron2 = Tacotron2.from_hparams(source="nairaxo/tacotron-swahili", savedir="tmpdir_tts")
+tacotron2 = Tacotron2.from_hparams(source="aioxlabs/tacotron-swahili", savedir="tmpdir_tts")
 hifi_gan = HIFIGAN.from_hparams(source="speechbrain/tts-hifigan-ljspeech", savedir="tmpdir_vocoder")
 # Running the TTS
@@ -58,11 +58,10 @@ If you want to generate multiple sentences in one-shot, you can do in this way:
 ```
 from speechbrain.pretrained import Tacotron2
-tacotron2 = Tacotron2.from_hparams(source="speechbrain/TTS_Tacotron2", savedir="tmpdir")
+tacotron2 = Tacotron2.from_hparams(source="aioxlabs/tacotron-swahili", savedir="tmpdir")
 items = [
-       "A quick brown fox jumped over the lazy dog",
-       "How much wood would a woodchuck chuck?",
-       "Never odd or even"
+       "raisi wa jumhuri ya tanzania",
+       "soma zaidi"
      ]
 mel_outputs, mel_lengths, alignments = tacotron2.encode_batch(items)
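
The batch example in this diff stops at the spectrograms. As a rough sketch of the remaining step the README describes (applying the vocoder to get waveforms), something like the following may work. The helper names `synthesize_batch` and `run_demo`, the output file names, and the 22050 Hz sample rate (taken to be the rate of the LJSpeech HiFiGAN vocoder) are assumptions for illustration and are not part of this commit.

```python
def synthesize_batch(tts, vocoder, texts):
    """Return (text, waveform) pairs for a list of sentences.

    `tts` is expected to expose encode_batch(texts) and `vocoder`
    decode_batch(mels), as the SpeechBrain Tacotron2 and HIFIGAN
    interfaces used in the README do.
    """
    # Text -> mel spectrograms (padded to the longest item in the batch).
    mel_outputs, mel_lengths, alignments = tts.encode_batch(texts)
    # Mel spectrograms -> waveforms, one per input sentence.
    waveforms = vocoder.decode_batch(mel_outputs)
    return list(zip(texts, waveforms))


def run_demo():
    """Hypothetical end-to-end run; downloads both pretrained models."""
    import torchaudio
    from speechbrain.pretrained import HIFIGAN, Tacotron2

    tacotron2 = Tacotron2.from_hparams(
        source="aioxlabs/tacotron-swahili", savedir="tmpdir_tts"
    )
    hifi_gan = HIFIGAN.from_hparams(
        source="speechbrain/tts-hifigan-ljspeech", savedir="tmpdir_vocoder"
    )
    items = ["raisi wa jumhuri ya tanzania", "soma zaidi"]
    for i, (text, wav) in enumerate(synthesize_batch(tacotron2, hifi_gan, items)):
        # Each waveform is assumed to have shape [1, time], i.e. one channel.
        # 22050 Hz is an assumed sample rate, not stated in this diff.
        torchaudio.save(f"example_{i}.wav", wav.cpu(), 22050)
```

Calling `run_demo()` requires network access and the `speechbrain` and `torchaudio` packages. Because the batch is padded, waveforms for shorter sentences may carry trailing audio past the end of the text.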