metadata
library_name: speechbrain
pipeline_tag: text-to-speech
language: en
tags:
- text-to-speech
- TTS
- speech-synthesis
- speechbrain
license: apache-2.0
datasets:
- LJSpeech
Text-to-Speech (TTS) with Transformer trained on LJSpeech
This repository provides all the necessary tools for Text-to-Speech (TTS) with SpeechBrain using a Transformer pretrained on LJSpeech.
The pretrained model takes text as input and produces a mel spectrogram as output. To obtain the final waveform, apply a vocoder (e.g., HiFi-GAN) to the generated spectrogram.
Perform Text-to-Speech (TTS)
import torchaudio
from speechbrain.inference.vocoders import HIFIGAN
texts = ["This is the example text"]
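# Initialize the TTS model. TextToSpeech is not imported in this snippet; it must be
# imported or defined first (presumably this repository's custom inference class).
# `source` is the directory that holds the model files.
my_tts_model = TextToSpeech.from_hparams(source="/content/")
# Initialize the vocoder (HiFi-GAN) model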
hifi_gan = HIFIGAN.from_hparams(source="speechbrain/tts-hifigan-ljspeech", savedir="tmpdir_vocoder")
# Running the TTS
mel_output = my_tts_model.encode_text(texts)
# Running Vocoder (spectrogram-to-waveform)
waveforms = hifi_gan.decode_batch(mel_output)
# Save the waveform (22050 Hz is the LJSpeech sampling rate)
torchaudio.save("example_TTS.wav", waveforms.squeeze(1), 22050)
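The snippet above synthesizes a single sentence. With several input texts, `decode_batch` returns a tensor of shape `[batch, 1, time]`, so `squeeze(1)` would make `torchaudio.save` treat each sentence as a separate audio channel of one file. A minimal sketch of writing one file per sentence follows. The helper names `split_batch` and `save_batch` and the file-name pattern are hypothetical and not part of SpeechBrain, and the sketch assumes the TTS model accepts a list of several texts.

```python
import torch

SAMPLE_RATE = 22050  # LJSpeech sampling rate, as in the snippet above


def split_batch(waveforms: torch.Tensor) -> list:
    """Split a [batch, 1, time] tensor into a list of [1, time] tensors."""
    if waveforms.dim() != 3 or waveforms.size(1) != 1:
        raise ValueError(f"expected [batch, 1, time], got {tuple(waveforms.shape)}")
    # Indexing the batch dimension keeps the channel dimension intact,
    # which is the [channels, time] layout torchaudio.save expects.
    return [waveforms[i].detach().cpu() for i in range(waveforms.size(0))]


def save_batch(waveforms: torch.Tensor, prefix: str = "example_TTS") -> list:
    """Write each waveform to '<prefix>_<index>.wav' and return the paths."""
    import torchaudio  # imported here so split_batch works without torchaudio

    paths = []
    for i, wav in enumerate(split_batch(waveforms)):
        path = f"{prefix}_{i}.wav"
        torchaudio.save(path, wav, SAMPLE_RATE)
        paths.append(path)
    return paths
```

Calling `save_batch(waveforms)` in place of the final `torchaudio.save` line would then produce `example_TTS_0.wav`, `example_TTS_1.wav`, and so on. Shorter sentences in a batch may carry trailing padding, which this sketch does not trim.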