Update README.md
README.md
CHANGED
@@ -77,11 +77,6 @@ audio_arr = generation.cpu().numpy().squeeze()
 sf.write("parler_tts_out.wav", audio_arr, model.config.sampling_rate)
 ```
 
-**Tips**:
-* Include the term "very clear audio" to generate the highest quality audio, and "very noisy audio" for high levels of background noise
-* Punctuation can be used to control the prosody of the generations, e.g. use commas to add small breaks in speech
-* The remaining speech features (gender, speaking rate, pitch and reverberation) can be controlled directly through the prompt
-
 ### 🎯 Using a specific speaker
 
 To ensure speaker consistency across generations, this checkpoint was also trained on 34 speakers, characterized by name (e.g. Jon, Lea, Gary, Jenna, Mike, Laura).
@@ -110,6 +105,11 @@ audio_arr = generation.cpu().numpy().squeeze()
 sf.write("parler_tts_out.wav", audio_arr, model.config.sampling_rate)
 ```
 
+**Tips**:
+* Include the term "very clear audio" to generate the highest quality audio, and "very noisy audio" for high levels of background noise
+* Punctuation can be used to control the prosody of the generations, e.g. use commas to add small breaks in speech
+* The remaining speech features (gender, speaking rate, pitch and reverberation) can be controlled directly through the prompt
+
 ## Motivation
 
 Parler-TTS is a reproduction of work from the paper [Natural language guidance of high-fidelity text-to-speech with synthetic annotations](https://www.text-description-to-speech.com) by Dan Lyth and Simon King, from Stability AI and Edinburgh University respectively.