Commit
•
5b9145f
1
Parent(s):
44b3929
Update README.md
README.md
CHANGED
@@ -17,17 +17,15 @@ datasets:
|
|
17 |
<img src="https://huggingface.co/datasets/parler-tts/images/resolve/main/thumbnail.png" alt="Parler Logo" width="800" style="margin-left:'auto' margin-right:'auto' display:'block'"/>
|
18 |
|
19 |
|
20 |
-
# Parler-TTS Mini: Expresso
|
21 |
|
22 |
-
|
23 |
-
|
24 |
-
<a target="_blank" href="https://huggingface.co/spaces/parler-tts/parler_tts_mini_expresso">
|
25 |
<img src="https://huggingface.co/datasets/huggingface/badges/raw/main/open-in-hf-spaces-sm.svg" alt="Open in HuggingFace"/>
|
26 |
</a>
|
27 |
|
28 |
-
**Parler-TTS Mini: Expresso
|
29 |
on the [Expresso](https://huggingface.co/datasets/ylacombe/expresso) dataset. It is a lightweight text-to-speech (TTS)
|
30 |
-
model that can generate high-quality, natural sounding speech. Compared to the original model, Expresso
|
31 |
superior control over **emotions** (happy, confused, laughing, sad) and **consistent voices** (Jerry, Thomas, Elisabeth, Talia).
|
32 |
|
33 |
It is part of the first release from the [Parler-TTS](https://github.com/huggingface/parler-tts) project, which aims to
|
@@ -72,7 +70,7 @@ sf.write("parler_tts_out.wav", audio_arr, model.config.sampling_rate)
|
|
72 |
* The model can generate in a range of emotions, including: "happy", "confused", "default" (meaning no particular emotion conveyed), "laughing", "sad", "whisper", "emphasis"
|
73 |
* Include the term "high quality audio" to generate the highest quality audio, and "very noisy audio" for high levels of background noise
|
74 |
* Punctuation can be used to control the prosody of the generations, e.g. use commas to add small breaks in speech
|
75 |
-
*
|
76 |
|
77 |
## Training Procedure
|
78 |
|
|
|
17 |
<img src="https://huggingface.co/datasets/parler-tts/images/resolve/main/thumbnail.png" alt="Parler Logo" width="800" style="margin-left:'auto' margin-right:'auto' display:'block'"/>
|
18 |
|
19 |
|
20 |
+
# Parler-TTS Mini: Expresso
|
21 |
|
22 |
+
<a target="_blank" href="https://huggingface.co/spaces/parler-tts/parler-tts-expresso">
|
|
|
|
|
23 |
<img src="https://huggingface.co/datasets/huggingface/badges/raw/main/open-in-hf-spaces-sm.svg" alt="Open in HuggingFace"/>
|
24 |
</a>
|
25 |
|
26 |
+
**Parler-TTS Mini: Expresso** is a fine-tuned version of [Parler-TTS Mini v0.1](https://huggingface.co/parler-tts/parler_tts_mini_v0.1)
|
27 |
on the [Expresso](https://huggingface.co/datasets/ylacombe/expresso) dataset. It is a lightweight text-to-speech (TTS)
|
28 |
+
model that can generate high-quality, natural sounding speech. Compared to the original model, Parler-TTS Expresso provides
|
29 |
superior control over **emotions** (happy, confused, laughing, sad) and **consistent voices** (Jerry, Thomas, Elisabeth, Talia).
|
30 |
|
31 |
It is part of the first release from the [Parler-TTS](https://github.com/huggingface/parler-tts) project, which aims to
|
|
|
70 |
* The model can generate in a range of emotions, including: "happy", "confused", "default" (meaning no particular emotion conveyed), "laughing", "sad", "whisper", "emphasis"
|
71 |
* Include the term "high quality audio" to generate the highest quality audio, and "very noisy audio" for high levels of background noise
|
72 |
* Punctuation can be used to control the prosody of the generations, e.g. use commas to add small breaks in speech
|
73 |
+
* To emphasise particular words, wrap them in asterisks (e.g. `*you*` in the example above) and include "emphasis" in the prompt
|
74 |
|
75 |
## Training Procedure
|
76 |
|
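The emphasis tip added in this commit can be sketched in plain Python. This is a minimal illustration, not part of the Parler-TTS API: `emphasise` and `build_description` are hypothetical helpers, and the sketch assumes the speaker, emotion, "emphasis" and audio-quality terms go into a separate description string, since the generation snippet is not visible in this diff. If your setup uses a single prompt string, adapt accordingly.

```python
import re


def emphasise(text: str, words: list[str]) -> str:
    """Wrap whole-word occurrences of `words` in asterisks (hypothetical helper).

    Words that are already wrapped in asterisks are left untouched, so the
    function can be applied more than once without doubling the markers.
    Matching is case-sensitive.
    """
    for word in words:
        pattern = r"(?<!\*)\b" + re.escape(word) + r"\b(?!\*)"
        text = re.sub(pattern, lambda m: "*" + m.group(0) + "*", text)
    return text


def build_description(
    speaker: str,
    emotion: str,
    emphasis: bool = False,
    quality: str = "high quality audio",
) -> str:
    """Assemble a description string from the terms listed in the tips.

    The sentence template is an assumption made for illustration only.
    """
    extras = (["emphasis"] if emphasis else []) + [quality]
    return f"{speaker} speaks in a {emotion} tone with " + " and ".join(extras) + "."


if __name__ == "__main__":
    prompt = emphasise("Why do you make me do this?", ["you"])
    description = build_description("Thomas", "sad", emphasis=True)
    print(prompt)
    print(description)
```

The two resulting strings would then be passed to the model in place of the hand-written ones; keeping the asterisk wrapping and the "emphasis" keyword in one place makes it harder to set one and forget the other.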