Update the README about the different synthesizers.
app.py
CHANGED
@@ -242,12 +242,19 @@ type=['wav'])
 with about:
 #st.header("How it works")
 st.markdown('''# Mockingbird TTS Demo
-This page is a demo of the openly available Text to Speech models for various languages of interest. Currently,
+This page is a demo of the openly available Text to Speech models for various languages of interest. Currently, 3 synthesizers with multilingual offerings are supported out of the box:
 - [**Meta's Massively Multilingual Speech (MMS)**](https://ai.meta.com/blog/multilingual-model-speech-recognition/) model, which supports over 1000 languages.[^1]
-- [**Coqui's TTS**](https://docs.coqui.ai/en/latest/#) package;[^2] while no longer supported, Coqui acted as a hub for TTS model hosting and these models are still available.
-- [**ESpeak-NG's**](https://github.com/espeak-ng/espeak-ng/tree/master)'s synthetic voices**[^3]
 - [**IMS Toucan**](https://github.com/DigitalPhonetics/IMS-Toucan), which supports 7000 languages.[^4]
+- [**ESpeak-NG's**](https://github.com/espeak-ng/espeak-ng/tree/master)'s synthetic voices**[^3]
+
+On a case-by-case basis, for different languages of interest, I have added:
+- [**Coqui's TTS**](https://docs.coqui.ai/en/latest/#) package;[^2] while no longer supported, Coqui acted as a hub for TTS model hosting and these models are still available. Languages must be added on a model-by-model basis.
+- Specific fine-tuned variants of Meta's MMS (either fine-tuned by [Yoach Lacombe](https://huggingface.co/ylacombe), or fine-tuned by me using his scripts).
+
+I am in the process of adding support for:
 - [**Piper**](https://github.com/rhasspy/piper), a TTS system that supports multiple voices per language and approximately 30 languages.[^5]
+- [**African Voices**](https://github.com/neulab/AfricanVoices), a CMU research project that fine-tuned synthesizers for different African languages. The site hosting the synthesizers is deprecated but they can be downloaded from Google's Wayback Machine. [^6]
+
 
 Voice conversion is currently achieved through Coqui.
 
@@ -268,6 +275,7 @@ Notes:
 [^3]: [Language list](https://github.com/espeak-ng/espeak-ng/blob/master/docs/languages.md)
 [^4]: Language list is available in the Gradio API documentation [here](https://huggingface.co/spaces/Flux9665/MassivelyMultilingualTTS).
 [^5]: The list of available voices is [here](https://github.com/rhasspy/piper/blob/master/VOICES.md), model checkpoints are [here](https://huggingface.co/datasets/rhasspy/piper-checkpoints/tree/main), and they can be tested [here](https://rhasspy.github.io/piper-samples/).
+[^6]:
 ''')
 
 