Spaces: ntt123/Vietnam-male-voice-TTS

Increase the output time?

by phamhuyhung - opened Aug 24, 2023

Discussion

phamhuyhung

Aug 24, 2023

Hello, is there a way to increase the output time? Currently, I can only generate about 30 seconds.

ntt123

Owner Aug 24, 2023

Hi, there's a limit on max input length here: https://huggingface.co/spaces/ntt123/Vietnam-male-voice-TTS/blob/64347565e275d8ac02b3055ea4c03f2ea368585d/app.py#L139
Remove it if you want to generate longer clips.

phamhuyhung

Aug 24, 2023

It appears that the resources available on Hugging Face are restricted, which in turn prevents the generation of longer audio segments.
Would it be possible for you to provide more comprehensive guidelines regarding the installation process on a computer and the step-by-step procedures for data preparation, model training, and other related tasks? This way, individuals without programming backgrounds like myself could potentially carry out these tasks on our own computers.
Finally, thank you for initiating and sharing this remarkable project with the community.

ntt123

Owner Aug 24, 2023

Hi, please refer to the project at NTT123/light-speed for detailed information regarding data preparation, model training, etc.

phamhuyhung

Aug 24, 2023

I genuinely apologize, but as someone new to programming, I couldn't follow any of the instructions on GitHub, not even the initial computer setup step. I hope that when you have the time, you could rewrite the instructions step by step so that people like me can continue to contribute to the project.

baobao01

Aug 26, 2023 (edited Aug 26, 2023)

Hi, there's a limit on max input length here: https://huggingface.co/spaces/ntt123/Vietnam-male-voice-TTS/blob/64347565e275d8ac02b3055ea4c03f2ea368585d/app.py#L139
Remove it if you want to generate longer clips.

I deleted this code from app.py and rebuilt the Docker image:

if len(text) > 500:
    text = text[:500]

The app runs for about 30 seconds and then crashes when I input a 5000-word text. When I run Docker locally (http://localhost:7860), the container shuts down, and I haven't found any error logs in the container. With an 800-word input, no errors occur. Please help, thank you so much!

ntt123

Owner Aug 26, 2023

Hi, the program likely crashes on long clips because of an out-of-memory error.
To avoid this, create shorter clips at the sentence or paragraph level and then combine them to make a long clip. This also aligns with training data that uses 5-10 second clips.
Here is a template: Long clip = short clip 1 + 400ms silence + short clip 2 + 400ms silence + short clip 3 + ...
You can use a Python audio-manipulation library such as pydub for this; see the pydub API documentation.
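The template above can be sketched roughly as follows. This is a minimal illustration, not the code used in the demo. It relies only on the Python standard library and assumes each short clip is available as raw 16-bit mono PCM bytes at a known sample rate. The names `split_into_chunks`, `join_with_silence`, and `SAMPLE_WIDTH` are hypothetical helpers introduced for this example, and the 500-character chunk size simply mirrors the limit discussed earlier in the thread.

```python
import re

SAMPLE_WIDTH = 2  # bytes per sample; assumes 16-bit mono PCM (an assumption)


def split_into_chunks(text, max_chars=500):
    """Group whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept as its own chunk.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit: close the chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def join_with_silence(clips, sample_rate, gap_ms=400):
    """Concatenate raw PCM clips with gap_ms of silence between them."""
    n_silent_samples = sample_rate * gap_ms // 1000
    silence = b"\x00" * (n_silent_samples * SAMPLE_WIDTH)
    return silence.join(clips)
```

Each chunk would be passed to the TTS model separately, so memory use depends on the longest chunk rather than on the whole text, and the resulting clips are then joined. With pydub, the equivalent join would use `AudioSegment.silent(duration=400)` and the `+` operator on `AudioSegment` objects.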

ntt123

Owner Aug 26, 2023

Hi @baobao01 @phamhuyhung, I've added a feature for generating long clips to the demo. Thank you for your comments.

baobao01

Aug 26, 2023 (edited Aug 26, 2023)

Thank you very much. I provided an input of 10,000 words and the application worked perfectly.

phamhuyhung

Aug 26, 2023

@baobao01 Hello, since the app developer is quite busy, could you please provide me with a way to contact you so I can ask about how to install the app on localhost like you did?

phamhuyhung

Aug 26, 2023 (edited Aug 26, 2023)

@ntt123 Hi, I found a bug after you added the long-clip feature to the demo. When I generate a paragraph of about 60 seconds, part of the audio is dropped, and then the reading continues.

baobao01

Aug 27, 2023 (edited Aug 27, 2023)

@phamhuyhung You should install it on your localhost for the most accurate results, as the app running on this space is only for demonstration purposes and may be affected by hardware limitations of the free account or network connectivity issues. Please contact me via [email protected] for assistance.

phamhuyhung

Aug 27, 2023

@baobao01 Hi, I have sent you an email. I hope you can spare some time to help me. Thank you very much.

phamhuyhung changed discussion status to closed Aug 29, 2023
