ttsdoc / README.md

Sethu Iyer


	---
	title: ttsdoc
	emoji: 🌖
	colorFrom: yellow
	colorTo: gray
	sdk: gradio
	sdk_version: 4.41.0
	app_file: app.py
	pinned: false
	license: apache-2.0
	---
	# ttsdoc 🌖

	ttsdoc is a Text-to-Speech (TTS) application that reads your documents aloud. It uses the Parler TTS Mini v1 model to generate audio from text you enter directly or from uploaded PDF, TXT, or DOCX files.

	## Features

	- 📄 Support for PDF, TXT, and DOCX file uploads
	- ✍️ Direct text input option
	- 🗣️ Customizable voice descriptions
	- ⏱️ Adjustable maximum audio duration
	- 🚀 GPU-accelerated audio generation

	## How to Use

	1. Upload a PDF, TXT, or DOCX file, or enter text directly.
	2. Customize the voice description if desired.
	3. Adjust the maximum audio duration.
	4. Click "Generate Audio" to create the TTS output.
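
Step 1 accepts three file types. The sketch below shows one way the text could be pulled out of each before it is sent to the model. It assumes the third-party `pypdf` and `python-docx` packages for PDF and DOCX files; this README does not say which libraries `app.py` actually uses, and `extract_text` is a hypothetical helper name.

```python
from pathlib import Path


def extract_text(path: str) -> str:
    """Return the plain text of a PDF, TXT, or DOCX file (hypothetical helper)."""
    suffix = Path(path).suffix.lower()

    if suffix == ".txt":
        # Plain text needs no extra dependency.
        return Path(path).read_text(encoding="utf-8")

    if suffix == ".pdf":
        # Assumption: pypdf is installed. Imported lazily so that
        # TXT-only use works without it.
        from pypdf import PdfReader

        reader = PdfReader(path)
        # extract_text() can yield nothing for image-only pages.
        return "\n".join(page.extract_text() or "" for page in reader.pages)

    if suffix == ".docx":
        # Assumption: python-docx is installed (it is imported as `docx`).
        import docx

        document = docx.Document(path)
        return "\n".join(paragraph.text for paragraph in document.paragraphs)

    raise ValueError(f"Unsupported file type: {suffix or 'no extension'}")
```

Dispatching on the file extension keeps each format's dependency isolated, so an unsupported upload fails early with a clear error instead of reaching the model.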

	## Tips for Best Results

	- For longer texts, audio is generated only up to the specified maximum duration, so raise the limit if the output ends before the text does.
	- Experiment with different voice descriptions to achieve the desired output.
	- Use punctuation to control pacing and intonation in the generated speech.
	- For the best quality, keep individual sentences and paragraphs concise.
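
Two of the tips above can be illustrated with small helpers: grouping whole sentences into short chunks, and converting a maximum duration into a sample count. This is a minimal sketch, not the app's own code; `split_into_chunks`, `max_samples`, and the 200-character default are hypothetical choices made for illustration.

```python
import re


def split_into_chunks(text: str, max_chars: int = 200) -> list[str]:
    """Group whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept intact as its own chunk.
    """
    # Split after ., ! or ? followed by whitespace. This is a simple
    # heuristic and will mis-split abbreviations such as "e.g. this".
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}" if current else sentence
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def max_samples(max_duration_s: float, sampling_rate: int) -> int:
    """Number of audio samples that fit in max_duration_s seconds."""
    return int(max_duration_s * sampling_rate)
```

With these, generated audio held in an array `audio` could be capped with `audio[:max_samples(limit_s, sampling_rate)]`, where `limit_s` is the duration chosen in the interface.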

	## Technical Details

	- This demo uses the Parler TTS Mini v1 model.
	- Audio generation is GPU-accelerated for faster processing.
	- Maximum file size for uploads: 5 MB
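
The details above can be sketched in code. The generation function follows the usage pattern published in the Parler-TTS project README (a voice description plus the text to speak); the upload check is an illustration only. `check_upload_size` and `synthesize` are hypothetical names, reading "5MB" as 5 MiB is an assumption, and the `torch`, `transformers`, and `parler_tts` packages are assumed to be installed.

```python
import os

MODEL_ID = "parler-tts/parler-tts-mini-v1"  # Hub id of Parler TTS Mini v1
MAX_UPLOAD_BYTES = 5 * 1024 * 1024  # Assumption: "5MB" is read as 5 MiB


def check_upload_size(path: str, limit: int = MAX_UPLOAD_BYTES) -> None:
    """Raise ValueError if the file at path is larger than limit bytes."""
    size = os.path.getsize(path)
    if size > limit:
        raise ValueError(f"File is {size} bytes; the limit is {limit} bytes")


def synthesize(text: str, description: str):
    """Return (audio, sampling_rate) for text spoken in the described voice."""
    # Heavy dependencies are imported lazily so check_upload_size
    # stays usable without them.
    import torch
    from parler_tts import ParlerTTSForConditionalGeneration
    from transformers import AutoTokenizer

    # Use the GPU when one is available, otherwise fall back to the CPU.
    device = "cuda:0" if torch.cuda.is_available() else "cpu"

    # For brevity the model is loaded on every call; a long-running app
    # would load it once at startup.
    model = ParlerTTSForConditionalGeneration.from_pretrained(MODEL_ID).to(device)
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

    # The description conditions the voice; the prompt is the text to speak.
    input_ids = tokenizer(description, return_tensors="pt").input_ids.to(device)
    prompt_input_ids = tokenizer(text, return_tensors="pt").input_ids.to(device)

    generation = model.generate(input_ids=input_ids, prompt_input_ids=prompt_input_ids)
    audio = generation.cpu().numpy().squeeze()
    return audio, model.config.sampling_rate
```

Calling `check_upload_size` before any text extraction rejects oversized files before the model is involved.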

	## License

	This project is licensed under the Apache 2.0 License.

	---

	Powered by [Gradio](https://gradio.app) and [Hugging Face](https://huggingface.co)