Sethu Iyer
---
title: ttsdoc
emoji: 🌖
colorFrom: yellow
colorTo: gray
sdk: gradio
sdk_version: 4.41.0
app_file: app.py
pinned: false
license: apache-2.0
---
# ttsdoc 🌖
ttsdoc is a Text-to-Speech (TTS) application that reads your documents aloud. It uses the Parler TTS Mini v1 model to generate high-quality audio from typed text or from uploaded PDF, TXT, or DOCX files.
## Features
- 📄 Support for PDF, TXT, and DOCX file uploads
- ✍️ Direct text input option
- 🗣️ Customizable voice descriptions
- ⏱️ Adjustable maximum audio duration
- 🚀 GPU-accelerated audio generation
## How to Use
1. Upload a PDF, TXT, or DOCX file or enter text directly.
2. Customize the voice description if desired.
3. Adjust the maximum audio duration.
4. Click "Generate Audio" to create the TTS output.
## Tips for Best Results
- For longer texts, the generator will create audio up to the specified maximum duration.
- Experiment with different voice descriptions to achieve the desired output.
- Use punctuation to control pacing and intonation in the generated speech.
- For optimal quality, try to keep individual sentences or paragraphs concise.
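One way to follow the tips on punctuation and concise passages is to pre-split a long text before pasting it in. The snippet below is only a sketch: the function name `split_into_chunks` and the `max_chars` limit are hypothetical and not part of ttsdoc, and the app itself may handle long inputs differently.

```python
import re


def split_into_chunks(text: str, max_chars: int = 200) -> list[str]:
    """Split text at sentence-ending punctuation, then pack whole
    sentences into chunks of at most max_chars characters.

    Hypothetical helper for illustration; max_chars is an arbitrary choice.
    """
    # Break after '.', '!' or '?' when followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit: close the chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in split_into_chunks("Hello there. How are you? Fine!", max_chars=20):
        print(chunk)
```

A single sentence longer than `max_chars` is kept whole rather than cut mid-sentence, so the limit is a target, not a hard cap.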
## Technical Details
- This demo uses the Parler TTS Mini v1 model.
- Audio generation is GPU-accelerated for faster processing.
- Maximum file size for uploads: 5 MB
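The upload constraints listed in this README (PDF, TXT, or DOCX, up to 5MB) can be checked before a file is submitted. The following is a minimal sketch, not the app's actual code: `validate_upload` and both constants are hypothetical names, and it assumes "5MB" means 5 × 1024 × 1024 bytes.

```python
from pathlib import Path

# Values taken from this README; interpreting "5MB" as 5 MiB is an assumption.
ALLOWED_EXTENSIONS = {".pdf", ".txt", ".docx"}
MAX_UPLOAD_BYTES = 5 * 1024 * 1024


def validate_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems with an upload; an empty list means it passes.

    Hypothetical helper for illustration only.
    """
    problems: list[str] = []
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file type: {ext or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"file is {size_bytes} bytes; limit is {MAX_UPLOAD_BYTES}")
    return problems


if __name__ == "__main__":
    print(validate_upload("report.PDF", 1024))
    print(validate_upload("notes.md", 6 * 1024 * 1024))
```

For a file on disk, the size can be read with `Path(path).stat().st_size` and passed in as `size_bytes`.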
## License
This project is licensed under the Apache 2.0 License.
---
Powered by [Gradio](https://gradio.app) and [Hugging Face](https://huggingface.co)