metadata
license: mit
language:
  - en
base_model:
  - coqui/XTTS-v2

Fine-Tuned XTTS Model

This project fine-tunes an XTTS v2 TTS (Text-to-Speech) model on an MP3 file extracted from a YouTube video. Training was run in a Hugging Face Space running locally via Docker. A GPU is recommended for faster training.

Training Data

  • Source Video: YouTube Video
  • Training Audio: The MP3 file used for training is included in the files directory.

Docker Image

Fine-tuned with this Docker image: FineTune Xtts Docker image

Notes

  • Ensure you have a GPU available for optimal performance during training.
  • The Docker image pulls the latest version each time it's run.
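The two notes above can be sketched as a small helper that checks for a GPU and assembles a `docker run` command. This is only an illustration under stated assumptions, not the author's actual launch procedure: the image name `example/xtts-finetune:latest` is a placeholder for the image linked above, port 7860 is assumed because it is the usual default for Hugging Face Spaces, and the presence of `nvidia-smi` on the PATH is used as a rough heuristic for an NVIDIA GPU setup. The `--pull always` and `--gpus all` flags are standard `docker run` options.

```python
import shutil


def gpu_available() -> bool:
    """Heuristic only: treat `nvidia-smi` on PATH as a sign of an NVIDIA GPU setup."""
    return shutil.which("nvidia-smi") is not None


def build_docker_command(image: str, use_gpu: bool, port: int = 7860) -> list[str]:
    """Assemble a `docker run` command as a list of arguments.

    `--pull always` asks Docker to fetch the newest image on every run;
    `--gpus all` exposes the host GPUs to the container.
    """
    cmd = ["docker", "run", "--rm", "--pull", "always", "-p", f"{port}:{port}"]
    if use_gpu:
        cmd += ["--gpus", "all"]
    cmd.append(image)
    return cmd


if __name__ == "__main__":
    # Placeholder image name (assumption); substitute the image linked above.
    image = "example/xtts-finetune:latest"
    print(" ".join(build_docker_command(image, gpu_available())))
```

Without a GPU the helper simply omits `--gpus all`, so the container would still start but training would fall back to the CPU and run much more slowly.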

This model is based on XTTS v2, which cannot be used commercially under the XTTS license. That license is currently in a limbo state.