metadata

license: mit
language:
  - en
base_model:
  - coqui/XTTS-v2

Fine-Tuned XTTS Model

This project fine-tunes the XTTS v2 TTS (text-to-speech) model on a single MP3 file extracted from a YouTube video. Training was run in a Hugging Face Space hosted locally via Docker. A GPU is recommended for faster training.

Training Data

Source Video: YouTube Video
Training Audio: The MP3 file used for training is included in the files directory.

Docker Image

Fine-tuned with this Docker image: FineTune Xtts Docker image

Notes

Ensure you have a GPU available for optimal performance during training.
The Docker image pulls the latest version each time it's run.
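
Before starting a training run, it can help to confirm that a GPU is actually visible on the host. The following is a minimal sketch, not part of the original workflow: it assumes an NVIDIA GPU and that the `nvidia-smi` utility is installed, neither of which this card states. The helper name `gpu_available` is hypothetical.

```python
import shutil
import subprocess


def gpu_available() -> bool:
    """Return True if nvidia-smi is on PATH and lists at least one GPU.

    Assumption: the training host uses an NVIDIA GPU. Other vendors
    would need a different check.
    """
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # Driver utilities are not installed or not on PATH.
        return False
    try:
        # "nvidia-smi -L" prints one line per device, e.g. "GPU 0: ...".
        result = subprocess.run(
            [exe, "-L"],
            capture_output=True,
            text=True,
            timeout=10,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU " in result.stdout


if __name__ == "__main__":
    if gpu_available():
        print("GPU detected")
    else:
        print("No GPU detected; training may be slow")
```

Note that a GPU visible on the host is not automatically visible inside a Docker container; that depends on how the container is started.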

This model is based on XTTS v2, which cannot be used commercially under the XTTS license. The status of that license is currently in limbo.