To run this model, follow the XTTS inference code, with documentation here.
The model was fine-tuned on the included WAV file, which contains 4 minutes of HAL9000 speech from the film 2001: A Space Odyssey. A sample of the produced speech will be provided.
This fine-tuning is a first trial. The WAV file contains many skips in the speech and some music at the end, which likely degrades the quality of the output. In future runs, the speech will be edited, closely checked for artifacts, and enhanced with the Adobe Podcast service (where the 85p indicates the extent to which the output is enhanced).
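As a rough illustration of running the model with the XTTS inference code, the sketch below loads a checkpoint directory with the Coqui TTS Python API and synthesizes one sentence. It makes several assumptions that this card does not confirm: that the `TTS`, `torch`, and `torchaudio` packages are installed, that the model directory contains `config.json` alongside the checkpoint and vocabulary files, and that the included WAV file is used as the speaker reference. The function name `synthesize` and all paths are hypothetical.

```python
from pathlib import Path


def synthesize(model_dir, reference_wav, text, out_path="output.wav", language="en"):
    """Synthesize `text` with an XTTS checkpoint and write it to `out_path`.

    Assumption: `model_dir` holds config.json plus the checkpoint and vocab
    files in the layout expected by Xtts.load_checkpoint.
    """
    model_dir = Path(model_dir)
    config_path = model_dir / "config.json"  # assumed file name
    reference_wav = Path(reference_wav)

    # Fail early with a clear message before importing the heavy dependencies.
    for required in (config_path, reference_wav):
        if not required.is_file():
            raise FileNotFoundError(f"Missing required file: {required}")

    # Imported here so the path checks above work without the TTS stack installed.
    import torch
    import torchaudio
    from TTS.tts.configs.xtts_config import XttsConfig
    from TTS.tts.models.xtts import Xtts

    config = XttsConfig()
    config.load_json(str(config_path))
    model = Xtts.init_from_config(config)
    model.load_checkpoint(config, checkpoint_dir=str(model_dir), eval=True)
    if torch.cuda.is_available():
        model.cuda()

    # Compute the speaker conditioning from the reference recording.
    gpt_cond_latent, speaker_embedding = model.get_conditioning_latents(
        audio_path=[str(reference_wav)]
    )
    out = model.inference(text, language, gpt_cond_latent, speaker_embedding)

    # XTTS produces 24 kHz mono audio; add a channel dimension before saving.
    torchaudio.save(str(out_path), torch.tensor(out["wav"]).unsqueeze(0), 24000)
    return Path(out_path)
```

A call might look like `synthesize("path/to/model_dir", "path/to/reference.wav", "Some text to speak.")`, with both paths replaced by the actual locations of the downloaded files. Because the training recording has skips and trailing music, a short, clean excerpt may work better as the reference than the full file.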