---
license: mit
language:
- ja
pipeline_tag: text-to-speech
datasets:
- litagin/moe-speech
---
Following this guide, with some exceptions:
https://rentry.org/GPT-SoVITS-guide
I used the latest git pull from
https://github.com/RVC-Boss/GPT-SoVITS/
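
If you're setting up from scratch, a minimal sketch of the checkout (see the guide above for the full environment setup; the `pip install` step assumes a working Python/CUDA environment):

```bash
git clone https://github.com/RVC-Boss/GPT-SoVITS/
cd GPT-SoVITS
pip install -r requirements.txt
```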
I needed to put:

```bash
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/cudnn/lib/
```

in my shell.
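
To check that the cuDNN libraries are actually being picked up, a quick sanity test (run in the same environment GPT-SoVITS uses):

```bash
# Should print a cuDNN version number (e.g. 8902)
# instead of failing to load libcudnn
python3 -c "import torch; print(torch.backends.cudnn.version())"
```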
Put the .pth file in your SoVITS_weights_v2 folder and the .ckpt in GPT_weights_v2.
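
For example (assuming you cloned into ~/GPT-SoVITS; the model filenames here are placeholders for the files in this repo):

```bash
cp model.pth  ~/GPT-SoVITS/SoVITS_weights_v2/
cp model.ckpt ~/GPT-SoVITS/GPT_weights_v2/
```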
Set both "Language for Reference audio" and "Inference text language" to "Japanese", and set slicing to "Slice by Chinese punct".
You should be able to give it a CLEAN, NOISE/MUSIC/STATIC-FREE Japanese voice clip of 3-10 seconds along with a 100% ACCURATE transcription, and get very good results out the other side.
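
If your source material is longer than that, one rough way to cut out a clean reference clip (standard ffmpeg flags; the timestamps are placeholders):

```bash
# Take an ~8-second mono clip starting at 0:05 of the source recording
ffmpeg -i source.wav -ss 5 -t 8 -ac 1 reference.wav
```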
You can try the wav file in this repo, using its filename as the "Text for reference audio", to test inference.
Feel free to keep everything else at the defaults.
If you want to start the inference engine automatically, you can do something like:

```bash
python3 /path/to/GPT_SoVITS/inference_webui.py "Auto"
```
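
To keep it running after you log out, one option is a simple background launch (same placeholder path as above; assumes your Python environment is already activated, and a systemd unit would also work):

```bash
nohup python3 /path/to/GPT_SoVITS/inference_webui.py "Auto" > webui.log 2>&1 &
```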
If you isolate it as described in https://rentry.org/IsolatedLinuxWebService and put nginx in front of it with an SSL cert, you'll need something like this in the location block:
```nginx
proxy_pass http://127.0.0.1:9872/;
proxy_buffering off;
proxy_redirect off;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_set_header Host $host;
client_max_body_size 500M;
proxy_set_header X-Forwarded-Proto $scheme;
add_header 'Content-Security-Policy' 'upgrade-insecure-requests';
```
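
After editing the config, the usual check-and-reload applies:

```bash
sudo nginx -t && sudo systemctl reload nginx
```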