---
license: mit
language:
- ja
pipeline_tag: text-to-speech
datasets:
- litagin/moe-speech
---
This setup follows the guide below, with a few exceptions:
        https://rentry.org/GPT-SoVITS-guide

I used the latest git pull from 
        https://github.com/RVC-Boss/GPT-SoVITS/

I needed to put:

        export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/cudnn/lib/

in my shell so the cuDNN libraries could be found.


Put the .pth file in your SoVITS_weights_v2 folder and the .ckpt file in GPT_weights_v2.

Set both "Language for Reference audio" and "Inference text language" to "Japanese", and set slicing to "Slice by Chinese punct".

Give it a CLEAN Japanese voice clip, free of noise, music, and static, between 3 and 10 seconds long, along with a 100% ACCURATE transcription, and you should get very good results out the other side.
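As a quick sanity check on the length requirement, a small Python helper (hypothetical, not part of GPT-SoVITS) can verify a reference clip falls in the 3-10 second window before you feed it in:

```python
import wave

def check_reference_clip(path):
    """Return the clip duration in seconds, raising if it is outside
    the 3-10 s window GPT-SoVITS expects for reference audio.
    Illustrative helper only; assumes a standard PCM wav file."""
    with wave.open(path, "rb") as wf:
        duration = wf.getnframes() / wf.getframerate()
    if not 3.0 <= duration <= 10.0:
        raise ValueError(f"clip is {duration:.1f}s; need 3-10s")
    return duration
```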

You can try the wav file in the repo, using its filename as the "Text for reference audio", to test inference.

Feel free to keep everything else at the defaults.

If you want to start the inference engine automatically, you can do something like:

        python3 /path/to/GPT_SoVITS/inference_webui.py "Auto"
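If you want the command above to come up on boot, one option is a systemd unit along these lines (the user, paths, and working directory are placeholders; adjust them to match your install):

```ini
[Unit]
Description=GPT-SoVITS inference webui
After=network.target

[Service]
# Assumed user and install path; adjust to your setup.
User=sovits
WorkingDirectory=/path/to/GPT-SoVITS
Environment=LD_LIBRARY_PATH=/path/to/cudnn/lib/
ExecStart=/usr/bin/python3 GPT_SoVITS/inference_webui.py "Auto"
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Enable it with `systemctl enable --now` once the unit file is in place.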

If you isolate it as described in https://rentry.org/IsolatedLinuxWebService and put nginx in front of it with an SSL cert, you need something like this in the location block:

        proxy_pass http://127.0.0.1:9872/;
        proxy_buffering off;
        proxy_redirect off;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        client_max_body_size 500M;
        proxy_set_header X-Forwarded-Proto $scheme;
        add_header 'Content-Security-Policy' 'upgrade-insecure-requests';
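For context, those directives sit inside a full server block roughly like this (the server name and certificate paths are placeholders, not values from this setup):

```nginx
server {
    listen 443 ssl;
    server_name tts.example.com;                  # placeholder
    ssl_certificate     /path/to/fullchain.pem;   # placeholder
    ssl_certificate_key /path/to/privkey.pem;     # placeholder

    location / {
        # the proxy_* and header directives listed above go here
        proxy_pass http://127.0.0.1:9872/;
    }
}
```

The Upgrade/Connection headers and `proxy_buffering off` matter because the Gradio webui uses websockets and streamed responses.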