<div align="center">
<h1>GPT-SoVITS-WebUI</h1>
A powerful WebUI for few-shot voice conversion and text-to-speech.<br><br>
[![madewithlove](https://img.shields.io/badge/made_with-%E2%9D%A4-red?style=for-the-badge&labelColor=orange)](https://github.com/RVC-Boss/GPT-SoVITS)
<img src="https://counter.seku.su/cmoe?name=gptsovits&theme=r34" /><br>
[![Open In Colab](https://img.shields.io/badge/Colab-F9AB00?style=for-the-badge&logo=googlecolab&color=525252)](https://colab.research.google.com/github/RVC-Boss/GPT-SoVITS/blob/main/colab_webui.ipynb)
[![License](https://img.shields.io/badge/LICENSE-MIT-green.svg?style=for-the-badge)](https://github.com/RVC-Boss/GPT-SoVITS/blob/main/LICENSE)
[![Huggingface](https://img.shields.io/badge/🤗%20-Models%20Repo-yellow.svg?style=for-the-badge)](https://huggingface.co/lj1995/GPT-SoVITS/tree/main)
[![Discord](https://img.shields.io/discord/1198701940511617164?color=%23738ADB&label=Discord&style=for-the-badge)](https://discord.gg/dnrgs5GHfG)
[**English**](../../README.md) | [**中文简体**](../cn/README.md) | [**日本語**](../ja/README.md) | **한국어** | [**Türkçe**](../tr/README.md)
</div>
---
## Features:
1. **Zero-shot text-to-speech (TTS):** Provide a 5-second voice sample and convert text to speech immediately.
2. **Few-shot TTS:** Fine-tune the model with just 1 minute of training data to improve voice similarity and realism.
3. **Cross-lingual support:** Inference works in languages different from the training dataset. English, Japanese, and Chinese are currently supported.
4. **WebUI tools:** Integrated tools for vocal/accompaniment separation, automatic training-set segmentation, Chinese automatic speech recognition (ASR), and text labeling help beginners create training datasets and GPT/SoVITS models.
**Check out the [demo video](https://www.bilibili.com/video/BV12g4y1m7Uw)!**
Few-shot fine-tuning demo on unseen speakers:
https://github.com/RVC-Boss/GPT-SoVITS/assets/129054828/05bee1fa-bdd8-4d85-9350-80c060ab47fb
**User guide: [简体中文](https://www.yuque.com/baicaigongchang1145haoyuangong/ib3g1e) | [English](https://rentry.co/GPT-SoVITS-guide#/)**
## Installation
### Tested Environments
- Python 3.9, PyTorch 2.0.1, CUDA 11
- Python 3.10.13, PyTorch 2.1.2, CUDA 12.3
- Python 3.9, PyTorch 2.2.2, macOS 14.4.1 (Apple silicon)
- Python 3.9, PyTorch 2.2.2, CPU devices
_Note: numba==0.56.4 requires python<3.11._
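The numba constraint above can be checked before installing anything. The following is a minimal sketch, not part of the project: the helper name `satisfies_numba_pin` is hypothetical, and it only compares the interpreter's major and minor version against 3.11.

```python
import sys


def satisfies_numba_pin(version_info=sys.version_info):
    """Return True if the interpreter is older than Python 3.11.

    Hypothetical helper: it only encodes the note above
    (numba==0.56.4 requires python<3.11) and checks nothing else.
    """
    return tuple(version_info[:2]) < (3, 11)


if not satisfies_numba_pin():
    print("Python >= 3.11 detected: numba==0.56.4 is not expected to install here.")
```

Running this inside the conda environment created in the sections below is a quick way to confirm that the environment uses a supported interpreter.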
### Windows
If you are a Windows user (tested on win>=10), [download the integrated package](https://huggingface.co/lj1995/GPT-SoVITS-windows-package/resolve/main/GPT-SoVITS-beta.7z?download=true), unzip it, and double-click _go-webui.bat_ to start GPT-SoVITS-WebUI.
### Linux
```bash
conda create -n GPTSoVits python=3.9
conda activate GPTSoVits
bash install.sh
```
### macOS
**Note: Models trained with the GPU on a Mac are of lower quality than models trained on other operating systems. Until this issue is resolved, training on macOS uses the CPU.**
1. Install the Xcode command-line tools by running `xcode-select --install`.
2. Install FFmpeg by running `brew install ffmpeg`.
3. After completing the steps above, install this project by running the following commands.
```bash
conda create -n GPTSoVits python=3.9
conda activate GPTSoVits
pip install -r requirements.txt
```
### Manual Installation
#### Install Dependencies
```bash
pip install -r requirements.txt
```
#### Install FFmpeg
##### Conda Users
```bash
conda install ffmpeg
```
##### Ubuntu/Debian Users
```bash
sudo apt install ffmpeg
sudo apt install libsox-dev
conda install -c conda-forge 'ffmpeg<7'
```
##### Windows Users
Download [ffmpeg.exe](https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/ffmpeg.exe) and [ffprobe.exe](https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/ffprobe.exe) and place them in the GPT-SoVITS root directory.
##### macOS Users
```bash
brew install ffmpeg
```
### Using Docker
#### docker-compose.yaml configuration
0. Image tags: The code repository updates quickly while images are built and tested slowly, so check [Docker Hub](https://hub.docker.com/r/breakstring/gpt-sovits) for the latest built image, or build locally from the Dockerfile as needed.
1. Environment variables:
  - is_half: Controls whether half precision or full precision is used. If the contents of the 4-cnhubert/5-wav32k directories are not generated correctly during the "SSL extraction" step, this setting is usually the cause. Set it to True or False according to your situation.
2. Volume configuration: The application's root directory inside the container is /workspace. The default docker-compose.yaml lists some practical examples that make uploading and downloading content easier.
3. shm_size: The default memory available to Docker Desktop on Windows is too small and can cause errors, so adjust it according to your situation.
4. The GPU-related settings in the deploy section should be adjusted according to your system and situation.
#### Running with docker compose
```
docker compose -f "docker-compose.yaml" up -d
```
#### Running with the docker command
As above, modify the parameters to match your situation, then run the following command:
```
docker run --rm -it --gpus=all --env=is_half=False --volume=G:\GPT-SoVITS-DockerTest\output:/workspace/output --volume=G:\GPT-SoVITS-DockerTest\logs:/workspace/logs --volume=G:\GPT-SoVITS-DockerTest\SoVITS_weights:/workspace/SoVITS_weights --workdir=/workspace -p 9880:9880 -p 9871:9871 -p 9872:9872 -p 9873:9873 -p 9874:9874 --shm-size="16G" -d breakstring/gpt-sovits:xxxxx
```
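Because the command above is a single long line, it can be easier to maintain when assembled from its parts. The sketch below is only an illustration: the function name `docker_run_command` and its parameters are hypothetical, and it reproduces only the flags, ports, and working directory that appear in the command above.

```python
# Ports published by the docker run command shown above.
PORTS = (9880, 9871, 9872, 9873, 9874)


def docker_run_command(volumes, image, is_half=False, shm_size="16G"):
    """Build the argument list for the `docker run` invocation above.

    `volumes` maps host directories to container directories.
    Hypothetical helper; it only rearranges flags from the README command.
    """
    args = ["docker", "run", "--rm", "-it", "--gpus=all", f"--env=is_half={is_half}"]
    args += [f"--volume={host}:{container}" for host, container in volumes.items()]
    args.append("--workdir=/workspace")
    for port in PORTS:
        args += ["-p", f"{port}:{port}"]
    args += [f"--shm-size={shm_size}", "-d", image]
    return args


command = docker_run_command(
    {
        r"G:\GPT-SoVITS-DockerTest\output": "/workspace/output",
        r"G:\GPT-SoVITS-DockerTest\logs": "/workspace/logs",
        r"G:\GPT-SoVITS-DockerTest\SoVITS_weights": "/workspace/SoVITS_weights",
    },
    image="breakstring/gpt-sovits:xxxxx",  # replace xxxxx with a real tag from Docker Hub
)
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)`. Passing a list rather than one shell string avoids quoting problems with Windows paths.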
## Pretrained Models
Download the pretrained models from [GPT-SoVITS Models](https://huggingface.co/lj1995/GPT-SoVITS) and place them in `GPT_SoVITS/pretrained_models`.
For Chinese automatic speech recognition (ASR), download the [Damo ASR Model](https://modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/files), [Damo VAD Model](https://modelscope.cn/models/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch/files), and [Damo Punc Model](https://modelscope.cn/models/damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch/files) and place them in `tools/asr/models`.
For UVR5 (vocal/accompaniment separation and reverberation removal), download the models from [UVR5 Weights](https://huggingface.co/lj1995/VoiceConversionWebUI/tree/main/uvr5_weights) and place them in `tools/uvr5/uvr5_weights`.
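A small check can confirm that the three directories named above exist and contain files before the WebUI is launched. This is a minimal sketch under assumptions: the helper `missing_model_dirs` is hypothetical, and it treats an absent or empty directory as "missing" without validating which model files are inside.

```python
from pathlib import Path

# Directories named in this section, relative to the repository root.
EXPECTED_DIRS = (
    "GPT_SoVITS/pretrained_models",
    "tools/asr/models",
    "tools/uvr5/uvr5_weights",
)


def missing_model_dirs(repo_root):
    """Return the expected model directories that are absent or empty."""
    root = Path(repo_root)
    missing = []
    for rel in EXPECTED_DIRS:
        directory = root / rel
        if not directory.is_dir() or not any(directory.iterdir()):
            missing.append(rel)
    return missing


# Check the current working directory, assumed to be the repository root.
for rel in missing_model_dirs("."):
    print(f"No model files found in {rel}")
```

The ASR and UVR5 directories are only needed if those tools are used, so a "missing" result for them is not necessarily an error.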
## Dataset Format
The text-to-speech (TTS) annotation .list file format:
```
vocal_path|speaker_name|language|text
```
Language dictionary:
- 'zh': Chinese
- 'ja': Japanese
- 'en': English
Example:
```
D:\GPT-SoVITS\xxx/xxx.wav|xxx|en|I like playing Genshin.
```
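The annotation format above is easy to validate before training. The parser below is a minimal sketch, not the project's own loader: `Annotation` and `parse_list_line` are hypothetical names, and accepting only the three lowercase language codes listed above is an assumption.

```python
from typing import NamedTuple

# Language codes from the dictionary above.
LANGUAGES = {"zh": "Chinese", "ja": "Japanese", "en": "English"}


class Annotation(NamedTuple):
    vocal_path: str
    speaker_name: str
    language: str
    text: str


def parse_list_line(line):
    """Parse one `vocal_path|speaker_name|language|text` line."""
    # maxsplit=3 keeps any '|' characters inside the transcript intact.
    parts = line.rstrip("\r\n").split("|", 3)
    if len(parts) != 4:
        raise ValueError(f"expected 4 '|'-separated fields, got {len(parts)}")
    annotation = Annotation(*parts)
    if annotation.language not in LANGUAGES:
        raise ValueError(f"unknown language code: {annotation.language!r}")
    return annotation


# Raw string so the backslashes in the Windows path are kept literally.
example = parse_list_line(r"D:\GPT-SoVITS\xxx/xxx.wav|xxx|en|I like playing Genshin.")
print(example.speaker_name, example.language, example.text)
```

Running every line of a .list file through a check like this catches missing fields and mistyped language codes early.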
## Todo List
- [ ] **High priority:**
  - [x] Japanese and English localization.
  - [ ] User guide.
  - [x] Fine-tune training on Japanese and English datasets.
- [ ] **Features:**
  - [ ] Zero-shot voice conversion (5 s) / few-shot voice conversion (1 min).
  - [ ] TTS speaking-speed control.
  - [ ] Enhanced TTS emotion control.
  - [ ] Experiment with changing SoVITS token inputs to a probability distribution over the vocabulary.
  - [ ] Improve the English and Japanese text frontend.
  - [ ] Develop smaller and larger TTS models.
  - [x] Colab scripts.
  - [ ] Expand the training dataset (from 2k hours to 10k hours).
  - [ ] Better SoVITS base model (enhanced audio quality).
  - [ ] Model blending.
## (Additional) Running from the Command Line
Use the command line to open the WebUI for UVR5:
```
python tools/uvr5/webui.py "<infer_device>" <is_half> <webui_port_uvr5>
```
If you cannot open a browser, use the format below for UVR processing. It uses mdxnet for audio processing.
```
python mdxnet.py --model --input_root --output_vocal --output_ins --agg_level --format --device --is_half_precision
```
To segment the dataset audio from the command line:
```
python audio_slicer.py \
--input_path "<path_to_original_audio_file_or_directory>" \
--output_root "<directory_where_subdivided_audio_clips_will_be_saved>" \
--threshold <volume_threshold> \
--min_length <minimum_duration_of_each_subclip> \
--min_interval <shortest_time_gap_between_adjacent_subclips> \
--hop_size <step_size_for_computing_volume_curve>
```
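To give a feel for how the `--threshold`, `--min_length`, `--min_interval`, and `--hop_size` options interact, here is a simplified illustration of silence-based slicing. It is not the algorithm in `audio_slicer.py`: the function names are hypothetical, the threshold is assumed to be in dB, and lengths are counted in frames of `hop` samples rather than in time units.

```python
import math


def rms_db_curve(samples, hop):
    """RMS level in dB for each non-overlapping frame of `hop` samples."""
    curve = []
    for start in range(0, len(samples), hop):
        frame = samples[start:start + hop]
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        curve.append(20 * math.log10(max(rms, 1e-10)))  # clamp to avoid log(0)
    return curve


def split_on_silence(samples, hop, threshold_db, min_interval_frames, min_length_frames):
    """Return (start, end) sample ranges of clips separated by long silences.

    A clip ends once at least `min_interval_frames` consecutive frames fall
    below `threshold_db`; clips shorter than `min_length_frames` are dropped.
    """
    curve = rms_db_curve(samples, hop)
    clips, clip_start, silence_run = [], None, 0
    for i, level in enumerate(curve):
        if level >= threshold_db:
            if clip_start is None:
                clip_start = i
            silence_run = 0
        elif clip_start is not None:
            silence_run += 1
            if silence_run >= min_interval_frames:
                end = i - silence_run + 1  # first silent frame of the run
                if end - clip_start >= min_length_frames:
                    clips.append((clip_start * hop, end * hop))
                clip_start, silence_run = None, 0
    if clip_start is not None:  # audio ended while a clip was still open
        end = len(curve) - silence_run
        if end - clip_start >= min_length_frames:
            clips.append((clip_start * hop, min(end * hop, len(samples))))
    return clips


# Two loud bursts separated by silence, using toy values instead of real audio.
audio = [0.5] * 40 + [0.0] * 30 + [0.5] * 40
print(split_on_silence(audio, hop=10, threshold_db=-40,
                       min_interval_frames=2, min_length_frames=3))
```

In this toy model, a smaller hop gives a finer volume curve at higher cost, a larger minimum interval merges clips across short pauses, and the minimum length filters out fragments too short to be useful for training.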
To run ASR on the dataset from the command line (Chinese only):
```
python tools/asr/funasr_asr.py -i <input> -o <output>
```
For languages other than Chinese, ASR processing is performed with Faster_Whisper.
(No progress bar is shown, and GPU performance may cause delays.)
```
python ./tools/asr/fasterwhisper_asr.py -i <input> -o <output> -l <language> -p <precision>
```
A custom save path for the list file is enabled.
## Credits
Special thanks to the following projects and contributors:
### Theoretical Research
- [ar-vits](https://github.com/innnky/ar-vits)
- [SoundStorm](https://github.com/yangdongchao/SoundStorm/tree/master/soundstorm/s1/AR)
- [vits](https://github.com/jaywalnut310/vits)
- [TransferTTS](https://github.com/hcy71o/TransferTTS/blob/master/models.py#L556)
- [contentvec](https://github.com/auspicious3000/contentvec/)
- [hifi-gan](https://github.com/jik876/hifi-gan)
- [fish-speech](https://github.com/fishaudio/fish-speech/blob/main/tools/llama/generate.py#L41)
### Pretrained Models
- [Chinese Speech Pretrain](https://github.com/TencentGameMate/chinese_speech_pretrain)
- [Chinese-Roberta-WWM-Ext-Large](https://huggingface.co/hfl/chinese-roberta-wwm-ext-large)
### Text Frontend for Inference
- [paddlespeech zh_normalization](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/paddlespeech/t2s/frontend/zh_normalization)
- [LangSegment](https://github.com/juntaosun/LangSegment)
### WebUI Tools
- [ultimatevocalremovergui](https://github.com/Anjok07/ultimatevocalremovergui)
- [audio-slicer](https://github.com/openvpi/audio-slicer)
- [SubFix](https://github.com/cronrpc/SubFix)
- [FFmpeg](https://github.com/FFmpeg/FFmpeg)
- [gradio](https://github.com/gradio-app/gradio)
- [faster-whisper](https://github.com/SYSTRAN/faster-whisper)
- [FunASR](https://github.com/alibaba-damo-academy/FunASR)
## Thanks to all contributors ;)
<a href="https://github.com/RVC-Boss/GPT-SoVITS/graphs/contributors" target="_blank">
<img src="https://contrib.rocks/image?repo=RVC-Boss/GPT-SoVITS" />
</a>