---
license: cc-by-nc-4.0
---

# UTDUSS vocoder model

This repository provides the model weights of the [Descript Audio Codec](https://arxiv.org/abs/2306.06546) (DAC) used for the [Interspeech2024 Speech Processing Using Discrete Speech Unit Challenge](https://www.wavlab.org/activities/2024/Interspeech2024-Discrete-Speech-Unit-Challenge/).

# Prerequisites

The weights require the [official DAC library](https://github.com/descriptinc/descript-audio-codec), which can be installed with the following command:

```bash
pip install descript-audio-codec
```

# Provided weights

## Vocoder task

| Model name in the paper | File name in this repo |
|---|---|
| π | expresso_16k_2code.pth |
| π w/o hyper-parameter tuning | expresso_16k_2code_official.pth |
| π w/o data exclusion | expresso_16k_2code_wo_data.pth |
| π w/o matching sampling rate | expresso_24k_2code_ab.pth |

## Acoustic + Vocoder (TTS) task

Please note that the weights of the acoustic model are not provided.

### Full training set

| Model name in the paper | File name in this repo |
|---|---|
| Discrete-TTS v1, v1.1 | lj_16k_1code.pth |
| Discrete-TTS v2, v2.2 | lj_16k_1code_512.pth |
| Discrete-TTS v3 | lj_16k_1code_256.pth |

### 1h training set

| Model name in the paper | File name in this repo |
|---|---|
| Discrete-TTS v1, v1.1 | lj_1h_16k_1code.pth |
| Discrete-TTS v2, v2.2 | lj_1h_16k_1code_512.pth |
| Discrete-TTS v3 | lj_1h_16k_1code_256.pth |

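The tables above can also be written down as a small lookup, which makes it easier to fetch any of the weights programmatically. The following is only a sketch: the task labels (`"vocoder"`, `"tts_full"`, `"tts_1h"`) and the names `BASE_URL`, `WEIGHTS`, and `weight_url` are hypothetical and not part of this repository; the base URL is the one used in the sample code of this README.

```python
# Base URL as used in the sample code of this README.
BASE_URL = "https://huggingface.co/sarulab-speech/UTDUSS-Vocoder/resolve/main"

# (task label, model name in the paper) -> file name in this repo.
# The task labels are hypothetical shorthand for the three tables above.
WEIGHTS = {
    ("vocoder", "π"): "expresso_16k_2code.pth",
    ("vocoder", "π w/o hyper-parameter tuning"): "expresso_16k_2code_official.pth",
    ("vocoder", "π w/o data exclusion"): "expresso_16k_2code_wo_data.pth",
    ("vocoder", "π w/o matching sampling rate"): "expresso_24k_2code_ab.pth",
    ("tts_full", "Discrete-TTS v1, v1.1"): "lj_16k_1code.pth",
    ("tts_full", "Discrete-TTS v2, v2.2"): "lj_16k_1code_512.pth",
    ("tts_full", "Discrete-TTS v3"): "lj_16k_1code_256.pth",
    ("tts_1h", "Discrete-TTS v1, v1.1"): "lj_1h_16k_1code.pth",
    ("tts_1h", "Discrete-TTS v2, v2.2"): "lj_1h_16k_1code_512.pth",
    ("tts_1h", "Discrete-TTS v3"): "lj_1h_16k_1code_256.pth",
}


def weight_url(task: str, paper_name: str) -> str:
    """Return the download URL of the weight file for a given table entry."""
    return f"{BASE_URL}/{WEIGHTS[(task, paper_name)]}"
```

For example, `weight_url("tts_1h", "Discrete-TTS v3")` gives the URL of `lj_1h_16k_1code_256.pth`, which can then be passed to `torch.hub.download_url_to_file`.
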
# Sample code

```python
from pathlib import Path

import dac
import torch

# Download one of the provided weights (here, the vocoder-task model "π").
model_url = "https://huggingface.co/sarulab-speech/UTDUSS-Vocoder/resolve/main/expresso_16k_2code.pth"
model_path = Path(f"/tmp/utduss/{model_url.split('/')[-1]}")
model_path.parent.mkdir(parents=True, exist_ok=True)
torch.hub.download_url_to_file(model_url, str(model_path))

# Load the weights with the official DAC library.
model = dac.DAC.load(str(model_path))
```
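Once a model is loaded, discrete codes can be turned back into a waveform. The sketch below is not taken from this repository: the helper name `decode_codes` is hypothetical, and it assumes that the codes are an integer tensor of shape `(batch, n_codebooks, frames)` and that the loaded model exposes `quantizer.from_codes` and `decode` as in the `descript-audio-codec` library. It also assumes that "1code" and "2code" in the file names refer to the number of codebooks.

```python
def decode_codes(model, codes):
    """Convert discrete codes into a waveform tensor with a loaded DAC model.

    Assumes `codes` has shape (batch, n_codebooks, frames).
    """
    # from_codes returns a 3-tuple; the first element is the summed
    # quantized latent that the decoder expects.
    z, _, _ = model.quantizer.from_codes(codes)
    # The decoder maps the latent back to audio samples.
    return model.decode(z)
```

For inference, the call would typically be wrapped in `torch.no_grad()`, for example `with torch.no_grad(): wav = decode_codes(model, codes)`.
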

# Contributors

* [Wataru Nakata](https://wataru-nakata.github.io/)
* Kazuki Yamauchi
* Dong Yang
* Hiroaki Hyodo
* [Yuki Saito](https://sython.org)