---
license: cc-by-nc-4.0
---

# UTDUSS vocoder model
In this repo, we provide the model weights of the [Descript Audio Codec](https://arxiv.org/abs/2306.06546) (DAC) used for the [Interspeech2024 Speech Processing Using Discrete Speech Unit Challenge](https://www.wavlab.org/activities/2024/Interspeech2024-Discrete-Speech-Unit-Challenge/).

# Prerequisites

The [official DAC library](https://github.com/descriptinc/descript-audio-codec) is required. It can be installed with the following command:
```bash
pip install descript-audio-codec
```

# Provided weights

## Vocoder task
| model name on paper | model name on this repo |
|---|---|
|😀 | expresso_16k_2code.pth|
|😀 w/o hyper-parameter tuning| expresso_16k_2code_official.pth|
|😀 w/o data exclusion| expresso_16k_2code_wo_data.pth|
|😀 w/o matching sampling rate| expresso_24k_2code_ab.pth|

## Acoustic + Vocoder (TTS) task
Please note that the weights for the acoustic model are not provided.

### Full training set
| model name on paper | model name on this repo |
|---|---|
|Discrete-TTS v1, v1.1 | lj_16k_1code.pth|
|Discrete-TTS v2, v2.2| lj_16k_1code_512.pth|
|Discrete-TTS v3| lj_16k_1code_256.pth|
### 1h training set
| model name on paper | model name on this repo |
|---|---|
|Discrete-TTS v1, v1.1 | lj_1h_16k_1code.pth|
|Discrete-TTS v2, v2.2| lj_1h_16k_1code_512.pth|
|Discrete-TTS v3| lj_1h_16k_1code_256.pth|

# Sample code

```python
from pathlib import Path

import dac
import torch

# Pick any checkpoint file name from the tables above.
model_url = "https://huggingface.co/sarulab-speech/UTDUSS-Vocoder/resolve/main/expresso_16k_2code.pth"
model_path = Path(f"/tmp/utduss/{model_url.split('/')[-1]}")
model_path.parent.mkdir(parents=True, exist_ok=True)

# Download the checkpoint, then load it with the DAC library.
torch.hub.download_url_to_file(model_url, str(model_path))
model = dac.DAC.load(str(model_path))
```
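
To load one of the other checkpoints listed under "Provided weights", only the file name at the end of the URL should need to change. The following is a minimal sketch, assuming every checkpoint is hosted under the same `resolve/main/` path as the one in the sample code. The names `BASE_URL`, `CHECKPOINTS`, and `checkpoint_url` are hypothetical helpers introduced for illustration and are not part of the DAC library.

```python
# Base URL taken from the sample code; assumed to be the same for every file.
BASE_URL = "https://huggingface.co/sarulab-speech/UTDUSS-Vocoder/resolve/main"

# File names copied from the tables in "Provided weights".
CHECKPOINTS = {
    # Vocoder task
    "expresso_16k_2code.pth",
    "expresso_16k_2code_official.pth",
    "expresso_16k_2code_wo_data.pth",
    "expresso_24k_2code_ab.pth",
    # Acoustic + Vocoder (TTS) task, full training set
    "lj_16k_1code.pth",
    "lj_16k_1code_512.pth",
    "lj_16k_1code_256.pth",
    # Acoustic + Vocoder (TTS) task, 1h training set
    "lj_1h_16k_1code.pth",
    "lj_1h_16k_1code_512.pth",
    "lj_1h_16k_1code_256.pth",
}


def checkpoint_url(file_name: str) -> str:
    """Return the download URL for a checkpoint listed in this repo."""
    if file_name not in CHECKPOINTS:
        raise KeyError(f"Unknown checkpoint: {file_name}")
    return f"{BASE_URL}/{file_name}"


# Example: URL for the Discrete-TTS v1 vocoder trained on the full set.
print(checkpoint_url("lj_16k_1code.pth"))
```

The returned string can be used in place of `model_url` in the sample code.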

# Contributors
* [Wataru Nakata](https://wataru-nakata.github.io/)
* Kazuki Yamauchi
* Dong Yang
* Hiroaki Hyodo
* [Yuki Saito](https://sython.org)