metadata

license: cc-by-nc-4.0

UTDUSS vocoder model

This repository provides the model weights of the Descript Audio Codec (DAC) used for the Interspeech 2024 Speech Processing Using Discrete Speech Unit Challenge.

Prerequisites

The official dac library is required. It can be installed with the following command:

pip install descript-audio-codec

Provided weights

Vocoder task

Model name in the paper	Model name in this repo
😀	expresso_16k_2code.pth
😀 w/o hyper-parameter tuning	expresso_16k_2code_official.pth
😀 w/o data exclusion	expresso_16k_2code_wo_data.pth
😀 w/o matching sampling rate	expresso_24k_2code_ab.pth

Acoustic + Vocoder (TTS) task

Please note that the weights for the acoustic model are not provided.

Model name in the paper	Model name in this repo
😀	expresso_16k_2code.pth
😀 w/o hyper-parameter tuning	expresso_16k_2code_official.pth
😀 w/o data exclusion	expresso_16k_2code_wo_data.pth
😀 w/o matching sampling rate	expresso_24k_2code_ab.pth

Sample code

import dac
import torch
from pathlib import Path

# Download the checkpoint to a local cache directory.
model_url = "https://huggingface.co/sarulab-speech/UTDUSS-Vocoder/resolve/main/expresso_16k_2code.pth"
model_path = Path("/tmp/utduss") / model_url.split("/")[-1]
model_path.parent.mkdir(parents=True, exist_ok=True)
torch.hub.download_url_to_file(model_url, str(model_path))

# Load the DAC model from the downloaded weights.
model = dac.DAC.load(str(model_path))
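
The sample code hard-codes a single checkpoint. To switch between the four weights listed in the tables, a small lookup helper may be convenient. The following is a minimal sketch, not part of this repo: the variant keys (`default`, `wo_hparam_tuning`, and so on) and the function names are hypothetical, and it assumes that every checkpoint is served from the same `resolve/main/` path as the one in the sample code.

```python
from pathlib import Path

# Base URL taken from the sample code; assumed to hold for all checkpoints.
BASE_URL = "https://huggingface.co/sarulab-speech/UTDUSS-Vocoder/resolve/main"

# Hypothetical short keys for the four variants listed in the tables.
CHECKPOINTS = {
    "default": "expresso_16k_2code.pth",
    "wo_hparam_tuning": "expresso_16k_2code_official.pth",
    "wo_data_exclusion": "expresso_16k_2code_wo_data.pth",
    "wo_matching_sr": "expresso_24k_2code_ab.pth",
}


def checkpoint_url(variant: str) -> str:
    """Return the download URL for a variant key defined in CHECKPOINTS."""
    if variant not in CHECKPOINTS:
        known = ", ".join(sorted(CHECKPOINTS))
        raise ValueError(f"unknown variant {variant!r}; expected one of: {known}")
    return f"{BASE_URL}/{CHECKPOINTS[variant]}"


def local_path(variant: str, cache_dir: str = "/tmp/utduss") -> Path:
    """Return the local file path where the checkpoint would be stored."""
    if variant not in CHECKPOINTS:
        raise ValueError(f"unknown variant {variant!r}")
    return Path(cache_dir) / CHECKPOINTS[variant]
```

With these helpers, the download step in the sample code would become `torch.hub.download_url_to_file(checkpoint_url(v), str(local_path(v)))` for a chosen key `v`, after creating the cache directory as before.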
