---
license: cc-by-nc-4.0
---

# UTDUSS vocoder model

This repository provides the model weights of the [Descript Audio Codec](https://arxiv.org/abs/2306.06546) used for the [Interspeech2024 Speech Processing Using Discrete Speech Unit Challenge](https://www.wavlab.org/activities/2024/Interspeech2024-Discrete-Speech-Unit-Challenge/).

# Prerequisites

The [official dac library](https://github.com/descriptinc/descript-audio-codec), which can be installed with the following command:

```bash
pip install descript-audio-codec
```

# Provided weights

## Vocoder task

| model name on paper | model name on this repo |
|---|---|
| 😀 | expresso_16k_2code.pth |
| 😀 w/o hyper-parameter tuning | expresso_16k_2code_official.pth |
| 😀 w/o data exclusion | expresso_16k_2code_wo_data.pth |
| 😀 w/o matching sampling rate | expresso_24k_2code_ab.pth |

## Acoustic + Vocoder (TTS) task

Please note that the weights for the acoustic model are not provided.

| model name on paper | model name on this repo |
|---|---|
| 😀 | expresso_16k_2code.pth |
| 😀 w/o hyper-parameter tuning | expresso_16k_2code_official.pth |
| 😀 w/o data exclusion | expresso_16k_2code_wo_data.pth |
| 😀 w/o matching sampling rate | expresso_24k_2code_ab.pth |

# Sample code

```python
import dac
import torch
from pathlib import Path

# Download the weight file into a local cache directory.
model_url = "https://huggingface.co/sarulab-speech/UTDUSS-Vocoder/resolve/main/expresso_16k_2code.pth"
model_path = Path(f"/tmp/utduss/{model_url.split('/')[-1]}")
model_path.parent.mkdir(parents=True, exist_ok=True)
torch.hub.download_url_to_file(model_url, str(model_path))

# Load the codec from the downloaded checkpoint.
model = dac.DAC.load(str(model_path))
```

# Contributors

* [Wataru Nakata](https://wataru-nakata.github.io/)
* [Kazuki Yamauchi]()
* [Dong Yang]()
* [Hiroaki Hyodo]()
* [Yuki Saito](https://sython.org)
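
# Selecting another weight file (sketch)

The sample code downloads one fixed file. To try the other weights listed in the tables, the same steps can be wrapped in a small helper. This is only a sketch: the short variant keys, the `weight_url` and `load_model` names, and the skip-if-cached behaviour are assumptions introduced for this example and are not part of the repository or of the dac library. The base URL and file names are copied from this README.

```python
from pathlib import Path

# Base URL taken from the sample code in this README.
BASE_URL = "https://huggingface.co/sarulab-speech/UTDUSS-Vocoder/resolve/main"

# File names copied from the "Provided weights" tables. The short keys are
# hypothetical labels used only in this example.
WEIGHTS = {
    "default": "expresso_16k_2code.pth",
    "wo_hyperparameter_tuning": "expresso_16k_2code_official.pth",
    "wo_data_exclusion": "expresso_16k_2code_wo_data.pth",
    "wo_matching_sampling_rate": "expresso_24k_2code_ab.pth",
}


def weight_url(variant: str = "default") -> str:
    """Return the download URL for one of the weights listed in the tables."""
    if variant not in WEIGHTS:
        raise KeyError(f"unknown variant {variant!r}; choose from {sorted(WEIGHTS)}")
    return f"{BASE_URL}/{WEIGHTS[variant]}"


def load_model(variant: str = "default", cache_dir: str = "/tmp/utduss"):
    """Download a weight file (unless already cached) and load it with dac."""
    # Imported lazily so that weight_url() works without torch or dac installed.
    import dac
    import torch

    path = Path(cache_dir) / WEIGHTS[variant]
    path.parent.mkdir(parents=True, exist_ok=True)
    if not path.exists():
        torch.hub.download_url_to_file(weight_url(variant), str(path))
    return dac.DAC.load(str(path))


if __name__ == "__main__":
    # Print the URL only; call load_model(...) to actually fetch the weights.
    print(weight_url("wo_matching_sampling_rate"))
```

Calling `load_model("wo_data_exclusion")` would then follow the same download-and-load steps as the sample code, with a different file name.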