	---
	license: cc-by-nc-4.0
	---

	# UTDUSS vocoder model
	In this repo, we provide the model weights of the [Descript Audio Codec](https://arxiv.org/abs/2306.06546) used for the [Interspeech2024 Speech Processing Using Discrete Speech Unit Challenge](https://www.wavlab.org/activities/2024/Interspeech2024-Discrete-Speech-Unit-Challenge/).
	# Prerequisites

	The [official DAC library](https://github.com/descriptinc/descript-audio-codec) is required. It can be installed with the following command.
	```bash
	pip install descript-audio-codec
	```

	# Provided weights

	## Vocoder task
	| Model name in the paper | Model name in this repo |
	|---|---|
	| 😀 | expresso_16k_2code.pth |
	| 😀 w/o hyper-parameter tuning | expresso_16k_2code_official.pth |
	| 😀 w/o data exclusion | expresso_16k_2code_wo_data.pth |
	| 😀 w/o matching sampling rate | expresso_24k_2code_ab.pth |

	## Acoustic + Vocoder (TTS) task
	Please note that the weights for the acoustic model are not provided.

	### Full training set
	| Model name in the paper | Model name in this repo |
	|---|---|
	| Discrete-TTS v1, v1.1 | lj_16k_1code.pth |
	| Discrete-TTS v2, v2.2 | lj_16k_1code_512.pth |
	| Discrete-TTS v3 | lj_16k_1code_256.pth |
	### 1h training set
	| Model name in the paper | Model name in this repo |
	|---|---|
	| Discrete-TTS v1, v1.1 | lj_1h_16k_1code.pth |
	| Discrete-TTS v2, v2.2 | lj_1h_16k_1code_512.pth |
	| Discrete-TTS v3 | lj_1h_16k_1code_256.pth |
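
The TTS tables above can be turned into a small lookup that builds a download URL for a given checkpoint. This is only a sketch: the base URL follows the pattern used in the sample code of this README, while `TTS_CHECKPOINTS`, `checkpoint_url`, and the short keys (`"v1"`, `"full"`, `"1h"`, and so on) are hypothetical names introduced for illustration, not part of the repo or the DAC library.

```python
# Base URL pattern taken from the sample code in this README.
BASE_URL = "https://huggingface.co/sarulab-speech/UTDUSS-Vocoder/resolve/main"

# Hypothetical lookup built from the TTS tables in this README.
# "v1" covers Discrete-TTS v1 and v1.1; "v2" covers Discrete-TTS v2 and v2.2.
TTS_CHECKPOINTS = {
    ("v1", "full"): "lj_16k_1code.pth",
    ("v2", "full"): "lj_16k_1code_512.pth",
    ("v3", "full"): "lj_16k_1code_256.pth",
    ("v1", "1h"): "lj_1h_16k_1code.pth",
    ("v2", "1h"): "lj_1h_16k_1code_512.pth",
    ("v3", "1h"): "lj_1h_16k_1code_256.pth",
}


def checkpoint_url(version: str, training_set: str = "full") -> str:
    """Return the download URL for a TTS vocoder checkpoint.

    Raises KeyError if the (version, training_set) pair is not in the tables.
    """
    filename = TTS_CHECKPOINTS[(version, training_set)]
    return f"{BASE_URL}/{filename}"
```

For example, `checkpoint_url("v3", "1h")` points at `lj_1h_16k_1code_256.pth`, and the result can be passed to a downloader such as `torch.hub.download_url_to_file`.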

	# Sample code

	```python
	import dac
	import torch
	from pathlib import Path

	# Download the checkpoint into a local cache directory (skipped if already present).
	model_url = "https://huggingface.co/sarulab-speech/UTDUSS-Vocoder/resolve/main/expresso_16k_2code.pth"
	model_path = Path("/tmp/utduss") / model_url.split("/")[-1]
	model_path.parent.mkdir(parents=True, exist_ok=True)
	if not model_path.exists():
	    torch.hub.download_url_to_file(model_url, str(model_path))

	# Load the DAC model from the downloaded weights.
	model = dac.DAC.load(str(model_path))
	```
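
Once a checkpoint is loaded, a round trip through the codec (waveform to discrete codes and back) might look like the sketch below. It assumes the checkpoint exposes the `preprocess`, `encode`, and `decode` methods of the official DAC library, and that the input is resampled to the model's own sampling rate. `resynthesize`, `demo`, and the file paths are hypothetical names used only for illustration.

```python
def resynthesize(model, waveform, sample_rate):
    """Encode a waveform into discrete codes and decode it back.

    `waveform` is expected to have shape (batch, 1, samples), and
    `sample_rate` must match the sampling rate the model was trained on.
    """
    x = model.preprocess(waveform, sample_rate)  # pads to a multiple of the hop length
    z, codes, *_ = model.encode(x)               # z: quantized latents, codes: code indices
    return codes, model.decode(z)                # reconstructed waveform


def demo(model_path, wav_path, out_path="resynth.wav"):
    """Hypothetical end-to-end usage; requires descript-audio-codec to be installed."""
    import dac
    import torch
    from audiotools import AudioSignal

    model = dac.DAC.load(model_path)
    model.eval()

    # Match the input to the model: mono audio at the model's sampling rate.
    signal = AudioSignal(wav_path)
    signal = signal.resample(model.sample_rate).to_mono()

    with torch.no_grad():
        codes, y = resynthesize(model, signal.audio_data, signal.sample_rate)

    AudioSignal(y.cpu(), sample_rate=model.sample_rate).write(out_path)
    return codes
```

Keeping `resynthesize` free of library imports makes the data flow easy to check with a stand-in model before running it on real weights.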

	# Contributors
	* [Wataru Nakata](https://wataru-nakata.github.io/)
	* Kazuki Yamauchi
	* Dong Yang
	* Hiroaki Hyodo
	* [Yuki Saito](https://sython.org)