tnkmr
/

sfi_convtasnet_td_mpgtf_musdb18hq

Model card Files Files and versions Community

sfi_convtasnet_td_mpgtf_musdb18hq / README.md

tnkmr's picture

Update README.md

dc31fc4 over 1 year ago

|

history blame contribute delete

No virus

1.37 kB

	---
	license: mit
	language:
	- ja
	tags:
	- music
	- audio
	- audio-to-audio
	- SFI
	datasets:
	- MUSDB18-HQ
	metrics:
	- SDR
	---

	# Sampling-frequency-independent (SFI) Conv-TasNet trained with the MUSDB18-HQ dataset for music source separation
	This model was proposed in [our IEEE/ACM Trans. ASLP paper](https://doi.org/10.1109/TASLP.2022.3203907) and works well with untrained sampling frequencies by using sampling-frequency-independent convolutional layers with the time domain filter design.
	The latent analog filter is a multiphase gammatone filter.
	It was trained by Tomohiko Nakamura using [the codebase](https://github.com/TomohikoNakamura/sfi_convtasnet)).
	This model was trained with 32 kHz-sampled data but works well with untrained sampling frequencies (e.g., 8, 16 kHz).

	# License
	MIT

	# Citation
	Please cite the following paper.
	```
	@article{KSaito2022IEEEACMTASLP,
	author={Saito, Koichi and Nakamura, Tomohiko and Yatabe, Kohei and Saruwatari, Hiroshi},
	journal = {IEEE/ACM Transactions on Audio, Speech, and Language Processing},
	title = {Sampling-frequency-independent convolutional layer and its application to audio source separation},
	year=2022,
	month=sep,
	volume=30,
	pages={2928--2943},
	doi={10.1109/TASLP.2022.3203907},
	}
	```

	# Contents
	- Four trained models (seed=40,42,44,47)
	- Evaluation results (json files obtained with the museval library)