JunzheJosephZhu/MultiDecoderDPRNN


JunzheJosephZhu committed on Jun 7, 2021

Commit 99e9b7b • 1 Parent(s): 7093ec8

readme

Files changed (1)

README.md +65 -0

README.md ADDED

	@@ -0,0 +1,65 @@

+---
+tags:
+- asteroid
+- audio
+- MultiDecoderDPRNN
+datasets:
+- Wsj0MixVar
+- sep_clean
+inference: false
+---
+## Asteroid model
+## Description:
+Refer to paper "Multi-Decoder DPRNN: High Accuracy Source Counting and Separation",
+        Junzhe Zhu, Raymond Yeh, Mark Hasegawa-Johnson. https://arxiv.org/abs/2011.12022
+Demo Page: https://junzhejosephzhu.github.io/Multi-Decoder-DPRNN/
+Original research repo is at https://github.com/JunzheJosephZhu/MultiDecoder-DPRNN
+This model was trained by Joseph Zhu using the wsj0-mix-var/Multi-Decoder-DPRNN recipe in Asteroid.
+It was trained on the `sep_clean` task of the Wsj0MixVar dataset.
+## Training config:
+```yaml
+filterbank:
+  n_filters: 64
+  kernel_size: 8
+  stride: 4
+masknet:
+  n_srcs: [2, 3, 4, 5]
+  bn_chan: 128
+  hid_size: 128
+  chunk_size: 128
+  hop_size: 64
+  n_repeats: 8
+  mask_act: 'sigmoid'
+  bidirectional: true
+  dropout: 0
+  use_mulcat: false
+training:
+  epochs: 200
+  batch_size: 2
+  num_workers: 2
+  half_lr: yes
+  lr_decay: yes
+  early_stop: yes
+  gradient_clipping: 5
+optim:
+  optimizer: adam
+  lr: 0.001
+  weight_decay: 0.00000
+data:
+  train_dir: "data/{}speakers/wav8k/min/tr"
+  valid_dir: "data/{}speakers/wav8k/min/cv"
+  task: sep_clean
+  sample_rate: 8000
+  seglen: 4.0
+  minlen: 2.0
+loss:
+  lambda: 0.05
+```
+## Results:
+```yaml
+tmux attach -t 2
+```
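
To make the training config in the added README more concrete, here is a minimal sketch of the arithmetic implied by the `filterbank`, `masknet`, and `data` values. It is not part of the commit and does not use the Asteroid API. The values are copied into a plain dict, the helper `strided_windows` is a hypothetical name, and the formula assumes an unpadded strided window, which may differ from what the actual recipe does (the real model may pad before encoding or chunking).

```python
# Values copied from the training config in the README diff above.
config = {
    "filterbank": {"n_filters": 64, "kernel_size": 8, "stride": 4},
    "masknet": {"chunk_size": 128, "hop_size": 64},
    "data": {"sample_rate": 8000, "seglen": 4.0},
}


def strided_windows(length, window, hop):
    """Number of full windows of size `window` taken every `hop` steps.

    Hypothetical helper: assumes no padding, which is an assumption
    and not necessarily what the Asteroid recipe does.
    """
    if length < window:
        return 0
    return (length - window) // hop + 1


# One training segment: sample_rate * seglen samples.
n_samples = int(config["data"]["sample_rate"] * config["data"]["seglen"])

# Encoder frames produced by a kernel of 8 samples with stride 4.
n_frames = strided_windows(
    n_samples,
    config["filterbank"]["kernel_size"],
    config["filterbank"]["stride"],
)

# Overlapping chunks of 128 frames with hop 64 (50% overlap), unpadded.
n_chunks = strided_windows(
    n_frames,
    config["masknet"]["chunk_size"],
    config["masknet"]["hop_size"],
)

print(f"samples per segment: {n_samples}")
print(f"encoder frames (unpadded): {n_frames}")
print(f"dual-path chunks (unpadded): {n_chunks}")
```

Under these assumptions, a 4-second segment at 8 kHz yields 32000 samples. The frame and chunk counts printed here are lower bounds relative to any implementation that pads the signal before windowing.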