wyz committed
Commit 7405552
1 Parent(s): 1586734

Update README.md

Files changed (1):
  1. README.md +24 -9
README.md CHANGED
@@ -4,8 +4,6 @@ tags:
   - audio
   - audio-to-audio
 language: en
-datasets:
-- universal_se
 license: cc-by-4.0
 ---
 
@@ -13,18 +11,35 @@ license: cc-by-4.0
 
 ### `wyz/vctk_bsrnn_xtiny_causal`
 
-This model was trained by Emrys365 using universal_se recipe in [espnet](https://github.com/espnet/espnet/).
+This model was trained by Emrys365 based on the universal_se_v1 recipe in [espnet](https://github.com/espnet/espnet/).
 
 ### Demo: How to use in ESPnet2
 
 Follow the [ESPnet installation instructions](https://espnet.github.io/espnet/installation.html)
 if you haven't done that already.
 
-```bash
-cd espnet
-pip install -e .
-cd egs2/universal_se/enh1
-./run.sh --skip_data_prep false --skip_train true --download_model wyz/vctk_bsrnn_xtiny_causal
+To use the model in the Python interface, you could use the following code:
+
+```python
+import soundfile as sf
+from espnet2.bin.enh_inference import SeparateSpeech
+
+# For model downloading + loading
+model = SeparateSpeech.from_pretrained(
+    model_tag="wyz/vctk_bsrnn_xtiny_causal",
+    normalize_output_wav=True,
+    device="cuda",
+)
+# For loading a downloaded model
+# model = SeparateSpeech(
+#     train_config="exp_vctk/xxx/config.yaml",
+#     model_file="exp_vctk/xx/xxxx.pth",
+#     normalize_output_wav=True,
+#     device="cuda",
+# )
+
+audio, fs = sf.read("/path/to/noisy/utt1.flac")
+enhanced = model(audio[None, :], fs=fs)[0]
 ```
 
 
@@ -323,4 +338,4 @@ or arXiv:
 archivePrefix={arXiv},
 primaryClass={cs.CL}
 }
-```
+```
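
A note on the shape convention used in the new Python snippet above: the model call takes a batched `(batch, num_samples)` array (hence `audio[None, :]`) and returns a list with one array per estimated source (hence the trailing `[0]`). A minimal NumPy sketch of that convention, with a toy identity "enhancer" standing in for the actual ESPnet model (`fake_separate_speech` is a hypothetical helper, not part of ESPnet):

```python
import numpy as np

def fake_separate_speech(mixture: np.ndarray, fs: int) -> list:
    """Toy stand-in for SeparateSpeech.__call__.

    Expects `mixture` shaped (batch, num_samples) and returns a list of
    (batch, num_samples) arrays, one per estimated source, mirroring the
    interface used in the README snippet. Here it just copies the input.
    """
    return [mixture.copy()]

fs = 16000
audio = np.zeros(fs, dtype=np.float32)            # 1 s of mono audio
enhanced = fake_separate_speech(audio[None, :], fs=fs)[0]
print(enhanced.shape)  # (1, 16000): batch axis first, then samples
```

The batch axis is kept on the output, so writing the result to disk would index the first batch item (e.g. `enhanced[0]`).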