keithhon committed on
Commit
3f3a339
1 Parent(s): c37a507

Upload samples/VCTK.txt with huggingface_hub

Files changed (1)
  1. samples/VCTK.txt +94 -0
samples/VCTK.txt ADDED
@@ -0,0 +1,94 @@
+ ---------------------------------------------------------------------
+ CSTR VCTK Corpus
+ English Multi-speaker Corpus for CSTR Voice Cloning Toolkit
+
+ (Version 0.92)
+ RELEASE September 2019
+ The Centre for Speech Technology Research
+ University of Edinburgh
+ Copyright (c) 2019
+
+ Junichi Yamagishi
+
+ ---------------------------------------------------------------------
+
+ Overview
+
+ This CSTR VCTK Corpus includes speech data uttered by 110 English
+ speakers with various accents. Each speaker reads out about 400
+ sentences, which were selected from a newspaper, the rainbow passage
+ and an elicitation paragraph used for the speech accent archive.
+
+ The newspaper texts were taken from Herald Glasgow, with permission
+ from Herald & Times Group. Each speaker has a different set of the
+ newspaper texts selected based on a greedy algorithm that increases the
+ contextual and phonetic coverage. The details of the text selection
+ algorithms are described in the following paper:
+
+ C. Veaux, J. Yamagishi and S. King,
+ "The voice bank corpus: Design, collection and data analysis of
+ a large regional accent speech database,"
+ https://doi.org/10.1109/ICSDA.2013.6709856
+
+ The rainbow passage and elicitation paragraph are the same for all
+ speakers. The rainbow passage can be found at the International Dialects
+ of English Archive:
+ (http://web.ku.edu/~idea/readings/rainbow.htm). The elicitation
+ paragraph is identical to the one used for the speech accent archive
+ (http://accent.gmu.edu). The details of the speech accent archive
+ can be found at
+ http://www.ualberta.ca/~aacl2009/PDFs/WeinbergerKunath2009AACL.pdf
+
+ All speech data was recorded using an identical recording setup: an
+ omni-directional microphone (DPA 4035) and a small diaphragm condenser
+ microphone with very wide bandwidth (Sennheiser MKH 800), 96 kHz
+ sampling frequency at 24 bits and in a hemi-anechoic chamber of
+ the University of Edinburgh. (However, two speakers, p280 and p315,
+ had technical issues with the audio recordings made using the MKH 800.)
+ All recordings were converted to 16 bits, downsampled to
+ 48 kHz, and manually end-pointed.
+
+ This corpus was originally intended for HMM-based text-to-speech synthesis
+ systems, especially for speaker-adaptive HMM-based speech synthesis
+ that uses average voice models trained on multiple speakers and speaker
+ adaptation technologies. This corpus is also suitable for DNN-based
+ multi-speaker text-to-speech synthesis systems and waveform modeling.
+
+ COPYING
+
+ This corpus is licensed under the Creative Commons License: Attribution 4.0 International
+ http://creativecommons.org/licenses/by/4.0/legalcode
+
+ VCTK VARIANTS
+ There are several variants of the VCTK corpus:
+ Speech enhancement
+ - Noisy speech database for training speech enhancement algorithms and TTS models where we added various types of noises to VCTK artificially: http://dx.doi.org/10.7488/ds/2117
+ - Reverberant speech database for training speech dereverberation algorithms and TTS models where we added various types of reverberation to VCTK artificially: http://dx.doi.org/10.7488/ds/1425
+ - Noisy reverberant speech database for training speech enhancement algorithms and TTS models http://dx.doi.org/10.7488/ds/2139
+ - Device Recorded VCTK where speech signals of the VCTK corpus were played back and re-recorded in office environments using relatively inexpensive consumer devices http://dx.doi.org/10.7488/ds/2316
+ - The Microsoft Scalable Noisy Speech Dataset (MS-SNSD) https://github.com/microsoft/MS-SNSD
+
+ ASV and anti-spoofing
+ - Spoofing and Anti-Spoofing (SAS) corpus, which is a collection of synthetic speech signals produced by nine techniques, two of which are speech synthesis and seven of which are voice conversion. All of them were built using the VCTK corpus. http://dx.doi.org/10.7488/ds/252
+ - Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof 2015) Database. This database consists of synthetic speech signals produced by ten techniques and was used in the first Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof 2015) http://dx.doi.org/10.7488/ds/298
+ - ASVspoof 2019: The 3rd Automatic Speaker Verification Spoofing and Countermeasures Challenge database. This database has been used in the 3rd Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof 2019) https://doi.org/10.7488/ds/2555
+
+
+ ACKNOWLEDGEMENTS
+
+ The CSTR VCTK Corpus was constructed by:
+
+ Christophe Veaux (University of Edinburgh)
+ Junichi Yamagishi (University of Edinburgh)
+ Kirsten MacDonald
+
+ The research leading to these results was partly funded from EPSRC
+ grants EP/I031022/1 (NST) and EP/J002526/1 (CAF), from the RSE-NSFC
+ grant (61111130120), and from the JST CREST (uDialogue).
+
+ Please cite this corpus as follows:
+ Christophe Veaux, Junichi Yamagishi, Kirsten MacDonald,
+ "CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit",
+ The Centre for Speech Technology Research (CSTR),
+ University of Edinburgh
+
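
The Overview in the added file says each speaker's newspaper sentences were chosen by a greedy algorithm that increases contextual and phonetic coverage, and it defers the details to the cited paper. As a rough illustration only, greedy coverage selection in general can be sketched as below. This is a generic sketch and not the authors' procedure: the function name `greedy_select`, the `budget` parameter, and the toy unit labels are all hypothetical.

```python
from typing import Dict, List, Set


def greedy_select(candidates: Dict[str, Set[str]], budget: int) -> List[str]:
    """Pick up to `budget` sentences, each time taking the sentence that adds
    the most units (e.g. phones or phones in context) not yet covered."""
    covered: Set[str] = set()
    chosen: List[str] = []
    remaining = dict(candidates)  # shallow copy so the caller's dict is untouched
    while remaining and len(chosen) < budget:
        # Iterate over sorted ids so ties are broken deterministically.
        best = max(sorted(remaining), key=lambda s: len(remaining[s] - covered))
        if not remaining[best] - covered:
            break  # no remaining sentence adds new coverage
        covered |= remaining.pop(best)
        chosen.append(best)
    return chosen


# Toy data (hypothetical): sentence id -> set of unit labels it contains.
toy = {
    "s1": {"a", "b", "c"},
    "s2": {"c", "d"},
    "s3": {"a", "b"},
    "s4": {"e"},
}
selected = greedy_select(toy, budget=3)
print(selected)
```

In this toy run, `s3` is never picked because everything it contains is already covered by `s1`. A real selection would presumably work over phonetic transcriptions of the newspaper text, with the unit inventory and stopping rule taken from the paper.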