blackstone committed
Commit cfd85fd • Parent(s): ab93251
Update README.md

README.md CHANGED
@@ -24,14 +24,13 @@ pipeline_tag: audio-classification
 <iframe src="https://ghbtns.com/github-btn.html?user=speechbrain&repo=speechbrain&type=star&count=true&size=large&v=2" frameborder="0" scrolling="0" width="170" height="30" title="GitHub"></iframe>
 <br/><br/>
 
-# Speaker Verification with ECAPA-TDNN
+# Speaker Verification with ECAPA-TDNN on CNCeleb
 
-This repository
+This repository provides a pretrained ECAPA-TDNN model using SpeechBrain.
 The system can be used to extract speaker embeddings as well.
 It is trained on CNCeleb1 + CNCeleb2 training data.
 
-
-[SpeechBrain](https://speechbrain.github.io). The model performance on CNCeleb1-test set(Cleaned) is:
+The model performance on the CNCeleb1-test set (cleaned) is:
 
 | Release | EER(%) | MinDCF(p=0.01) |
 |:-------------:|:--------------:|:--------------:|
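The table introduced in this hunk reports EER(%) and MinDCF(p=0.01). As a reminder, the EER is the operating point at which the false-acceptance and false-rejection rates coincide. The sketch below computes it from a list of trial scores and same/different-speaker labels; it is a plain NumPy illustration, not the scoring tool used to produce the numbers in this model card.

```python
import numpy as np

def equal_error_rate(scores, labels):
    """EER: threshold where false-acceptance rate == false-rejection rate.

    scores: verification scores, higher = more likely same speaker.
    labels: 1 for same-speaker trials, 0 for different-speaker trials.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    best_gap, eer = np.inf, 1.0
    for t in np.unique(scores):
        far = np.mean(scores[labels == 0] >= t)  # impostor trials accepted
        frr = np.mean(scores[labels == 1] < t)   # target trials rejected
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer
```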
@@ -41,15 +40,16 @@ For a better experience, we encourage you to learn more about
 ## Pipeline description
 
 This system is composed of an ECAPA-TDNN model. It is a combination of convolutional and residual blocks. The embeddings are extracted using attentive statistical pooling. The system is trained with Additive Margin Softmax Loss. Speaker Verification is performed using cosine distance between speaker embeddings.
-You can find our training results (models, logs, etc) [here](
+You can find our training results (models, logs, etc) [here]().
 
 ### Compute your speaker embeddings
 
 ```python
 import torchaudio
 from speechbrain.pretrained import EncoderClassifier
-classifier = EncoderClassifier.from_hparams(source="
-
+classifier = EncoderClassifier.from_hparams(source="blackstone/spkrec-ecapa-cnceleb")
+
+signal, fs = torchaudio.load('tests/samples/ASR/spk1_snt1.wav')
 embeddings = classifier.encode_batch(signal)
 ```
 The system is trained with recordings sampled at 16kHz (single channel).
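The pipeline description in this hunk states that verification is performed with the cosine distance between speaker embeddings. Building on the `encode_batch` example above, a minimal sketch of that scoring step could look like the following; the decision threshold of 0.25 is an illustrative assumption, not a value taken from this repository.

```python
import torch
import torchaudio
from speechbrain.pretrained import EncoderClassifier

classifier = EncoderClassifier.from_hparams(source="blackstone/spkrec-ecapa-cnceleb")

# Embed two utterances (the same test files used elsewhere in this README).
sig1, _ = torchaudio.load('tests/samples/ASR/spk1_snt1.wav')
sig2, _ = torchaudio.load('tests/samples/ASR/spk1_snt2.wav')
emb1 = classifier.encode_batch(sig1)  # shape: [batch, 1, emb_dim]
emb2 = classifier.encode_batch(sig2)

# Cosine similarity between embeddings: higher score = more likely same speaker.
score = torch.nn.functional.cosine_similarity(emb1.squeeze(1), emb2.squeeze(1), dim=-1)
same_speaker = (score > 0.25).item()  # illustrative threshold; tune on held-out trials
```

This mirrors what `SpeakerRecognition.verify_files` in the hunk further down does internally: it wraps the same embedding-plus-cosine comparison and returns the score together with a thresholded decision.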
@@ -59,7 +59,7 @@ The code will automatically normalize your audio (i.e., resampling + mono channel
 
 ```python
 from speechbrain.pretrained import SpeakerRecognition
-verification = SpeakerRecognition.from_hparams(source="
+verification = SpeakerRecognition.from_hparams(source="blackstone/spkrec-ecapa-cnceleb", savedir="pretrained_models/spkrec-ecapa-cnceleb")
 score, prediction = verification.verify_files("tests/samples/ASR/spk1_snt1.wav", "tests/samples/ASR/spk2_snt1.wav") # Different Speakers
 score, prediction = verification.verify_files("tests/samples/ASR/spk1_snt1.wav", "tests/samples/ASR/spk1_snt2.wav") # Same Speaker
 ```
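The surrounding text notes that the model was trained on 16 kHz, single-channel audio and that `verify_files` normalizes its input automatically. When passing tensors to `encode_batch` directly, the same normalization can be done up front as in this sketch; the file name is a placeholder, and the downmix/resample calls are ordinary torchaudio, not APIs specific to this model card.

```python
import torchaudio
from speechbrain.pretrained import EncoderClassifier

classifier = EncoderClassifier.from_hparams(source="blackstone/spkrec-ecapa-cnceleb")

# Placeholder path: any recording that may be stereo and/or not sampled at 16 kHz.
signal, fs = torchaudio.load('my_recording.wav')    # signal: [channels, samples]

signal = signal.mean(dim=0, keepdim=True)           # downmix to mono
if fs != 16000:
    # Resample to the 16 kHz rate the model was trained on.
    signal = torchaudio.functional.resample(signal, orig_freq=fs, new_freq=16000)

embeddings = classifier.encode_batch(signal)
```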