Adel-Moumen committed
Commit • e8ff4dd
1 Parent(s): 2b8af03
typos
README.md
CHANGED
@@ -73,13 +73,13 @@ widget:
 
 # Language Identification from Speech Recordings with ECAPA embeddings on CommonLanguage
 
-This repository provides all the necessary tools to perform language identification from
+This repository provides all the necessary tools to perform language identification from speech recordings with SpeechBrain.
 The system uses a model pretrained on the CommonLanguage dataset (45 languages).
 You can download the dataset [here](https://zenodo.org/record/5036977#.YNzDbXVKg5k)
 The provided system can recognize the following 45 languages from short speech recordings:
 
 ```
-Arabic, Basque, Breton, Catalan, Chinese_China, Chinese_Hongkong, Chinese_Taiwan, Chuvash, Czech, Dhivehi, Dutch, English, Esperanto, Estonian, French, Frisian, Georgian, German, Greek, Hakha_Chin, Indonesian, Interlingua, Italian, Japanese, Kabyle, Kinyarwanda, Kyrgyz, Latvian, Maltese,
+Arabic, Basque, Breton, Catalan, Chinese_China, Chinese_Hongkong, Chinese_Taiwan, Chuvash, Czech, Dhivehi, Dutch, English, Esperanto, Estonian, French, Frisian, Georgian, German, Greek, Hakha_Chin, Indonesian, Interlingua, Italian, Japanese, Kabyle, Kinyarwanda, Kyrgyz, Latvian, Maltese, Mongolian, Persian, Polish, Portuguese, Romanian, Romansh_Sursilvan, Russian, Sakha, Slovenian, Spanish, Swedish, Tamil, Tatar, Turkish, Ukrainian, Welsh
 ```
 
 For a better experience, we encourage you to learn more about
@@ -91,7 +91,7 @@ For a better experience, we encourage you to learn more about
 
 
 ## Pipeline description
-This system is composed of
+This system is composed of an ECAPA model coupled with statistical pooling. A classifier, trained with Categorical Cross-Entropy Loss, is applied on top of that.
 
 The system is trained with recordings sampled at 16kHz (single channel).
 The code will automatically normalize your audio (i.e., resampling + mono channel selection) when calling *classify_file* if needed. Make sure your input tensor is compliant with the expected sampling rate if you use *encode_batch* and *classify_batch*.
@@ -127,7 +127,7 @@ To perform inference on the GPU, add `run_opts={"device":"cuda"}` when calling
 
 ### Training
 The model was trained with SpeechBrain (a02f860e).
-To train it from scratch
+To train it from scratch follow these steps:
 1. Clone SpeechBrain:
 ```bash
 git clone https://github.com/speechbrain/speechbrain/