speechbrain/lang-id-commonlanguage_ecapa

@@ -73,13 +73,13 @@ widget:
 # Language Identification from Speech Recordings with ECAPA embeddings on CommonLanguage
-This repository provides all the necessary tools to perform language identification from speeech recordinfs with SpeechBrain.
 The system uses a model pretrained on the CommonLanguage dataset (45 languages).
 You can download the dataset [here](https://zenodo.org/record/5036977#.YNzDbXVKg5k)
 The provided system can recognize the following 45 languages from short speech recordings:
 ```
-Arabic, Basque, Breton, Catalan, Chinese_China, Chinese_Hongkong, Chinese_Taiwan, Chuvash, Czech, Dhivehi, Dutch, English, Esperanto, Estonian, French, Frisian, Georgian, German, Greek, Hakha_Chin, Indonesian, Interlingua, Italian, Japanese, Kabyle, Kinyarwanda, Kyrgyz, Latvian, Maltese, Mangolian, Persian, Polish, Portuguese, Romanian, Romansh_Sursilvan, Russian, Sakha, Slovenian, Spanish, Swedish, Tamil, Tatar, Turkish, Ukranian, Welsh
 ```
 For a better experience, we encourage you to learn more about
@@ -91,7 +91,7 @@ For a better experience, we encourage you to learn more about
 ## Pipeline description
-This system is composed of a ECAPA model coupled with statistical pooling. A classifier, trained with Categorical Cross-Entropy Loss, is applied on top of that.
 The system is trained with recordings sampled at 16kHz (single channel).
 The code will automatically normalize your audio (i.e., resampling + mono channel selection) when calling *classify_file* if needed. Make sure your input tensor is compliant with the expected sampling rate if you use *encode_batch* and *classify_batch*.
@@ -127,7 +127,7 @@ To perform inference on the GPU, add  `run_opts={"device":"cuda"}`  when calling
 ### Training
 The model was trained with SpeechBrain (a02f860e).
-To train it from scratch follows these steps:
 1. Clone SpeechBrain:
 ```bash
 git clone https://github.com/speechbrain/speechbrain/

 # Language Identification from Speech Recordings with ECAPA embeddings on CommonLanguage
+This repository provides all the necessary tools to perform language identification from speech recordings with SpeechBrain.
 The system uses a model pretrained on the CommonLanguage dataset (45 languages).
 You can download the dataset [here](https://zenodo.org/record/5036977#.YNzDbXVKg5k)
 The provided system can recognize the following 45 languages from short speech recordings:
 ```
+Arabic, Basque, Breton, Catalan, Chinese_China, Chinese_Hongkong, Chinese_Taiwan, Chuvash, Czech, Dhivehi, Dutch, English, Esperanto, Estonian, French, Frisian, Georgian, German, Greek, Hakha_Chin, Indonesian, Interlingua, Italian, Japanese, Kabyle, Kinyarwanda, Kyrgyz, Latvian, Maltese, Mongolian, Persian, Polish, Portuguese, Romanian, Romansh_Sursilvan, Russian, Sakha, Slovenian, Spanish, Swedish, Tamil, Tatar, Turkish, Ukrainian, Welsh
 ```
 For a better experience, we encourage you to learn more about
 ## Pipeline description
+This system is composed of an ECAPA model coupled with statistical pooling. A classifier, trained with Categorical Cross-Entropy Loss, is applied on top of that.
 The system is trained with recordings sampled at 16kHz (single channel).
 The code will automatically normalize your audio (i.e., resampling + mono channel selection) when calling *classify_file* if needed. Make sure your input tensor is compliant with the expected sampling rate if you use *encode_batch* and *classify_batch*.
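
The note above about *encode_batch* and *classify_batch* can be illustrated with a minimal sketch. It assumes PyTorch is installed, that the input is a `(channels, time)` tensor, and that averaging the channels is an acceptable way to get a single channel (the card itself mentions channel selection for *classify_file*, so this is a substitute strategy). The helper names `prepare_for_classify_batch` and `classify` are hypothetical, and the `speechbrain.pretrained` import path and the four-value return of `classify_batch` are assumptions that may differ between SpeechBrain versions.

```python
import torch

TARGET_SAMPLE_RATE = 16000  # the model card states 16 kHz, single channel


def prepare_for_classify_batch(waveform: torch.Tensor, sample_rate: int) -> torch.Tensor:
    """Return a (1, time) mono tensor at 16 kHz from a (channels, time) tensor.

    Hypothetical helper, not part of SpeechBrain.
    """
    if waveform.dim() == 1:
        # Treat a 1-D tensor as a single channel.
        waveform = waveform.unsqueeze(0)
    # Average the channels (assumption: one possible way to obtain mono audio).
    mono = waveform.float().mean(dim=0, keepdim=True)
    if sample_rate != TARGET_SAMPLE_RATE:
        # Imported lazily so torchaudio is only required when resampling is needed.
        import torchaudio.functional as F

        mono = F.resample(mono, orig_freq=sample_rate, new_freq=TARGET_SAMPLE_RATE)
    return mono


def classify(waveform: torch.Tensor, sample_rate: int):
    """Hypothetical wrapper: normalize the input, then call classify_batch."""
    # Assumption: this import path matches the installed SpeechBrain version.
    from speechbrain.pretrained import EncoderClassifier

    classifier = EncoderClassifier.from_hparams(
        source="speechbrain/lang-id-commonlanguage_ecapa",
        savedir="pretrained_models/lang-id-commonlanguage_ecapa",
    )
    batch = prepare_for_classify_batch(waveform, sample_rate)
    # Assumption: classify_batch returns (probabilities, score, index, labels).
    out_prob, score, index, text_lab = classifier.classify_batch(batch)
    return text_lab
```

Calling `classify(waveform, sample_rate)` downloads the pretrained model on first use. `prepare_for_classify_batch` can also be used on its own before *encode_batch*.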
 ### Training
 The model was trained with SpeechBrain (a02f860e).
+To train it from scratch follow these steps:
 1. Clone SpeechBrain:
 ```bash
 git clone https://github.com/speechbrain/speechbrain/