Update README.md
README.md
CHANGED
@@ -31,7 +31,7 @@ img {
| [![Riva Compatible](https://img.shields.io/badge/NVIDIA%20Riva-compatible-brightgreen#model-badge)](#deployment-with-nvidia-riva) |

This model transcribes speech in lowercase Ukrainian alphabet including spaces and apostrophes, and is trained on 69 hours of Ukrainian speech data.
-It is a non-autoregressive "large" variant of Streaming Citrinet, with around 141 million parameters.
+It is a non-autoregressive "large" variant of Streaming Citrinet, with around 141 million parameters. The model is fine-tuned from a pre-trained Russian Citrinet-1024 model on Ukrainian speech data using the cross-language transfer learning [4] approach.
See the [model architecture](#model-architecture) section and [NeMo documentation](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/models.html#conformer-ctc) for complete architecture details.
It is also compatible with NVIDIA Riva for [production-grade server deployments](#deployment-with-nvidia-riva).
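The added sentence above references cross-language transfer learning from a Russian Citrinet-1024 checkpoint. As a rough illustration only, a minimal NeMo-style sketch of that workflow is shown below; the pre-trained checkpoint name, tokenizer directory, manifest paths, and training hyperparameters are illustrative assumptions, not details taken from this model card.

```python
# Rough sketch (not the authors' exact recipe) of cross-language transfer
# learning in NeMo: start from a pre-trained Russian Citrinet-1024 checkpoint,
# swap in a Ukrainian BPE tokenizer, then fine-tune on Ukrainian data.
# Checkpoint name, tokenizer dir, manifest paths, and hyperparameters are assumed.
import pytorch_lightning as pl
import nemo.collections.asr as nemo_asr

# Load a pre-trained Russian Citrinet-1024 CTC model (name assumed here).
asr_model = nemo_asr.models.EncDecCTCModelBPE.from_pretrained(
    model_name="stt_ru_citrinet_1024_gamma_0_25"
)

# Replace the Russian tokenizer with a Ukrainian SentencePiece BPE tokenizer
# built from the Ukrainian training transcripts; the output layer is rebuilt
# for the new vocabulary while the acoustic encoder weights carry over.
asr_model.change_vocabulary(
    new_tokenizer_dir="tokenizers/uk_bpe_1024",
    new_tokenizer_type="bpe",
)

# Point the model at Ukrainian Common Voice manifests (paths assumed).
asr_model.setup_training_data(
    train_data_config={
        "manifest_filepath": "manifests/cv10_uk_train.json",
        "sample_rate": 16000,
        "batch_size": 16,
        "shuffle": True,
    }
)
asr_model.setup_validation_data(
    val_data_config={
        "manifest_filepath": "manifests/cv10_uk_dev.json",
        "sample_rate": 16000,
        "batch_size": 16,
        "shuffle": False,
    }
)

# Fine-tune the whole network on the new language.
trainer = pl.Trainer(devices=1, accelerator="gpu", max_epochs=100)
asr_model.set_trainer(trainer)
trainer.fit(asr_model)
```

The idea is that the acoustic encoder trained on Russian transfers to the closely related Ukrainian task, while the tokenizer and output vocabulary are rebuilt for the new language.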
@@ -88,7 +88,7 @@ The tokenizer for this models was built using the text transcripts of the train

### Datasets

-Model is trained on Mozilla Common Voice Corpus 10.0 dataset comprising of 69 hours of Ukrainian speech.
+The model is trained on the validated Mozilla Common Voice Corpus 10.0 data (excluding the dev and test sets), comprising 69 hours of Ukrainian speech.

## Limitations
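For readers who want to assemble a comparable training subset, here is a hedged sketch of turning the validated Common Voice 10.0 (uk) split, minus the dev and test clips, into a NeMo-style manifest. The directory layout, the assumption that clips were already converted to 16 kHz mono wav, and the output path are illustrative, not details from this card.

```python
# Hypothetical sketch: build a NeMo-style training manifest from the
# "validated" split of Common Voice 10.0 (uk), excluding clips listed in
# dev.tsv or test.tsv. Assumes mp3 clips were already converted to
# 16 kHz mono wav files under clips_wav/; all paths are illustrative.
import json
from pathlib import Path

import pandas as pd
import soundfile as sf

cv_root = Path("cv-corpus-10.0-2022-07-04/uk")  # assumed local layout
wav_dir = cv_root / "clips_wav"                 # assumed pre-converted wavs

validated = pd.read_csv(cv_root / "validated.tsv", sep="\t")

# Drop clips reserved for the dev and test sets.
held_out = set()
for split in ("dev.tsv", "test.tsv"):
    held_out |= set(pd.read_csv(cv_root / split, sep="\t")["path"])
train = validated[~validated["path"].isin(held_out)]

Path("manifests").mkdir(exist_ok=True)
with open("manifests/cv10_uk_train.json", "w", encoding="utf-8") as fout:
    for _, row in train.iterrows():
        wav_path = wav_dir / Path(row["path"]).with_suffix(".wav").name
        entry = {
            "audio_filepath": str(wav_path),
            "duration": sf.info(str(wav_path)).duration,
            "text": row["sentence"].lower(),  # card states lowercase transcripts
        }
        fout.write(json.dumps(entry, ensure_ascii=False) + "\n")
```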
@@ -107,4 +107,5 @@ Check out [Riva live demo](https://developer.nvidia.com/riva#demos).

[1] [Citrinet: Closing the Gap between Non-Autoregressive and Autoregressive End-to-End Models for Automatic Speech Recognition](https://arxiv.org/abs/2104.01721) <br />
[2] [Google Sentencepiece Tokenizer](https://github.com/google/sentencepiece) <br />
-[3] [NVIDIA NeMo Toolkit](https://github.com/NVIDIA/NeMo)
+[3] [NVIDIA NeMo Toolkit](https://github.com/NVIDIA/NeMo) <br />
+[4] [Cross-Language Transfer Learning](https://scholar.google.com/citations?view_op=view_citation&hl=en&user=qmmIGnwAAAAJ&sortby=pubdate&citation_for_view=qmmIGnwAAAAJ:PVjk1bu6vJQC)