Update README.md
README.md
CHANGED
@@ -56,18 +56,18 @@ model-index:
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->

-#
+# Romanian Wav2Vec2

-This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on the [Common Voice 8.0 - Romanian subset](https://huggingface.co/datasets/mozilla-foundation/common_voice_8_0) dataset, with extra training data from [Romanian Speech Synthesis](https://huggingface.co/datasets/gigant/romanian_speech_synthesis_0_8_1) dataset.
+This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on the [Common Voice 8.0 - Romanian subset](https://huggingface.co/datasets/mozilla-foundation/common_voice_8_0) dataset (train + validation + other splits), with extra training data from [Romanian Speech Synthesis](https://huggingface.co/datasets/gigant/romanian_speech_synthesis_0_8_1) dataset (train + test splits).

-
+Without the 5-gram Language Model optimization, it achieves the following results on the evaluation set (Common Voice 8.0, Romanian subset, test split):
 - Loss: 0.1553
 - Wer: 0.1174
 - Cer: 0.0294

 ## Model description

-
+The architecture is based on [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) with a speech recognition CTC head and an added 5-gram language model (using [pyctcdecode](https://github.com/kensho-technologies/pyctcdecode) and [kenlm](https://github.com/kpu/kenlm)). Those libraries are needed in order for the language model-boosted decoder to work.

 ## Intended uses & limitations

@@ -75,7 +75,12 @@ More information needed

 ## Training and evaluation data

-
+Training data :
+- [Common Voice 8.0 - Romanian subset](https://huggingface.co/datasets/mozilla-foundation/common_voice_8_0) : train + validation + other splits
+- [Romanian Speech Synthesis](https://huggingface.co/datasets/gigant/romanian_speech_synthesis_0_8_1) : train + test splits
+
+Evaluation data :
+- [Common Voice 8.0 - Romanian subset](https://huggingface.co/datasets/mozilla-foundation/common_voice_8_0) : test split

 ## Training procedure

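The Wer and Cer figures in the card are word and character error rates. The card does not say how they were computed, so the following is only a minimal sketch of the usual definition: total edit distance divided by total reference length. It assumes whitespace tokenization, no text normalization, and spaces counted as characters for CER. The function names `edit_distance`, `wer`, and `cer` are illustrative and are not taken from the model's evaluation script.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of tokens)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix to each hyp prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of r
                curr[j - 1] + 1,     # insertion of h
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(references, hypotheses):
    """Word error rate: word-level edits over the number of reference words."""
    errors = sum(edit_distance(r.split(), h.split())
                 for r, h in zip(references, hypotheses))
    total = sum(len(r.split()) for r in references)
    return errors / total


def cer(references, hypotheses):
    """Character error rate: character-level edits over reference characters.

    Assumption: spaces count as characters; some toolkits strip them first.
    """
    errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total = sum(len(r) for r in references)
    return errors / total


if __name__ == "__main__":
    refs = ["bună ziua tuturor"]
    hyps = ["buna ziua tuturor"]  # one missing diacritic
    print(f"WER = {wer(refs, hyps):.4f}")
    print(f"CER = {cer(refs, hyps):.4f}")
```

A single wrong diacritic costs a whole word under WER but only one character under CER. That asymmetry is consistent with the card reporting a Cer well below its Wer.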