fix model card
README.md
CHANGED
@@ -20,24 +20,24 @@ model-index:
       type: mozilla-foundation/common_voice_8_0
       args: uz
     metrics:
-    - name: Test WER (no LM)
-      type: wer
-      value: 32.88
-    - name: Test CER (no LM)
-      type: cer
-      value: 6.53
     - name: Test WER (with LM)
       type: wer
       value: 15.065
     - name: Test CER (with LM)
       type: cer
       value: 3.077
+    - name: Test WER (no LM)
+      type: wer
+      value: 32.88
+    - name: Test CER (no LM)
+      type: cer
+      value: 6.53
 ---
 
 # XLS-R-300M Uzbek CV8
 
 This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on the MOZILLA-FOUNDATION/COMMON_VOICE_8_0 - UZ dataset.
-It achieves the following results on the
+It achieves the following results on the validation set:
 - Loss: 0.3063
 - Wer: 0.3852
 - Cer: 0.0777
@@ -49,6 +49,8 @@ For a description of the model architecture, see [facebook/wav2vec2-xls-r-300m](
 The model vocabulary consists of the [Modern Latin alphabet for Uzbek](https://en.wikipedia.org/wiki/Uzbek_alphabet), with punctuation removed.
 Note that the characters <‘> and <’> do not count as punctuation, as <‘> modifies \<o\> and \<g\>, and <’> indicates the glottal stop or a long vowel.
 
+The decoder uses a kenlm language model built on common_voice text.
+
 ## Intended uses & limitations
 
 This model is expected to be of some utility for low-fidelity use cases such as:
@@ -61,7 +63,7 @@ The model is not reliable enough to use as a substitute for live captions for ac
 
 The 50% of the `train` common voice official split was used as training data. The 50% of the official `dev` split was used as validation data, and the full `test` set was used for final evaluation of the model without LM, while the model with LM was evaluated only on 500 examples from the `test` set.
 
-The kenlm language model was compiled from the target sentences of the train + other
+The kenlm language model was compiled from the target sentences of the train + other dataset splits.
 
 ### Training hyperparameters
 
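
Reviewer note: the card reports WER and CER in two scales. The `model-index` values (for example 32.88) look like percentages, while the validation figures (for example 0.3852) look like fractions. As a reminder of what these metrics measure, here is a minimal sketch of word and character error rate computed from Levenshtein distance. It is not the evaluation script used for this model. The function names and the sample sentences are hypothetical, and the sketch assumes spaces count as characters for CER, which may differ from the convention used in the card.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate as a fraction of the reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate as a fraction; spaces are counted (assumption)."""
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical sample pair: one character is missing in the last word.
reference = "men kitob oqidim"
hypothesis = "men kitob oqdim"
print(f"WER: {wer(reference, hypothesis):.4f}  ({100 * wer(reference, hypothesis):.2f}%)")
print(f"CER: {cer(reference, hypothesis):.4f}  ({100 * cer(reference, hypothesis):.2f}%)")
```

Multiplying the fraction by 100 gives the percentage form, so stating the scale next to each number in the card would avoid ambiguity.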