samitizerxu committed
Commit e3f6a2f
1 Parent(s): 4e70173
Added eval results and eval commands
README.md
CHANGED
@@ -12,7 +12,35 @@ datasets:
 - common_voice
 model-index:
 - name: wav2vec2-xls-r-300m-zh-CN
-  results:
+  results:
+  - task:
+      name: Automatic Speech Recognition
+      type: automatic-speech-recognition
+    dataset:
+      name: Common Voice 7
+      type: mozilla-foundation/common_voice_7_0
+      args: zh-CN
+    metrics:
+    - name: Test WER
+      type: wer
+      value: 80
+    - name: Test CER
+      type: cer
+      value: 40.11
+  # - task:
+  #     name: Automatic Speech Recognition
+  #     type: automatic-speech-recognition
+  #   dataset:
+  #     name: Robust Speech Event - Dev Data
+  #     type: speech-recognition-community-v2/dev_data
+  #     args: zh-CN
+  #   metrics:
+  #   - name: Test WER
+  #     type: wer
+  #     value: TBD
+  #   - name: Test CER
+  #     type: cer
+  #     value: TBD
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -133,3 +161,16 @@ The following hyperparameters were used during training:
 - Pytorch 1.10.2+cu102
 - Datasets 1.18.2.dev0
 - Tokenizers 0.11.0
+
+#### Evaluation Commands
+1. To evaluate on `mozilla-foundation/common_voice_7_0` with split `test`
+
+```bash
+python eval.py --model_id samitizerxu/wav2vec2-xls-r-300m-zh-CN --dataset mozilla-foundation/common_voice_7_0 --config zh-CN --split test
+```
+
+2. To evaluate on `speech-recognition-community-v2/dev_data`
+
+```bash
+python eval.py --model_id samitizerxu/wav2vec2-xls-r-300m-zh-CN --dataset speech-recognition-community-v2/dev_data --config zh-CN --split validation --chunk_length_s 5.0 --stride_length_s 1.0
+```
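The `--chunk_length_s 5.0 --stride_length_s 1.0` flags in the second command suggest that long recordings are transcribed in overlapping windows. As a rough illustration only, the sketch below shows one common way to cut audio into such windows. The helper `chunk_windows`, the 16 kHz sample rate, and the equal stride on both sides are assumptions made for this example; none of it is taken from `eval.py`.

```python
from typing import Iterator, Tuple

SAMPLE_RATE = 16_000  # assumed sampling rate; not stated in the commit


def chunk_windows(
    n_samples: int, chunk_len: int, stride: int
) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (start, end, left_overlap, right_overlap) in samples.

    Hypothetical helper: consecutive windows advance by
    chunk_len - 2 * stride, so each window shares `stride` samples
    with its neighbour on either side.
    """
    step = chunk_len - 2 * stride
    if step <= 0:
        raise ValueError("chunk_len must be larger than twice the stride")
    for start in range(0, n_samples, step):
        is_last = start + chunk_len >= n_samples
        end = min(start + chunk_len, n_samples)
        left = 0 if start == 0 else stride   # first window has no left context
        right = 0 if is_last else stride     # last window has no right context
        yield start, end, left, right
        if is_last:
            break


# A 12-second clip cut with 5.0 s chunks and 1.0 s stride.
windows = list(
    chunk_windows(12 * SAMPLE_RATE, int(5.0 * SAMPLE_RATE), int(1.0 * SAMPLE_RATE))
)
for start, end, left, right in windows:
    print(f"{start / SAMPLE_RATE:.1f}s to {end / SAMPLE_RATE:.1f}s, "
          f"overlap left={left} right={right}")
```

In this sketch the overlapping samples only give the model context at window edges. When the per-window outputs are merged, the part kept from each window is `[start + left, end - right)`, and those parts tile the clip without gaps.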
log_mozilla-foundation_common_voice_7_0_zh-CN_test_predictions.txt
ADDED
@@ -0,0 +1,20 @@
+0
+赠桥母亲往外判头。
+1
+这经为指原气火券总共发行了两仓酸辑。
+2
+是月利降大十期年以来的新低点。
+3
+庙中丙而有营联有在下列日子实节庙语会举行继次以庆筑。
+4
+科斯角维其的母亲是金济学学水模士父亲是一名工程师。
+5
+正状表现驻入发烧帕风半随头通深藤通打本替或可色。
+6
+强化大种运属席同工的。
+7
+罗家伦夫人。
+8
+给谁我没有一件好
+9
+富尔顿是一个位于美国阿肯色州亨普斯特德县的市镇。
log_mozilla-foundation_common_voice_7_0_zh-CN_test_targets.txt
ADDED
@@ -0,0 +1,20 @@
+0
+正巧母亲往外探头
+1
+至今为止，元气火箭总共发行了两张专辑。
+2
+失业率降到十七年来的新低点
+3
+庙中匾额有：庙宇楹联有：在下列日子、时节，庙宇会举行祀仪庆祝
+4
+科斯捷维奇的母亲是经济学哲学博士，父亲是一名工程师。
+5
+症状表现诸如发烧、怕风，伴随头痛、身疼痛、打喷嚏或咳嗽。
+6
+强化大众运输系统功能
+7
+罗家伦夫人。
+8
+给谁我没有意见啊
+9
+富尔顿是一个位于美国阿肯色州亨普斯特德县的市镇。
mozilla-foundation_common_voice_7_0_zh-CN_test_eval_results.txt
ADDED
@@ -0,0 +1,2 @@
+WER: 0.8
+CER: 0.4011627906976744
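The two scores above can be reproduced from the logged predictions and targets. The sketch below is an assumption about how they were computed, not the contents of `eval.py`: it pools Levenshtein edits over all ten utterances and splits words on whitespace, so each unsegmented Chinese sentence counts as a single word. `edit_distance` and `error_rate` are hypothetical helper names.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def error_rate(refs: List[Sequence], hyps: List[Sequence]) -> float:
    """Total edits divided by total reference length, pooled over utterances."""
    edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return edits / sum(len(r) for r in refs)


# Copied from the two log files in this commit.
targets = [
    "正巧母亲往外探头",
    "至今为止，元气火箭总共发行了两张专辑。",
    "失业率降到十七年来的新低点",
    "庙中匾额有：庙宇楹联有：在下列日子、时节，庙宇会举行祀仪庆祝",
    "科斯捷维奇的母亲是经济学哲学博士，父亲是一名工程师。",
    "症状表现诸如发烧、怕风，伴随头痛、身疼痛、打喷嚏或咳嗽。",
    "强化大众运输系统功能",
    "罗家伦夫人。",
    "给谁我没有意见啊",
    "富尔顿是一个位于美国阿肯色州亨普斯特德县的市镇。",
]
predictions = [
    "赠桥母亲往外判头。",
    "这经为指原气火券总共发行了两仓酸辑。",
    "是月利降大十期年以来的新低点。",
    "庙中丙而有营联有在下列日子实节庙语会举行继次以庆筑。",
    "科斯角维其的母亲是金济学学水模士父亲是一名工程师。",
    "正状表现驻入发烧帕风半随头通深藤通打本替或可色。",
    "强化大种运属席同工的。",
    "罗家伦夫人。",
    "给谁我没有一件好",
    "富尔顿是一个位于美国阿肯色州亨普斯特德县的市镇。",
]

# Words: whitespace split (one token per sentence here). Characters: one token each.
wer = error_rate([t.split() for t in targets], [p.split() for p in predictions])
cer = error_rate([list(t) for t in targets], [list(p) for p in predictions])
print(f"WER: {wer}")
print(f"CER: {cer}")
```

Under these assumptions, 8 of the 10 sentences differ from their targets, which gives the WER of 0.8, and 69 character edits over 172 reference characters give the CER. Punctuation is counted as it appears in the logs, so part of the character error comes from punctuation that the model does not emit.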