# wav2vec2-xls-r-300m-mixed
Finetuned from https://huggingface.co/facebook/wav2vec2-xls-r-300m on https://github.com/huseinzol05/malaya-speech/tree/master/data/mixed-stt.
This model was finetuned on three languages:
- Malay
- Singlish
- Mandarin
The model was trained on a single RTX 3090 Ti with 24 GB of VRAM, provided by https://mesolitica.com/.
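A minimal inference sketch with the `transformers` library is shown below. The repository id `mesolitica/wav2vec2-xls-r-300m-mixed`, the audio file name, and the greedy decoding step are assumptions for illustration; they are not taken from this card. XLS-R models expect 16 kHz mono audio.

```python
# Minimal greedy-decoding sketch; the repository id below is an assumption
# based on this card's title, not confirmed by the card itself.
import torch
import soundfile as sf
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

MODEL_ID = "mesolitica/wav2vec2-xls-r-300m-mixed"  # assumed repo id

processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
model = Wav2Vec2ForCTC.from_pretrained(MODEL_ID).eval()

# Load a 16 kHz mono recording (placeholder file name).
speech, sample_rate = sf.read("example.wav")
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(inputs.input_values).logits

# Greedy (argmax) CTC decoding, without a language model.
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
```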
## Evaluation set
The evaluation set comes from https://github.com/huseinzol05/malaya-speech/tree/master/pretrained-model/prepare-stt, with the following sizes:
- Malay: 765 samples
- Singlish: 3,579 samples
- Mandarin: 614 samples
It achieves the following results on the evaluation set, as computed in evaluate-gpu.ipynb:
| Evaluation set | CER | WER | CER with LM | WER with LM |
|---|---|---|---|---|
| Mixed | 0.0481054244857041 | 0.1322198446007387 | 0.041196586938584696 | 0.09880169127621556 |
| Malay | 0.051636391937588406 | 0.19561999547293663 | 0.03917689630621449 | 0.12710746406824835 |
| Singlish | 0.0494915200071987 | 0.12763802881676573 | 0.04271234986432335 | 0.09677160640413336 |
| Mandarin | 0.035626554824269824 | 0.07993515937860181 | 0.03487760945087219 | 0.07536807168546154 |
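For reference, below is a minimal sketch of corpus-level CER/WER scoring with the `jiwer` library; using jiwer is an assumption for illustration and not necessarily what evaluate-gpu.ipynb does, and the example sentences are placeholders.

```python
# Sketch of CER/WER scoring with jiwer (assumed tooling, placeholder data).
import jiwer

references = ["saya suka makan nasi lemak"]
predictions = ["saya suka makan nasi lemah"]

print("WER:", jiwer.wer(references, predictions))
print("CER:", jiwer.cer(references, predictions))
```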
The language model used for decoding is from https://huggingface.co/huseinzol05/language-model-bahasa-manglish-combined.
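One common way to apply such a KenLM model is CTC beam-search decoding with `pyctcdecode`; the sketch below, continuing from the inference sketch above, shows that approach. The `.arpa` file name and the alpha/beta values are placeholders, and this is not necessarily how the "with LM" numbers above were produced.

```python
# Continuation of the inference sketch above: decode the CTC logits with a
# KenLM language model via pyctcdecode (assumed tooling, placeholder values).
from pyctcdecode import build_ctcdecoder

# pyctcdecode expects the CTC vocabulary ordered by token id.
vocab_dict = processor.tokenizer.get_vocab()
labels = [tok for tok, _ in sorted(vocab_dict.items(), key=lambda kv: kv[1])]

decoder = build_ctcdecoder(
    labels,
    kenlm_model_path="manglish-combined.arpa",  # placeholder path to the downloaded LM
    alpha=0.5,  # LM weight (placeholder)
    beta=1.0,   # word-insertion bonus (placeholder)
)

# Beam-search decode the (time, vocab) logits of a single utterance.
print(decoder.decode(logits[0].numpy()))
```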