Whisper model fine-tuned on audio from the CommonVoice Ukrainian v10 train and dev sets, plus additional semi-supervised data.
There is a difference in the tokenization of the source data: in our data normalization process, we replace punctuation with "" (the empty string) rather than Whisper's " " (a single space). This mismatch leads to a slight degradation on CommonVoice.
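To make the mismatch described above concrete, here is a minimal sketch of the two punctuation strategies. The regular expression and the function names `normalize_drop` and `normalize_space` are illustrative assumptions, not the actual normalization code used for this model or by Whisper.

```python
import re

# Assumed punctuation pattern: anything that is neither a word character
# nor whitespace. The real normalizers may use a different definition.
_PUNCT = re.compile(r"[^\w\s]")


def normalize_drop(text: str) -> str:
    """Replace punctuation with "" (the strategy described for this model's data)."""
    return " ".join(_PUNCT.sub("", text).split())


def normalize_space(text: str) -> str:
    """Replace punctuation with " " (the Whisper-style strategy described above)."""
    return " ".join(_PUNCT.sub(" ", text).split())


sample = "по-перше, привіт"
print(normalize_drop(sample))   # hyphenated word is joined into one token
print(normalize_space(sample))  # hyphenated word is split into two tokens
```

Because the two strategies can yield different word boundaries for hyphenated or apostrophized words, references and hypotheses normalized differently may disagree on word counts, which is one plausible way such a mismatch could affect WER.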
Evaluation results
- WER on the mozilla-foundation/common_voice_11_0 test set (self-reported): 13.010