---
base_model: facebook/w2v-bert-2.0
language:
- uk
license: "apache-2.0"
tags:
- automatic-speech-recognition
datasets:
- espnet/yodas2
metrics:
- wer
model-index:
- name: w2v-bert-uk-v2.1
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: common_voice_10_0
      type: common_voice_10_0
      config: uk
      split: test
      args: uk
    metrics:
    - name: Wer
      type: wer
      value: 0.0000
---

# w2v-bert-uk `v2.1`

## Community

- **Discord**: https://discord.gg/yVAjkBgmt4
- Speech Recognition: https://t.me/speech_recognition_uk
- Speech Synthesis: https://t.me/speech_synthesis_uk

## Overview

This is a Ukrainian automatic speech recognition model based on `facebook/w2v-bert-2.0`. It is the successor to https://huggingface.co/Yehor/w2v-bert-uk

## Demo

Try the model on your own audio files in the https://huggingface.co/spaces/Yehor/w2v-bert-uk-v2.1-demo Space.

## Usage

```python
# pip install -U torch soundfile transformers

import torch
import soundfile as sf
from transformers import AutoModelForCTC, Wav2Vec2BertProcessor

# Config
model_name = 'Yehor/w2v-bert-2.0-uk-v2.1'
device = 'cuda' if torch.cuda.is_available() else 'cpu'  # or a specific device, e.g. 'cuda:1'
sampling_rate = 16_000

# Load the model and its processor
asr_model = AutoModelForCTC.from_pretrained(model_name).to(device)
processor = Wav2Vec2BertProcessor.from_pretrained(model_name)

paths = [
    'sample1.wav',
]

# Read the audio files; they must already be sampled at `sampling_rate`
audio_inputs = []
for path in paths:
    audio_input, file_sampling_rate = sf.read(path)
    if file_sampling_rate != sampling_rate:
        raise ValueError(f'{path}: expected {sampling_rate} Hz, got {file_sampling_rate} Hz')
    audio_inputs.append(audio_input)

# Extract features and transcribe the audio
inputs = processor(audio_inputs, sampling_rate=sampling_rate, return_tensors='pt')
features = inputs.input_features.to(device)

with torch.inference_mode():
    logits = asr_model(features).logits

# Greedy decoding: take the most likely token for every frame
predicted_ids = torch.argmax(logits, dim=-1)
predictions = processor.batch_decode(predicted_ids)

# Log results
print('Predictions:')
print(predictions)
```
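
The last two steps of the snippet above (`torch.argmax` followed by `processor.batch_decode`) amount to greedy CTC decoding: pick the most likely token per frame, collapse consecutive repeats, and drop the blank token. The sketch below illustrates that idea in plain Python. The vocabulary, the blank index `0`, and the `|` word delimiter are hypothetical stand-ins chosen for the example; the real values come from the tokenizer shipped with the model, and the library's own implementation may differ in detail.

```python
from itertools import groupby

# Hypothetical toy vocabulary (an assumption for illustration only).
# Index 0 plays the role of the CTC blank/pad token, '|' separates words.
VOCAB = ['<pad>', 'п', 'р', 'и', 'в', 'і', 'т', '|']
BLANK_ID = 0
WORD_DELIMITER = '|'


def greedy_ctc_decode(frame_ids):
    """Collapse consecutive repeats, drop blanks, map ids to characters."""
    # 1. Merge runs of the same id: [1, 1, 0, 2] -> [1, 0, 2]
    collapsed = [token_id for token_id, _ in groupby(frame_ids)]
    # 2. Remove blank tokens and look up the remaining characters
    chars = [VOCAB[token_id] for token_id in collapsed if token_id != BLANK_ID]
    # 3. Turn word delimiters into spaces
    return ''.join(chars).replace(WORD_DELIMITER, ' ').strip()


# Per-frame argmax ids for one utterance, shaped like one row of `predicted_ids`
frame_ids = [1, 1, 0, 2, 3, 3, 0, 4, 5, 5, 6, 0, 0]
print(greedy_ctc_decode(frame_ids))  # привіт
```

Because repeats are collapsed before blanks are removed, a blank between two identical ids keeps a doubled letter: `[6, 0, 6]` decodes to `тт`, while `[6, 6]` decodes to `т`.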