Update README.md
README.md
@@ -6,43 +6,13 @@ pipeline_tag: automatic-speech-recognition
 ---
 Whisper ASR for Kyrgyz Language is an automatic speech recognition (ASR) solution customized for the Kyrgyz language. It is based on the pre-trained Whisper model and has undergone fine-tuning and adaptation to accurately transcribe Kyrgyz speech, taking into account its specific phonetic intricacies.
 
-To run the model, first install:
-```bash
-!pip install datasets>=2.6.1
-!pip install git+https://github.com/huggingface/transformers
-!pip install librosa
-!pip install evaluate>=0.30
-!pip install jiwer
-!pip install gradio==3.50.2
-```
 
-Linking the notebook to the Hub is straightforward - it simply requires entering your Hub authentication token when prompted.
-```python
-from huggingface_hub import notebook_login
 
-notebook_login()
 ```
-
-Now that we've fine-tuned our model, we can build a demo to show off its ASR capabilities! We'll use 🤗 Transformers pipeline, which will take care of the entire ASR pipeline, right from pre-processing the audio inputs to decoding the model predictions. We'll build our interactive demo with Gradio. Gradio is arguably the most straightforward way of building machine learning demos; with Gradio, we can build a demo in just a matter of minutes!
-
-Running the example below will generate a Gradio demo where we can record speech through the microphone of our computer and input it to our fine-tuned Whisper model to transcribe the corresponding text:
-```python
-from transformers import pipeline
-import gradio as gr
-
-pipe = pipeline(model="UlutSoftLLC/whisper-small-kyrgyz")
+pipe = pipeline(model="the-cramer-project/AkylAI-STT-small")
 
 def transcribe(audio):
     text = pipe(audio)["text"]
     return text
 
-iface = gr.Interface(
-    fn=transcribe,
-    inputs=gr.Audio(source="microphone", type="filepath"),
-    outputs="text",
-    title="Whisper Small Kyrgyz",
-    description="Realtime demo for Kyrgyz speech recognition using a fine-tuned Whisper small model.",
-)
-
-iface.launch()
 ```
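Note that the snippet left in the README after this change calls `pipeline` without importing it. A minimal self-contained sketch of the same transcription helper follows, assuming `transformers` (with a backend such as PyTorch) and ffmpeg are installed. The `load_pipe` helper, the explicit task string, and passing `pipe` as an argument are illustrative choices and do not appear in the README.

```python
MODEL_ID = "the-cramer-project/AkylAI-STT-small"  # model id from the updated README


def load_pipe(model_id=MODEL_ID):
    """Build the ASR pipeline (hypothetical helper, not in the README)."""
    # Lazy import so transcribe() can be exercised without transformers installed.
    from transformers import pipeline

    # Naming the task explicitly is an assumption; the README lets the Hub's
    # pipeline_tag (automatic-speech-recognition) determine it.
    return pipeline("automatic-speech-recognition", model=model_id)


def transcribe(audio, pipe):
    """Return the transcript for an audio file path using the given pipeline."""
    return pipe(audio)["text"]


# Usage (downloads the model on first run; "sample.wav" is a placeholder path):
# pipe = load_pipe()
# print(transcribe("sample.wav", pipe))
```

Passing the pipeline in as an argument keeps `transcribe` free of global state, which makes it easy to swap in a different checkpoint or a stub when testing.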