added updates

Files changed (5)

README.md +75 -0
VALID_shona_sna_audio_data.csv +7 -0
afrospeech-wav2vec-sna_METRICS_VALID.json +1 -0
afrospeech-wav2vec-sna_confusion_matrix_VALID.png +0 -0
digits-bar-plot-for-afrospeech-wav2vec-sna.png +0 -0

README.md ADDED

	@@ -0,0 +1,75 @@

+---
+license: apache-2.0
+tags:
+- afro-digits-speech
+datasets:
+- crowd-speech-africa
+metrics:
+- accuracy
+model-index:
+- name: afrospeech-wav2vec-sna
+  results:
+  - task:
+      name: Audio Classification
+      type: audio-classification
+    dataset:
+      name: Afro Speech
+      type: chrisjay/crowd-speech-africa
+      args: no
+    metrics:
+    - name: Validation Accuracy
+      type: accuracy
+      value: 1.0
+---
+# afrospeech-wav2vec-sna
+This model is a fine-tuned version of [facebook/wav2vec2-base](https://huggingface.co/facebook/wav2vec2-base) on the [crowd-speech-africa](https://huggingface.co/datasets/chrisjay/crowd-speech-africa) dataset, which was crowd-sourced through the [afro-speech Space](https://huggingface.co/spaces/chrisjay/afro-speech). It achieves the following results on the [validation set](VALID_shona_sna_audio_data.csv):
+- F1: 1.0
+- Accuracy: 1.0
+The confusion matrix below breaks the model's performance down by digit, so per-digit precision and recall can be read directly from it.
+![confusion matrix](afrospeech-wav2vec-sna_confusion_matrix_VALID.png)
+## Training and evaluation data
+The model was trained on mixed audio data in Shona (`sna`).
+- Size of training set: 24
+- Size of validation set: 6
+Below is the distribution of the dataset (training and validation):
+![digits-bar-plot-for-afrospeech](digits-bar-plot-for-afrospeech-wav2vec-sna.png)
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 3e-05
+- train_batch_size: 64
+- eval_batch_size: 64
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- num_epochs: 150
+### Training results
+| Training Loss | Epoch | Validation Accuracy |
+|:-------------:|:-----:|:-------------------:|
+| 0.02387       | 1     | 1.0                 |
+| 0.0021066     | 50    | 1.0                 |
+| 0.001157      | 100   | 1.0                 |
+| 0.0009537     | 150   | 1.0                 |
+### Framework versions
+- Transformers 4.21.3
+- PyTorch 1.12.0
+- Datasets 1.14.0
+- Tokenizers 0.12.1
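
The hyperparameters and dataset sizes in the README imply a very short training schedule. The sketch below works through that arithmetic. It assumes a single device, no gradient accumulation, and that a final partial batch is kept, none of which the README states; `optimizer_steps` and `HYPERPARAMS` are illustrative names, not part of the training script.

```python
import math

# Values copied from the README's "Training hyperparameters" section.
HYPERPARAMS = {
    "learning_rate": 3e-05,
    "train_batch_size": 64,
    "eval_batch_size": 64,
    "num_epochs": 150,
}
TRAIN_SET_SIZE = 24  # "Size of training set" in the README


def optimizer_steps(num_examples, batch_size, num_epochs):
    """Return (steps per epoch, total steps), keeping a final partial batch.

    Assumes one device and no gradient accumulation (not stated in the README).
    """
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch, steps_per_epoch * num_epochs


steps_per_epoch, total_steps = optimizer_steps(
    TRAIN_SET_SIZE,
    HYPERPARAMS["train_batch_size"],
    HYPERPARAMS["num_epochs"],
)
print(steps_per_epoch, total_steps)
```

Under these assumptions the 24 training clips fit in one batch of 64, so each epoch is a single full-batch update and the 150 epochs amount to 150 optimizer steps.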

VALID_shona_sna_audio_data.csv ADDED

	@@ -0,0 +1,7 @@

+audio_path,transcript,lang,lang_code,gender,age,country,accent
+AUDIO_HOMEPATH/data/R6W5zU8ezS1V76stFCCwXZPPbwxhltrJ/audio.wav,0,shona,sna,Male,34.0,Australia,Shona
+AUDIO_HOMEPATH/data/8wO2rBjlXFkaBU11BhFXZ8JEgTIY0USA/audio.wav,4,shona,sna,Male,23.0,Zimbabwe,
+AUDIO_HOMEPATH/data/3Ojig6rJkV2UvnRrqCpF8CWxQUaVlojm/audio.wav,2,shona,sna,Male,34.0,Australia,Shona
+AUDIO_HOMEPATH/data/ImBUzQW22uPvBx46BR3gkc6iqC7NzRvw/audio.wav,5,shona,sna,Male,34.0,Australia,Shona
+AUDIO_HOMEPATH/data/vudM0Q3QhYQRUSxnWwZLQJPFSpL1S9rk/audio.wav,7,shona,sna,Male,34.0,Australia,Shona
+AUDIO_HOMEPATH/data/M2G2KFjKpMSEvijN6txPxeDm4UlkbMqr/audio.wav,2,shona,sna,Male,23.0,Zimbabwe,
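
To see what the validation split actually covers, the rows above can be tallied with the standard library. This is a minimal sketch: the CSV text is embedded as a string copied from this diff so the example runs on its own, and `load_rows` is an illustrative helper name.

```python
import csv
import io
from collections import Counter

# Contents of VALID_shona_sna_audio_data.csv, copied from the diff above.
VALID_CSV = """audio_path,transcript,lang,lang_code,gender,age,country,accent
AUDIO_HOMEPATH/data/R6W5zU8ezS1V76stFCCwXZPPbwxhltrJ/audio.wav,0,shona,sna,Male,34.0,Australia,Shona
AUDIO_HOMEPATH/data/8wO2rBjlXFkaBU11BhFXZ8JEgTIY0USA/audio.wav,4,shona,sna,Male,23.0,Zimbabwe,
AUDIO_HOMEPATH/data/3Ojig6rJkV2UvnRrqCpF8CWxQUaVlojm/audio.wav,2,shona,sna,Male,34.0,Australia,Shona
AUDIO_HOMEPATH/data/ImBUzQW22uPvBx46BR3gkc6iqC7NzRvw/audio.wav,5,shona,sna,Male,34.0,Australia,Shona
AUDIO_HOMEPATH/data/vudM0Q3QhYQRUSxnWwZLQJPFSpL1S9rk/audio.wav,7,shona,sna,Male,34.0,Australia,Shona
AUDIO_HOMEPATH/data/M2G2KFjKpMSEvijN6txPxeDm4UlkbMqr/audio.wav,2,shona,sna,Male,23.0,Zimbabwe,
"""


def load_rows(text):
    """Parse the CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


rows = load_rows(VALID_CSV)

# The `transcript` column holds the spoken digit, i.e. the class label.
digit_counts = Counter(int(row["transcript"]) for row in rows)
country_counts = Counter(row["country"] for row in rows)

print(len(rows), sorted(digit_counts.items()), dict(country_counts))
```

The six clips cover five distinct digit labels (0, 2, 4, 5 and 7, with 2 appearing twice), so the reported validation accuracy of 1.0 is measured on those labels only.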

afrospeech-wav2vec-sna_METRICS_VALID.json ADDED

	@@ -0,0 +1 @@


+{"acc": 1.0, "f1": 1.0}

afrospeech-wav2vec-sna_confusion_matrix_VALID.png ADDED
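
The README says precision and recall can be read from the confusion matrix. The sketch below shows how those quantities, and the reported accuracy and F1, follow from (true, predicted) label pairs. The labels come from the `transcript` column of the validation CSV. The predictions are set equal to the labels because the reported accuracy is 1.0; they are not read from the image. Macro averaging for F1 is an assumption, since the commit does not say which averaging was used, and all function names are illustrative.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count (true label, predicted label) pairs: a sparse confusion matrix."""
    return Counter(zip(y_true, y_pred))


def precision_recall(y_true, y_pred, label):
    """Per-class precision and recall from the confusion counts."""
    counts = confusion_counts(y_true, y_pred)
    true_positives = counts[(label, label)]
    predicted = sum(n for (_, p), n in counts.items() if p == label)
    actual = sum(n for (t, _), n in counts.items() if t == label)
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (the averaging mode is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        p, r = precision_recall(y_true, y_pred, label)
        scores.append(2 * p * r / (p + r) if (p + r) else 0.0)
    return sum(scores) / len(scores)


# Digit labels from VALID_shona_sna_audio_data.csv.
y_true = [0, 4, 2, 5, 7, 2]
# Accuracy 1.0 means every prediction matches its label.
y_pred = list(y_true)

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print({"acc": accuracy, "f1": macro_f1(y_true, y_pred)})
```

With only six validation clips, a single misclassified clip would lower accuracy to 5/6, so the metrics are coarse at this sample size.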

digits-bar-plot-for-afrospeech-wav2vec-sna.png ADDED