Ahmed107 committed on
Commit
5ce9200
1 Parent(s): 888d6dc

Update README.md

Files changed (1)
  1. README.md +13 -4
README.md CHANGED
@@ -13,18 +13,18 @@ model-index:
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->

- # hamsa-medium

  This model is a fine-tuned version of [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) using [ARBML](https://github.com/ARBML/whisperar) on the [**nadsoft/Jordan-Audio dataset.**](https://huggingface.co/datasets/nadsoft/Jordan-Audio)

  ## Model description

- Hamsa is a Whisper Medium model that has been fine-tuned using the ARBML method in this repository. We have also added some Jordanian data to the model to adapt it to a few shot learning tasks.
-
  ## Intended uses & limitations

- More information needed

  ## Training and evaluation data

  nadsoft/Jordan-Audio
@@ -33,3 +33,12 @@ More information needed

  ### Training hyperparameters

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->

+ # **hamsa-v0.1-beta**

  This model is a fine-tuned version of [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) using [ARBML](https://github.com/ARBML/whisperar) on the [**nadsoft/Jordan-Audio dataset.**](https://huggingface.co/datasets/nadsoft/Jordan-Audio)

  ## Model description

+ Hamsa (همسة) is a pre-trained automatic speech recognition (ASR) model for Arabic, built on the Whisper model. It reflects NADSOFT's commitment to raising the quality of AI results for the Arabic language. It is aimed at the Middle East and North Africa (MENA) region and the broader Arab world, and seeks to address the linguistic nuances and specific needs of these communities.
  ## Intended uses & limitations

+ Hamsa is still under development, and users should be aware of its limitations. For example, it may not accurately transcribe speech from speakers with strong regional accents or dialects, such as Moroccan Arabic. It may also have difficulty transcribing noisy recordings.

+ Hamsa is not a perfect model, and its transcripts should not be used in legal, medical, or other sensitive contexts.
  ## Training and evaluation data

  nadsoft/Jordan-Audio

  ### Training hyperparameters

+ - learning_rate: 1e-05
+ - train_batch_size: 32
+ - eval_batch_size: 16
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_steps: 500
+ - training_steps: 10000 then 4000 for NADSOFT data
+ - mixed_precision_training: Native AMP
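
To make the scheduler settings in the added hyperparameters concrete, here is a minimal sketch of how a linear schedule with warmup would shape the learning rate over the first 10000 steps. It assumes the usual convention of ramping linearly from zero to the peak rate during warmup and then decaying linearly to zero at the final step. The function name `linear_lr` is hypothetical, and the second phase of 4000 steps is not modeled because the card does not say whether the schedule was restarted for it.

```python
# Values taken from the hyperparameter list in the diff above.
PEAK_LR = 1e-05       # learning_rate
WARMUP_STEPS = 500    # lr_scheduler_warmup_steps
TOTAL_STEPS = 10000   # first training phase only (assumption: schedule spans this phase)


def linear_lr(
    step: int,
    peak_lr: float = PEAK_LR,
    warmup_steps: int = WARMUP_STEPS,
    total_steps: int = TOTAL_STEPS,
) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay.

    Hypothetical helper for illustration; not part of the model repository.
    """
    if step < warmup_steps:
        # Warmup: scale grows linearly from 0 to 1.
        scale = step / max(1, warmup_steps)
    else:
        # Decay: scale shrinks linearly from 1 to 0 at total_steps.
        scale = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return peak_lr * scale


if __name__ == "__main__":
    for step in (0, 250, 500, 5250, 10000):
        print(step, linear_lr(step))
```

Under these assumptions the rate peaks at step 500 and falls back to half the peak at step 5250, the midpoint of the decay phase.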