metadata
license: apache-2.0
base_model: openai/whisper-large-v2
tags:
- whisper-event
- generated_from_trainer
datasets:
- facebook/voxpopuli
metrics:
- wer
model-index:
- name: WhisperForSpokenNER
results:
- task:
name: Automatic Speech Recognition
type: automatic-speech-recognition
dataset:
name: facebook/voxpopuli de+es+fr+nl
type: facebook/voxpopuli
config: de+es+fr+nl
split: None
metrics:
- name: Wer
type: wer
value: 0.06196300023221612
WhisperForSpokenNER
This model is a fine-tuned version of openai/whisper-large-v2 on the German, Spanish, French, and Dutch (de+es+fr+nl) subsets of the facebook/voxpopuli dataset. At the final checkpoint (step 5000) it achieves the following results on the evaluation set:
- Loss: 0.2797
- F1 Score: 0.7918
- Label F1: 0.8933
- WER: 0.0620
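For readers unfamiliar with the metric, word error rate is the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The following is a minimal sketch of that definition only; the function name `word_error_rate` and the example sentences are illustrative, and it is an assumption that the reported score follows this plain definition (the evaluation may have applied text normalization or used a metric library instead).

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences (not taken from the dataset): one substitution
# ("sat" -> "sit") and one deletion ("the") over six reference words.
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(round(example_wer, 4))  # 0.3333
```

Under this definition, a WER of 0.0620 corresponds to roughly six word errors per hundred reference words.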
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 32
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 500
- training_steps: 5000
- mixed_precision_training: Native AMP
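Two of these values are derived from the others, which a short sketch can make explicit: the total batch size is the per-device batch size multiplied by the gradient accumulation steps (and the device count), and the cosine schedule determines the learning rate at every step. The code below is a sketch under stated assumptions, not the training script: `NUM_DEVICES = 1` is inferred from 4 × 32 = 128, and `lr_at_step` assumes the common form of linear warmup followed by a half-cosine decay to zero.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 32
WARMUP_STEPS = 500
TRAINING_STEPS = 5000

# Assumption: a single device, inferred from total_train_batch_size = 128.
NUM_DEVICES = 1


def effective_batch_size() -> int:
    """Number of examples contributing to each optimizer update."""
    return TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES


def lr_at_step(step: int) -> float:
    """Learning rate under linear warmup followed by half-cosine decay (assumed shape)."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TRAINING_STEPS - WARMUP_STEPS)
    return LEARNING_RATE * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


print(effective_batch_size())  # 128
for step in (0, 250, 500, 2750, 5000):
    print(step, f"{lr_at_step(step):.2e}")
```

Under these assumptions the rate climbs linearly to 1e-4 at step 500, falls to half of that at step 2750 (the midpoint of the decay phase), and reaches zero at step 5000.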
Training results
Training Loss | Epoch | Step | Validation Loss | F1 Score | Label F1 | WER |
---|---|---|---|---|---|---|
0.1748 | 0.36 | 200 | 0.1706 | 0.6541 | 0.8032 | 0.0656 |
0.1754 | 0.71 | 400 | 0.1769 | 0.7194 | 0.8502 | 0.0674 |
0.1606 | 1.07 | 600 | 0.1856 | 0.6991 | 0.8407 | 0.0708 |
0.1282 | 1.43 | 800 | 0.1835 | 0.7455 | 0.8724 | 0.0728 |
0.131 | 1.79 | 1000 | 0.1762 | 0.7331 | 0.8691 | 0.0713 |
0.0804 | 2.14 | 1200 | 0.1792 | 0.7544 | 0.8744 | 0.0685 |
0.0712 | 2.5 | 1400 | 0.1833 | 0.75 | 0.8846 | 0.0691 |
0.0746 | 2.86 | 1600 | 0.1800 | 0.7554 | 0.8732 | 0.0738 |
0.0331 | 3.22 | 1800 | 0.1992 | 0.7757 | 0.8804 | 0.0702 |
0.0363 | 3.57 | 2000 | 0.1938 | 0.7625 | 0.8805 | 0.0688 |
0.037 | 3.93 | 2200 | 0.1986 | 0.7771 | 0.8865 | 0.0677 |
0.0153 | 4.29 | 2400 | 0.2125 | 0.7765 | 0.8794 | 0.0666 |
0.0144 | 4.65 | 2600 | 0.2115 | 0.7763 | 0.8922 | 0.0681 |
0.0148 | 5.0 | 2800 | 0.2180 | 0.7781 | 0.8891 | 0.0647 |
0.0058 | 5.36 | 3000 | 0.2310 | 0.7918 | 0.8913 | 0.0629 |
0.0058 | 5.72 | 3200 | 0.2268 | 0.7828 | 0.8938 | 0.0627 |
0.0036 | 6.08 | 3400 | 0.2462 | 0.7911 | 0.8937 | 0.0621 |
0.0019 | 6.43 | 3600 | 0.2493 | 0.7948 | 0.8950 | 0.0629 |
0.0016 | 6.79 | 3800 | 0.2543 | 0.7917 | 0.8980 | 0.0631 |
0.0009 | 7.15 | 4000 | 0.2667 | 0.7880 | 0.8944 | 0.0619 |
0.0007 | 7.51 | 4200 | 0.2735 | 0.7909 | 0.8934 | 0.0624 |
0.0007 | 7.86 | 4400 | 0.2756 | 0.7901 | 0.8926 | 0.0621 |
0.0005 | 8.22 | 4600 | 0.2779 | 0.7913 | 0.8931 | 0.0624 |
0.0004 | 8.58 | 4800 | 0.2795 | 0.7909 | 0.8932 | 0.0620 |
0.0005 | 8.94 | 5000 | 0.2797 | 0.7918 | 0.8933 | 0.0620 |
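The table can also be read programmatically, for example to see which evaluation step scores best on each metric. The sketch below transcribes the evaluation columns from the table above into a list of tuples; the names `EVAL_LOG` and `best_step` are illustrative, and nothing here implies that the published weights were selected this way.

```python
# (step, validation_loss, f1, label_f1, wer), transcribed from the table above.
EVAL_LOG = [
    (200, 0.1706, 0.6541, 0.8032, 0.0656),
    (400, 0.1769, 0.7194, 0.8502, 0.0674),
    (600, 0.1856, 0.6991, 0.8407, 0.0708),
    (800, 0.1835, 0.7455, 0.8724, 0.0728),
    (1000, 0.1762, 0.7331, 0.8691, 0.0713),
    (1200, 0.1792, 0.7544, 0.8744, 0.0685),
    (1400, 0.1833, 0.7500, 0.8846, 0.0691),
    (1600, 0.1800, 0.7554, 0.8732, 0.0738),
    (1800, 0.1992, 0.7757, 0.8804, 0.0702),
    (2000, 0.1938, 0.7625, 0.8805, 0.0688),
    (2200, 0.1986, 0.7771, 0.8865, 0.0677),
    (2400, 0.2125, 0.7765, 0.8794, 0.0666),
    (2600, 0.2115, 0.7763, 0.8922, 0.0681),
    (2800, 0.2180, 0.7781, 0.8891, 0.0647),
    (3000, 0.2310, 0.7918, 0.8913, 0.0629),
    (3200, 0.2268, 0.7828, 0.8938, 0.0627),
    (3400, 0.2462, 0.7911, 0.8937, 0.0621),
    (3600, 0.2493, 0.7948, 0.8950, 0.0629),
    (3800, 0.2543, 0.7917, 0.8980, 0.0631),
    (4000, 0.2667, 0.7880, 0.8944, 0.0619),
    (4200, 0.2735, 0.7909, 0.8934, 0.0624),
    (4400, 0.2756, 0.7901, 0.8926, 0.0621),
    (4600, 0.2779, 0.7913, 0.8931, 0.0624),
    (4800, 0.2795, 0.7909, 0.8932, 0.0620),
    (5000, 0.2797, 0.7918, 0.8933, 0.0620),
]

# Column index of each metric inside an EVAL_LOG row.
COLUMNS = {"validation_loss": 1, "f1": 2, "label_f1": 3, "wer": 4}


def best_step(metric: str, higher_is_better: bool) -> int:
    """Return the step whose logged value is best for the given metric."""
    index = COLUMNS[metric]
    pick = max if higher_is_better else min
    return pick(EVAL_LOG, key=lambda row: row[index])[0]


print("best F1:", best_step("f1", higher_is_better=True))
print("best Label F1:", best_step("label_f1", higher_is_better=True))
print("lowest WER:", best_step("wer", higher_is_better=False))
print("lowest validation loss:", best_step("validation_loss", higher_is_better=False))
```

In this log the lowest validation loss occurs at step 200, while F1 and WER keep improving until steps 3600 and 4000 respectively, so the choice of selection metric changes which checkpoint looks best.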
Framework versions
- Transformers 4.37.0.dev0
- PyTorch 2.1.0
- Datasets 2.14.6
- Tokenizers 0.14.1