---
library_name: transformers
license: apache-2.0
base_model: facebook/wav2vec2-large-xlsr-53
tags:
- generated_from_trainer
metrics:
- accuracy
- precision
- recall
- f1
model-index:
- name: speech-emotion-recognition-with-facebook-wav2vec2-large-xlsr-53
  results: []
---


# speech-emotion-recognition-with-facebook-wav2vec2-large-xlsr-53

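This model is a fine-tuned version of [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) for speech emotion recognition. The training dataset is not specified in this card.
It achieves the following results on the evaluation set (these values match the step 9460 row of the training results table, not the final step):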
- Loss: 0.4989
- Accuracy: 0.9168
- Precision: 0.9209
- Recall: 0.9168
- F1: 0.9166
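
Recall equals Accuracy in every reported row while Precision and F1 differ, which is consistent with support-weighted averaging over classes. The card does not state the averaging mode, so treat that as an assumption. The sketch below, in plain Python with made-up placeholder labels, shows why weighted recall always coincides with accuracy.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)  # number of true examples per class
    precision = recall = f1 = 0.0
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == label)
        predicted = sum(1 for p in y_pred if p == label)
        actual = support[label]
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / actual if actual else 0.0
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = actual / n  # class weight = share of true examples
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Placeholder labels for illustration only; they are not the model's classes.
example = weighted_metrics(["a", "a", "b", "c"], ["a", "b", "b", "c"])
```

Each class contributes `(actual / n) * (tp / actual) = tp / n` to the weighted recall, so the sum is the fraction of correct predictions, which is the accuracy.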

## Model description

More information needed

## Intended uses & limitations

More information needed
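
A minimal inference sketch follows, assuming the checkpoint is published on the Hugging Face Hub and loads with the `audio-classification` pipeline from Transformers. The repository id is hypothetical (this card gives only the model name, not a namespace), and the emotion labels come from the checkpoint's own config, which this card does not list. The waveform is assumed to be a 1-D NumPy float array already sampled at 16 kHz, the rate the base model expects.

```python
# Hypothetical repository id: replace the namespace with the real one.
MODEL_ID = "your-namespace/speech-emotion-recognition-with-facebook-wav2vec2-large-xlsr-53"


def top_prediction(results):
    """Return the highest-scoring entry from pipeline output.

    `results` is a list of {"label": str, "score": float} dicts, the shape
    returned by the audio-classification pipeline for a single input.
    """
    return max(results, key=lambda item: item["score"])


def classify(waveform, model_id=MODEL_ID, top_k=3):
    """Classify a 1-D float waveform that is already sampled at 16 kHz."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("audio-classification", model=model_id)
    return classifier(waveform, top_k=top_k)
```

Calling `top_prediction(classify(waveform))` would return the single most likely label with its score. Resample any audio recorded at another rate to 16 kHz before passing it in, because the pipeline does not check the sampling rate of a raw array.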

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 5
- total_train_batch_size: 10
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 25
- mixed_precision_training: Native AMP
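
The effective batch size and the learning-rate schedule implied by these settings can be sketched in plain Python. The total of 9850 optimizer steps is taken from the last row of the training results table. The schedule function is an illustration of linear warmup followed by linear decay, not the Trainer's own code, and the rounding of the warmup step count is an assumption.

```python
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 5
WARMUP_RATIO = 0.1
TOTAL_STEPS = 9850  # final step in the training results table

# 2 samples per forward pass x 5 accumulated passes = 10 samples per update,
# matching total_train_batch_size (consistent with a single device).
effective_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS

# Assumed rounding; 10% of 9850 steps is 985 warmup steps.
warmup_steps = round(TOTAL_STEPS * WARMUP_RATIO)


def linear_lr(step):
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - warmup_steps)
```

Under these assumptions the rate climbs from 0 to 5e-05 over the first 985 steps (about 2.5 epochs), then falls linearly to 0 at step 9850.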

### Training results

| Training Loss | Epoch   | Step | Validation Loss | Accuracy | Precision | Recall | F1     |
|:-------------:|:-------:|:----:|:---------------:|:--------:|:---------:|:------:|:------:|
| 1.9343        | 0.9995  | 394  | 1.9277          | 0.2505   | 0.1425    | 0.2505 | 0.1691 |
| 1.7944        | 1.9990  | 788  | 1.6446          | 0.4574   | 0.5759    | 0.4574 | 0.4213 |
| 1.4601        | 2.9985  | 1182 | 1.3242          | 0.5953   | 0.6183    | 0.5953 | 0.5709 |
| 1.0551        | 3.9980  | 1576 | 1.0764          | 0.6623   | 0.6659    | 0.6623 | 0.6447 |
| 0.8934        | 5.0     | 1971 | 0.9209          | 0.7059   | 0.7172    | 0.7059 | 0.6825 |
| 1.1156        | 5.9995  | 2365 | 0.8292          | 0.7465   | 0.7635    | 0.7465 | 0.7442 |
| 0.6307        | 6.9990  | 2759 | 0.6439          | 0.8043   | 0.8090    | 0.8043 | 0.8020 |
| 0.774         | 7.9985  | 3153 | 0.6666          | 0.7921   | 0.8117    | 0.7921 | 0.7916 |
| 0.5537        | 8.9980  | 3547 | 0.5111          | 0.8245   | 0.8268    | 0.8245 | 0.8205 |
| 0.3762        | 10.0    | 3942 | 0.5506          | 0.8306   | 0.8390    | 0.8306 | 0.8296 |
| 0.716         | 10.9995 | 4336 | 0.5499          | 0.8276   | 0.8465    | 0.8276 | 0.8268 |
| 0.5372        | 11.9990 | 4730 | 0.5463          | 0.8377   | 0.8606    | 0.8377 | 0.8404 |
| 0.3746        | 12.9985 | 5124 | 0.4758          | 0.8611   | 0.8714    | 0.8611 | 0.8597 |
| 0.4317        | 13.9980 | 5518 | 0.4438          | 0.8742   | 0.8843    | 0.8742 | 0.8756 |
| 0.2104        | 15.0    | 5913 | 0.4426          | 0.8803   | 0.8864    | 0.8803 | 0.8806 |
| 0.3193        | 15.9995 | 6307 | 0.4741          | 0.8671   | 0.8751    | 0.8671 | 0.8683 |
| 0.3445        | 16.9990 | 6701 | 0.3850          | 0.9037   | 0.9047    | 0.9037 | 0.9038 |
| 0.2777        | 17.9985 | 7095 | 0.4802          | 0.8834   | 0.8923    | 0.8834 | 0.8836 |
| 0.4406        | 18.9980 | 7489 | 0.4053          | 0.9047   | 0.9096    | 0.9047 | 0.9043 |
| 0.1707        | 20.0    | 7884 | 0.4434          | 0.9067   | 0.9129    | 0.9067 | 0.9069 |
| 0.2138        | 20.9995 | 8278 | 0.5051          | 0.9037   | 0.9155    | 0.9037 | 0.9053 |
| 0.1812        | 21.9990 | 8672 | 0.4238          | 0.8955   | 0.9007    | 0.8955 | 0.8953 |
| 0.3639        | 22.9985 | 9066 | 0.4021          | 0.9138   | 0.9182    | 0.9138 | 0.9143 |
| 0.3193        | 23.9980 | 9460 | 0.4989          | 0.9168   | 0.9209    | 0.9168 | 0.9166 |
| 0.2067        | 24.9873 | 9850 | 0.4959          | 0.8976   | 0.9032    | 0.8976 | 0.8975 |
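
The fractional epoch values in the table follow from gradient accumulation. A short sketch, assuming 1971 micro-batches per epoch, reproduces them. That figure is inferred from the table (step 1971 falls exactly on epoch 5.0 with 5 accumulation steps) and is not stated in the card.

```python
GRAD_ACCUM_STEPS = 5
TRAIN_BATCH_SIZE = 2
NUM_EPOCHS = 25
MICRO_BATCHES_PER_EPOCH = 1971  # inferred from the table, not stated in the card

# Whole optimizer steps that fit in one epoch, and the resulting total.
steps_per_epoch = MICRO_BATCHES_PER_EPOCH // GRAD_ACCUM_STEPS  # 394
total_steps = NUM_EPOCHS * steps_per_epoch  # 9850

# Upper bound on training examples per epoch (one fewer if the last batch is partial).
max_train_examples = MICRO_BATCHES_PER_EPOCH * TRAIN_BATCH_SIZE  # 3942


def epoch_at(step):
    """Fractional epoch reached after `step` optimizer updates."""
    return round(step * GRAD_ACCUM_STEPS / MICRO_BATCHES_PER_EPOCH, 4)


# (step, epoch) pairs copied from the table for comparison.
table_rows = [(394, 0.9995), (788, 1.9990), (1971, 5.0), (9460, 23.9980), (9850, 24.9873)]
mismatches = [(s, e) for s, e in table_rows if epoch_at(s) != e]
```

Because 1971 is not a multiple of 5, most evaluations land slightly before an epoch boundary, and training stops at step 9850 (epoch 24.9873) rather than at exactly 25 epochs.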


### Framework versions

- Transformers 4.44.2
- PyTorch 2.4.1+cu121
- Datasets 3.0.0
- Tokenizers 0.19.1