
multilingual_speech_to_intent_wav2vec

This model is a fine-tuned version of facebook/wav2vec2-base on an unspecified dataset. It achieves the following results on the evaluation set (a minimal inference sketch follows the metrics):

  • Loss: 1.5542
  • Accuracy: 0.7430
  • Precision: 0.8060
  • Recall: 0.7430
  • F1: 0.7456
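
The card does not include a usage example. The sketch below is one unverified way to load the checkpoint with the Hugging Face `audio-classification` pipeline, assuming the repository ships a standard wav2vec2 sequence-classification config and that the audio is 16 kHz mono; the path `command.wav` is a placeholder.

```python
# Hedged inference sketch: loads the published checkpoint with the generic
# audio-classification pipeline and scores a short spoken utterance.
from transformers import pipeline

classifier = pipeline(
    "audio-classification",
    model="KasuleTrevor/multilingual_speech_to_intent_wav2vec",
)

# "command.wav" is a placeholder path to a 16 kHz mono recording.
predictions = classifier("command.wav", top_k=3)
for p in predictions:
    print(f"{p['label']}: {p['score']:.3f}")
```

The predicted labels come from the fine-tuned classification head, i.e. the intent classes used during training, which are not documented in this card.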

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch of these settings follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 100
  • mixed_precision_training: Native AMP
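
For convenience, the listed settings map onto `transformers.TrainingArguments` roughly as sketched below. This is an approximation rather than the exact training script: dataset loading, the model head, and the `Trainer` wiring are omitted, and `output_dir` is a placeholder. Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the library defaults, so it is not set explicitly.

```python
# Sketch of TrainingArguments matching the hyperparameters listed above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="multilingual_speech_to_intent_wav2vec",  # placeholder
    learning_rate=3e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,   # effective train batch size: 64
    lr_scheduler_type="cosine",
    warmup_steps=500,
    num_train_epochs=100,
    fp16=True,                       # Native AMP mixed-precision training
)
```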

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1     |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:---------:|:------:|:------:|
| 2.3588        | 1.0   | 219  | 1.4144          | 0.5916   | 0.6385    | 0.5916 | 0.5322 |
| 0.8825        | 2.0   | 438  | 0.7289          | 0.8195   | 0.8635    | 0.8195 | 0.8243 |
| 0.7836        | 3.0   | 657  | 0.6739          | 0.8514   | 0.8648    | 0.8514 | 0.8513 |
| 0.7345        | 4.0   | 876  | 0.4483          | 0.9080   | 0.9189    | 0.9080 | 0.9071 |
| 0.7204        | 5.0   | 1095 | 0.5039          | 0.8882   | 0.9059    | 0.8882 | 0.8915 |
| 0.5355        | 6.0   | 1314 | 0.5051          | 0.8967   | 0.9049    | 0.8967 | 0.8971 |
| 0.5939        | 7.0   | 1533 | 0.3162          | 0.9314   | 0.9387    | 0.9314 | 0.9322 |
| 0.5311        | 8.0   | 1752 | 0.3218          | 0.9292   | 0.9318    | 0.9292 | 0.9292 |
| 0.5098        | 9.0   | 1971 | 0.5819          | 0.8804   | 0.8858    | 0.8804 | 0.8809 |
| 0.508         | 10.0  | 2190 | 0.5930          | 0.8804   | 0.8843    | 0.8804 | 0.8792 |
| 0.4672        | 11.0  | 2409 | 0.3127          | 0.9229   | 0.9251    | 0.9229 | 0.9222 |
| 0.4619        | 12.0  | 2628 | 0.3761          | 0.9193   | 0.9227    | 0.9193 | 0.9193 |
| 0.4668        | 13.0  | 2847 | 0.6386          | 0.8740   | 0.8800    | 0.8740 | 0.8726 |
| 0.444         | 14.0  | 3066 | 0.4134          | 0.9073   | 0.9133    | 0.9073 | 0.9079 |
| 0.4059        | 15.0  | 3285 | 0.3106          | 0.9349   | 0.9370    | 0.9349 | 0.9347 |
| 0.3857        | 16.0  | 3504 | 0.3639          | 0.9222   | 0.9296    | 0.9222 | 0.9217 |
| 0.432         | 17.0  | 3723 | 0.5168          | 0.8896   | 0.8977    | 0.8896 | 0.8885 |
| 0.3909        | 18.0  | 3942 | 1.0967          | 0.8004   | 0.8269    | 0.8004 | 0.8022 |
| 0.4341        | 19.0  | 4161 | 0.7655          | 0.8556   | 0.8624    | 0.8556 | 0.8554 |
| 0.3673        | 20.0  | 4380 | 0.2394          | 0.9505   | 0.9525    | 0.9505 | 0.9505 |
| 0.3784        | 21.0  | 4599 | 0.4200          | 0.9207   | 0.9228    | 0.9207 | 0.9202 |
| 0.4064        | 22.0  | 4818 | 0.5932          | 0.8818   | 0.8876    | 0.8818 | 0.8820 |
| 0.3825        | 23.0  | 5037 | 0.9998          | 0.8493   | 0.8616    | 0.8493 | 0.8484 |
| 0.3485        | 24.0  | 5256 | 1.1882          | 0.7877   | 0.8071    | 0.7877 | 0.7888 |
| 0.3242        | 25.0  | 5475 | 0.5562          | 0.9073   | 0.9118    | 0.9073 | 0.9076 |
| 0.3526        | 26.0  | 5694 | 0.6743          | 0.8832   | 0.8927    | 0.8832 | 0.8825 |
| 0.3573        | 27.0  | 5913 | 0.3483          | 0.9271   | 0.9313    | 0.9271 | 0.9272 |
| 0.3381        | 28.0  | 6132 | 1.1346          | 0.8018   | 0.8152    | 0.8018 | 0.8017 |
| 0.3243        | 29.0  | 6351 | 0.9003          | 0.8316   | 0.8439    | 0.8316 | 0.8315 |
| 0.3045        | 30.0  | 6570 | 0.9181          | 0.8493   | 0.8570    | 0.8493 | 0.8482 |
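
The reported recall matches accuracy at every checkpoint, which is consistent with weighted averaging over the intent classes. The `compute_metrics` function below is a hypothetical sketch of such a setup using scikit-learn; the actual metric code used for this model is not documented in the card.

```python
# Hypothetical compute_metrics consistent with the reported numbers
# (weighted averaging, so recall equals accuracy). Not confirmed by the card.
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="weighted", zero_division=0
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```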

Framework versions

  • Transformers 4.43.3
  • PyTorch 2.1.0+cu118
  • Datasets 2.20.0
  • Tokenizers 0.19.1
