Reproducing Hindi fine-tuning
Hey @sanchit-gandhi,
I tried to reproduce your Hindi fine-tuning from https://huggingface.co/blog/fine-tune-whisper with the exact same code as you, but somehow I don't get the same performance.
I'm not sure what causes the different results; maybe you can point me in the right direction? The link below contains the script I'm running:
My results:
| Training Loss | Epoch | Step | Validation Loss | WER |
|---|---|---|---|---|
| 0.0885 | 2.44 | 1000 | 0.2941 | 34.8684 |
| 0.0222 | 4.89 | 2000 | 0.3481 | 37.2894 |
| 0.0023 | 7.33 | 3000 | 0.4163 | 61.7286 |
| 0.0004 | 9.78 | 4000 | 0.4440 | 79.7511 |
| 0.0002 | 12.22 | 5000 | 0.4595 | 82.5277 |
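For reference, here's a minimal sketch of the training configuration I'm comparing against, with the hyperparameter values as I read them in the blog post (please double-check them against the post itself; note my table above runs to step 5000, past the post's 4000 steps, and I've omitted the logging/Hub options):

```python
from transformers import Seq2SeqTrainingArguments

# Hyperparameters as given in the fine-tune-whisper blog post (to the best
# of my reading) -- verify against the post before comparing runs.
training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-small-hi",  # output dir used in the post
    per_device_train_batch_size=16,
    gradient_accumulation_steps=1,
    learning_rate=1e-5,
    warmup_steps=500,
    max_steps=4000,
    gradient_checkpointing=True,
    fp16=True,
    evaluation_strategy="steps",
    per_device_eval_batch_size=8,
    predict_with_generate=True,
    generation_max_length=225,
    save_steps=1000,
    eval_steps=1000,
    logging_steps=25,
    load_best_model_at_end=True,
    metric_for_best_model="wer",
    greater_is_better=False,
)
```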
Transformers 4.26.0.dev0
PyTorch 1.13.1+cu117
Datasets 2.8.0
Tokenizers 0.13.2
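(These versions can be printed with a snippet like the one below, or with `transformers-cli env`.)

```python
# Print the library versions relevant to reproducing this run.
import datasets
import tokenizers
import torch
import transformers

print("Transformers", transformers.__version__)
print("PyTorch", torch.__version__)
print("Datasets", datasets.__version__)
print("Tokenizers", tokenizers.__version__)
```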
Hey @5amuel! Sorry for the late reply! That's super weird; it should be possible to get identical results if you run the example script start to finish. Let me re-run training and get back to you with my results.
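(One thing worth checking on both sides is seeding: two runs are only expected to match if the RNGs are fixed. The example script relies on the Trainer's seeding, which amounts to roughly this sketch, 42 being the Trainer default:)

```python
from transformers import set_seed

# Seeds Python's `random`, NumPy and PyTorch in one call. The Trainer does
# this internally (default seed 42), so two start-to-finish runs of the same
# script should see the same data order and the same initialisation.
set_seed(42)
```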
Hey @5amuel! I re-ran training with transformers installed from main (i.e. the development version, hence the 4.26.0.dev0 version string) and the default training arguments: https://wandb.ai/sanchit-gandhi/huggingface/runs/7ojjc1py/overview?workspace=user-sanchit-gandhi
You can see from these logs that the results I got were identical to those in the blog post:
https://wandb.ai/sanchit-gandhi/huggingface/runs/7ojjc1py?workspace=
=> this suggests to me that everything is in order!
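(If you want to line your WER numbers up against mine: the blog computes WER with the evaluate library, roughly as in this sketch, where `preds` and `refs` are placeholder lists of decoded strings:)

```python
import evaluate

# WER as reported in the tables above: a percentage, lower is better.
wer_metric = evaluate.load("wer")

preds = ["namaste duniya"]  # placeholder: decoded model predictions
refs = ["namaste duniya"]   # placeholder: reference transcriptions

wer = 100 * wer_metric.compute(predictions=preds, references=refs)
print(f"WER: {wer:.4f}")
```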