3rd Place Solution - Naive Solution :D
Hi All,
First of all, I'd like to thank @huggingface-co and Data Driven Science for hosting an amazing competition where we can really learn from different people across the community and grow. Thanks a lot for that, and a big thanks to @abhishek as well.
The data was super clean. I concatenated movie_name + synopsis, since a synopsis-only model was not giving the best LB score :D I used stratified 5-fold cross-validation, and for prediction I used the idxmax function to pick the genre with the highest predicted score. My solution is super naive :)
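To make the data preparation concrete, here is a minimal sketch of the concatenation and fold assignment. The toy DataFrame, the `genre`, `text` and `fold` column names, the space separator and the random seed are all assumptions for illustration, and scikit-learn's `StratifiedKFold` is just one common way to get stratified folds, not necessarily what was used here.

```python
import pandas as pd
from sklearn.model_selection import StratifiedKFold

# Toy stand-in for the competition data; column names are assumptions.
train = pd.DataFrame({
    "movie_name": [f"Movie {i}" for i in range(10)],
    "synopsis": [f"Synopsis text {i}" for i in range(10)],
    "genre": ["action", "comedy"] * 5,
})

# Concatenate title and synopsis into a single model input.
train["text"] = train["movie_name"] + " " + train["synopsis"]

# Assign every row to one of 5 validation folds while keeping
# the genre proportions roughly equal across folds.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
train["fold"] = -1
for fold, (_, valid_idx) in enumerate(skf.split(train, train["genre"])):
    train.loc[valid_idx, "fold"] = fold
```

Each model would then be trained five times, holding out one `fold` value for validation each time.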
Coming to my solution:
Things that worked:
- DeBERTa V3 Large + AWP + Mean Pooling + Cross Entropy Loss: 512 max length
- DeBERTa Large + AWP + Mean Pooling + Cross Entropy Loss: 512 max length
- DeBERTa XLarge + AWP + Mean Pooling + Cross Entropy Loss: 512 max length
The average of these models gave public LB: 0.44x and private LB: 0.43x.
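The averaging and idxmax step can be sketched roughly as follows. The genre names and probability values are made up for illustration, and I'm assuming each model outputs one probability column per genre.

```python
import pandas as pd

genres = ["action", "comedy", "drama"]  # hypothetical label set

# Hypothetical per-model class probabilities for two test rows.
probs_v3_large = pd.DataFrame([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]], columns=genres)
probs_large = pd.DataFrame([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]], columns=genres)
probs_xlarge = pd.DataFrame([[0.7, 0.2, 0.1], [0.2, 0.3, 0.5]], columns=genres)

# Simple unweighted average of the three models.
avg = (probs_v3_large + probs_large + probs_xlarge) / 3

# idxmax along the columns returns the genre name with the highest score per row.
pred = avg.idxmax(axis=1)
print(pred.tolist())  # ['action', 'drama']
```

Note that the second row ends up as drama even though one model preferred comedy, which is the kind of disagreement averaging is meant to smooth out.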
What didn't work:
- GPT2 medium: I trained it for variety, but it worsened the score.
- Using the pooled output in the forward pass didn't help much either.
- Preprocessing the text made the score even worse.
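For contrast with the pooled output, mean pooling as it is commonly implemented averages the token embeddings while ignoring padding. The sketch below uses NumPy only to show the arithmetic; the function name and the toy arrays are assumptions, and in an actual training script the same steps would be applied to the model's last hidden state tensor.

```python
import numpy as np

def mean_pool(last_hidden_state, attention_mask):
    """Average token embeddings over the sequence, skipping padded positions.

    last_hidden_state: (batch, seq_len, hidden) array
    attention_mask:    (batch, seq_len) array of 0/1
    """
    mask = attention_mask[..., None].astype(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(axis=1)   # (batch, hidden)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)    # avoid division by zero
    return summed / counts

# One sequence with two real tokens and one padding token.
hidden = np.array([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
mask = np.array([[1, 1, 0]])
pooled = mean_pool(hidden, mask)
print(pooled)  # [[2. 3.]]
```

The padded token's large values do not affect the result, because the mask zeroes them out before the sum.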
I used an A6000 for all the training, and I'll share the clean code soon.
Thanks once again and Happy Learning
Regards,
Aman Kapoor
Thank you so much for sharing, Aman!