metadata
language:
- en
metrics:
- accuracy
tags:
- sklearn
- machine learning
- movie-genre-prediction
- multi-class classification
Model Details
Model Description
The goal of the competition is to design a predictive model that accurately classifies movies into their respective genres based on their titles and synopses.
The model takes movie_name and synopsis, joined into a single string, as input and outputs the predicted genre of the movie.
- Developed by: [Shalaka Thorat]
- Shared by: [Data Driven Science - Movie Genre Prediction Contest: competitions/movie-genre-prediction]
- Language: [Python]
- Tags: [Python, NLP, Sklearn, NLTK, Machine Learning, Multi-class Classification, Supervised Learning]
Model Sources
- Repository: [competitions/movie-genre-prediction]
Training Details
We use the Multinomial Naive Bayes algorithm because it works well with sparse vectorized data, which here is built from movie_name and synopsis. The model outputs one genre class out of 10.
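The training setup described above can be sketched roughly as follows. This is a minimal illustration rather than the exact competition code: the toy rows, the variable names, and the use of default hyperparameters are all assumptions, and only three genres appear here instead of the 10 in the real data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy rows; the real data comes from the competition files.
movie_names = ["Space Raiders", "Love in Paris", "Haunted Manor", "Galaxy Wars"]
synopses = [
    "A crew of pilots fights aliens across the galaxy",
    "Two strangers fall in love during a summer in Paris",
    "A family moves into a house haunted by a ghost",
    "Rebel pilots battle an alien empire in space",
]
genres = ["scifi", "romance", "horror", "scifi"]

# Join title and synopsis into one string per movie.
texts = [f"{name} {synopsis}" for name, synopsis in zip(movie_names, synopses)]

# Encode genre names as integer class ids.
encoder = LabelEncoder()
y = encoder.fit_transform(genres)

# TF-IDF produces a sparse, non-negative matrix, which MultinomialNB accepts.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(texts, y)

# Predict a class id for an unseen string and map it back to a genre name.
predicted_ids = model.predict(["Alien pilots battle in space"])
predicted_genres = encoder.inverse_transform(predicted_ids)
print(predicted_genres)
```

Wrapping the vectorizer and classifier in one pipeline keeps the vocabulary learned at training time tied to the model, so the same transformation is applied at prediction time.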
Training Data
All training and test data can be found here:
[competitions/movie-genre-prediction]
Preprocessing
- Label Encoding
- Tokenization
- TF-IDF Vectorization
- Removal of digits, special characters, symbols, extra spaces and stop words from textual data
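The text-cleaning and tokenization steps listed above might look something like the sketch below. It uses only the standard library. The function name `clean_text`, the lowercasing step, and the tiny stop-word set are assumptions made for illustration; a fuller stop-word list, such as the one shipped with NLTK, would normally be used.

```python
import re

# Hypothetical, deliberately small stop-word set for illustration only.
STOP_WORDS = {"a", "an", "and", "in", "of", "the", "to"}

def clean_text(text):
    """Return the cleaned tokens of a title-plus-synopsis string."""
    text = text.lower()  # assumption: case is normalized before matching
    # Replace digits, special characters and symbols with spaces.
    text = re.sub(r"[^a-z\s]", " ", text)
    # Splitting on whitespace tokenizes and also collapses extra spaces.
    tokens = text.split()
    # Drop stop words.
    return [token for token in tokens if token not in STOP_WORDS]

tokens = clean_text("The  Matrix 2: Return of the #1 Hacker!!")
print(tokens)  # ['matrix', 'return', 'hacker']
```

The cleaned tokens can be joined back with spaces before being passed to a TF-IDF vectorizer.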
Evaluation
The evaluation metric used is [Accuracy] as specified in the competition.
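Accuracy is the fraction of movies whose predicted genre matches the true genre. A minimal sketch of the computation follows; the function name and the sample labels are hypothetical. `sklearn.metrics.accuracy_score` computes the same quantity.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")
    correct = sum(true == pred for true, pred in zip(y_true, y_pred))
    return correct / len(y_true)

# Hypothetical labels: three of the four predictions are correct.
y_true = ["action", "horror", "romance", "action"]
y_pred = ["action", "horror", "action", "action"]
print(accuracy(y_true, y_pred))  # 0.75
```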