metadata
language:
  - en
metrics:
  - accuracy
tags:
  - sklearn
  - machine learning
  - movie-genre-prediction
  - multi-class classification

Model Details

Model Description

The goal of the competition is to design a predictive model that accurately classifies movies into their respective genres based on their titles and synopses.

The model takes movie_name and synopsis, combined into a single string, as input and outputs the predicted genre of the movie.

  • Developed by: [Shalaka Thorat]
  • Shared by: [Data Driven Science - Movie Genre Prediction Contest: competitions/movie-genre-prediction]
  • Language: [Python]
  • Tags: [Python, NLP, Sklearn, NLTK, Machine Learning, Multi-class Classification, Supervised Learning]

Model Sources

  • Repository: [competitions/movie-genre-prediction]

Training Details

We use the Multinomial Naive Bayes algorithm because it works well with sparse vectorized data, here the TF-IDF features built from movie_name and synopsis. The model outputs a single genre class out of 10 classes.
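
A minimal sketch of this setup, assuming scikit-learn's `TfidfVectorizer` and `MultinomialNB` chained in a pipeline. The toy rows, the variable names, and the default hyperparameters are illustrative assumptions, not the actual training configuration or competition data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy rows; the real data comes from the competition dataset.
movie_names = ["Night Terror", "Love in Paris", "Steel Fist", "Cabin of Fear"]
synopses = [
    "A ghost haunts a family in an old house",
    "Two strangers fall in love during a summer trip",
    "A soldier fights a gang to rescue hostages",
    "Friends are hunted by a monster in the woods",
]
genres = ["horror", "romance", "action", "horror"]

# Combine title and synopsis into one string per movie.
texts = [f"{name} {synopsis}" for name, synopsis in zip(movie_names, synopses)]

# TF-IDF yields a sparse matrix, which Multinomial Naive Bayes accepts directly.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(texts, genres)

# Predict the genre of an unseen title + synopsis string.
predicted = model.predict(["Haunted Manor A ghost terrifies a family"])
print(predicted[0])
```

With the real data, `genres` would hold one of the 10 competition classes per row instead of the three toy labels used here.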

Training Data

All the Training and Test Data can be found here:

[competitions/movie-genre-prediction]

Preprocessing

  1. Label Encoding
  2. Tokenization
  3. TF-IDF Vectorization
  4. Removal of digits, special characters, symbols, extra spaces and stop words from the text
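
The cleaning, tokenization and label-encoding steps can be sketched roughly as follows. This is an illustration under assumptions: the short stop-word set stands in for a full list (such as the one NLTK provides), lowercasing is added for convenience, and `clean_text` and `encode_labels` are hypothetical helper names. TF-IDF vectorization is not shown here.

```python
import re

# Tiny illustrative stop-word set (an assumption); a real list is much larger.
STOP_WORDS = {"a", "an", "the", "and", "of", "in", "to", "is"}

def clean_text(text: str) -> str:
    """Remove digits, symbols, extra spaces and stop words from a string."""
    text = text.lower()                    # lowercasing is an assumption
    text = re.sub(r"[^a-z\s]", " ", text)  # replace digits and symbols with spaces
    tokens = text.split()                  # whitespace tokenization, collapses extra spaces
    tokens = [t for t in tokens if t not in STOP_WORDS]
    return " ".join(tokens)

def encode_labels(genres):
    """Map each genre string to an integer id, with classes in sorted order."""
    classes = sorted(set(genres))
    mapping = {genre: idx for idx, genre in enumerate(classes)}
    return [mapping[g] for g in genres], mapping

cleaned = clean_text("The  Matrix 2: Return of the #Machines!")
label_ids, label_map = encode_labels(["horror", "action", "horror", "romance"])
print(cleaned)    # matrix return machines
print(label_ids)  # [1, 0, 1, 2]
```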

Evaluation

The evaluation metric used is [Accuracy] as specified in the competition.
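
Accuracy is the fraction of movies whose predicted genre matches the true genre. A small sketch with made-up labels, where `accuracy` is a hypothetical helper:

```python
def accuracy(y_true, y_pred):
    """Return the share of predictions that exactly match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Made-up labels for illustration only.
y_true = ["action", "horror", "romance", "action"]
y_pred = ["action", "horror", "action", "action"]

score = accuracy(y_true, y_pred)
print(score)  # 0.75
```

scikit-learn's `sklearn.metrics.accuracy_score` computes the same quantity for label arrays.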