IMDB Decision Tree rating predictor

Problem statement

We aim to improve the Netflix recommender using the IMDB Top 250. We want to find out which publicly available IMDB features help predict a movie's IMDB rating. Does the actor, director, or genre have a significant impact on the rating? Do public reviews reflect the rating?

Collecting the data

The data was collected with a combination of web scraping using Beautiful Soup, www.omdbapi.com, and IMDB Pie.

First I pulled the basic information about each movie from omdbapi, then went to IMDB to collect the movie's financial information. IMDB Pie was useful for collecting viewer reviews.
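
The first step, pulling basic movie information from omdbapi, might look roughly like the sketch below. It is not the project's actual code: the function names are hypothetical, the API key is a placeholder you would supply yourself, and the field names (Title, imdbRating, Genre, Director, Actors) are assumed to match the JSON that omdbapi returns.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

OMDB_URL = "http://www.omdbapi.com/"


def build_omdb_url(imdb_id, api_key):
    """Build the lookup URL for one movie (api_key is a placeholder you supply)."""
    return OMDB_URL + "?" + urlencode({"i": imdb_id, "apikey": api_key})


def parse_movie(record):
    """Keep only the fields used later, turning comma-separated strings into lists."""

    def split_list(value):
        # Missing values are assumed to arrive as "N/A" or be absent entirely.
        if not value or value == "N/A":
            return []
        return [item.strip() for item in value.split(",")]

    rating = record.get("imdbRating")
    return {
        "title": record.get("Title"),
        "rating": float(rating) if rating not in (None, "N/A") else None,
        "genres": split_list(record.get("Genre")),
        "director": record.get("Director"),
        "actors": split_list(record.get("Actors")),
    }


def fetch_movie(imdb_id, api_key):
    """Download one movie record and reduce it to the fields above."""
    with urlopen(build_omdb_url(imdb_id, api_key)) as response:
        return parse_movie(json.load(response))
```

Keeping the parsing separate from the network call means the cleaning logic can be checked on a saved response without hitting the API again.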

Data munging

Create dummy (0/1 indicator) variables based on actor and genre, so that each actor and each genre becomes its own column.
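
To make the dummy step concrete, here is a minimal sketch in plain Python. It assumes each movie is a dict whose genre and actor fields are already lists of strings; the function name, the field names, and the column prefix are all hypothetical. In practice pandas offers the same encoding through get_dummies, which is probably the more convenient route on a real data frame.

```python
def make_dummies(movies, field, prefix):
    """Return one dict of 0/1 indicator columns per movie for a list-valued field."""
    # Collect every distinct value first so all rows share the same columns.
    categories = sorted({value for movie in movies for value in movie[field]})
    rows = []
    for movie in movies:
        present = set(movie[field])
        # 1 if the movie has this value (e.g. this genre), 0 otherwise.
        rows.append({f"{prefix}_{c}": int(c in present) for c in categories})
    return rows
```

Calling it once with the genre field and once with the actor field gives two sets of columns that can be joined to the rating before fitting the decision tree. With a cast list the number of actor columns grows quickly, so it may be worth keeping only actors who appear in several movies.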