In October 2015, a data journalist named Walt Hickey analyzed movie ratings on Fandango, a website that sells movie tickets and lets users rate and review movies. He found that the ratings Fandango displays to users are far from fair (the details of his analysis can be found in this article):
Fandango's rating scale runs from 0 to 5 stars in half-star steps: users rate a movie on this scale, and they are shown the movie's score when they look up its reviews. However, Walt Hickey found that the score Fandango displays is always rounded up to the next half-star. A movie with an average score of 4.1 would therefore be shown as 4.5 stars instead of the fairer 4.0 stars.
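To make the difference between the two rounding rules concrete, here is a minimal sketch. The function names are hypothetical and are not taken from Hickey's analysis or from Fandango; the sketch only contrasts rounding up to the next half-star with rounding to the nearest half-star.

```python
import math


def round_up_half_star(score: float) -> float:
    """Round a score up to the next half-star (the behavior described above)."""
    return math.ceil(score * 2) / 2


def round_nearest_half_star(score: float) -> float:
    """Round a score to the nearest half-star (the fairer alternative)."""
    return math.floor(score * 2 + 0.5) / 2


# Compare both rules on a few illustrative averages
for score in (4.1, 4.5, 3.76):
    print(score, round_up_half_star(score), round_nearest_half_star(score))
```

For an average of 4.1, the first rule yields 4.5 stars while the second yields 4.0, which matches the example in the text.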
The following plot shows how the displayed score distribution is clearly displaced to the right compared to the actual score distribution:
The final response from Fandango was that this issue was caused by a bug, and that they would fix it as soon as possible.
The goal of this project is to evaluate whether Fandango has corrected its system for presenting movie scores.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Reading the data
scores15 = pd.read_csv("fandango_score_comparison.csv")
scores1617 = pd.read_csv("movie_ratings_16_17.csv")
# Showing basic info
display("Scores from 2015 shape and descriptives:", scores15.shape, scores15.columns)
display("Scores from 2016/17 shape and descriptives:", scores1617.shape, scores1617.columns)
'Scores from 2015 shape and descriptives:'
(146, 22)
Index(['FILM', 'RottenTomatoes', 'RottenTomatoes_User', 'Metacritic', 'Metacritic_User', 'IMDB', 'Fandango_Stars', 'Fandango_Ratingvalue', 'RT_norm', 'RT_user_norm', 'Metacritic_norm', 'Metacritic_user_nom', 'IMDB_norm', 'RT_norm_round', 'RT_user_norm_round', 'Metacritic_norm_round', 'Metacritic_user_norm_round', 'IMDB_norm_round', 'Metacritic_user_vote_count', 'IMDB_user_vote_count', 'Fandango_votes', 'Fandango_Difference'], dtype='object')
'Scores from 2016/17 shape and descriptives:'
(214, 15)
Index(['movie', 'year', 'metascore', 'imdb', 'tmeter', 'audience', 'fandango', 'n_metascore', 'n_imdb', 'n_tmeter', 'n_audience', 'nr_metascore', 'nr_imdb', 'nr_tmeter', 'nr_audience'], dtype='object')
As we're not interested in every column from these datasets, we'll keep only those related to Fandango's website. That is, from the original 2015 dataset we'll keep the following columns: FILM, Fandango_Stars, Fandango_Ratingvalue, Fandango_votes and Fandango_Difference.
Then, from the 2016/17 dataset we'll keep the following columns: movie, year and fandango.
# Keeping the columns we need
fandango15 = scores15[["FILM", "Fandango_Stars", "Fandango_Ratingvalue", "Fandango_votes", "Fandango_Difference"]].copy()
fandango1617 = scores1617[["movie", "year", "fandango"]]
# Showing basic info
display("Scores from Fandango 2015 shape and descriptives:", fandango15.shape, fandango15.describe(), fandango15.head())
display("Scores from Fandango 2016/17 shape and descriptives:", fandango1617.shape, fandango1617.describe(), fandango1617.head())
'Scores from Fandango 2015 shape and descriptives:'
(146, 5)
| | Fandango_Stars | Fandango_Ratingvalue | Fandango_votes | Fandango_Difference |
|---|---|---|---|---|
| count | 146.000000 | 146.000000 | 146.000000 | 146.000000 |
| mean | 4.089041 | 3.845205 | 3848.787671 | 0.243836 |
| std | 0.540386 | 0.502831 | 6357.778617 | 0.152665 |
| min | 3.000000 | 2.700000 | 35.000000 | 0.000000 |
| 25% | 3.500000 | 3.500000 | 222.250000 | 0.100000 |
| 50% | 4.000000 | 3.900000 | 1446.000000 | 0.200000 |
| 75% | 4.500000 | 4.200000 | 4439.500000 | 0.400000 |
| max | 5.000000 | 4.800000 | 34846.000000 | 0.500000 |
| | FILM | Fandango_Stars | Fandango_Ratingvalue | Fandango_votes | Fandango_Difference |
|---|---|---|---|---|---|
| 0 | Avengers: Age of Ultron (2015) | 5.0 | 4.5 | 14846 | 0.5 |
| 1 | Cinderella (2015) | 5.0 | 4.5 | 12640 | 0.5 |
| 2 | Ant-Man (2015) | 5.0 | 4.5 | 12055 | 0.5 |
| 3 | Do You Believe? (2015) | 5.0 | 4.5 | 1793 | 0.5 |
| 4 | Hot Tub Time Machine 2 (2015) | 3.5 | 3.0 | 1021 | 0.5 |
'Scores from Fandango 2016/17 shape and descriptives:'
(214, 3)
| | year | fandango |
|---|---|---|
| count | 214.000000 | 214.000000 |
| mean | 2016.107477 | 3.894860 |
| std | 0.310444 | 0.516781 |
| min | 2016.000000 | 2.500000 |
| 25% | 2016.000000 | 3.500000 |
| 50% | 2016.000000 | 4.000000 |
| 75% | 2016.000000 | 4.500000 |
| max | 2017.000000 | 5.000000 |
| | movie | year | fandango |
|---|---|---|---|
| 0 | 10 Cloverfield Lane | 2016 | 3.5 |
| 1 | 13 Hours | 2016 | 4.5 |
| 2 | A Cure for Wellness | 2016 | 3.0 |
| 3 | A Dog's Purpose | 2017 | 4.5 |
| 4 | A Hologram for the King | 2016 | 3.0 |
Reading the README.md files from each dataset, we see how each dataset was obtained:
That is, the 2016/17 dataset may not be ideal for our purposes: it includes only the most popular movies, and the ratings of the 2017 movies could still change significantly as more reviews come in.
Therefore, we'll refine our goal: we'll check whether Fandango's rating system has improved by looking at the most popular movies from 2016.
The term "popular" is vague and we need to define it with precision before continuing. We'll use Hickey's benchmark of 30 fan ratings and consider a movie as "popular" only if it has 30 fan ratings or more on Fandango's website.
We know that the 2015 dataset contains 146 rows, so we can check how many of those movies are non-popular:
# Checking for popularity in the 2015 dataset: count movies with fewer than 30 fan ratings
display("Number of non-popular movies in 2015 dataset: ", (fandango15["Fandango_votes"] < 30).sum())
'Number of non-popular movies in 2015 dataset: '
0
However, the 2016/17 dataset doesn't contain any information on the number of votes. Therefore, we'll draw a random sample of its movies and check their number of fan ratings on Fandango's website:
display(fandango1617["movie"].sample(10, random_state=0))
197    The Take (Bastille Day)
37     Come and Find Me
89     Kickboxer
176    The Founder
170    The Darkness
75     Ice Age: Collision Course
96     Lion
137    Ride Along 2
5      A Monster Calls
83     Jane Got a Gun
Name: movie, dtype: object
We intended to look up each of these sampled films on Fandango's website to see how many user ratings it has. However, it seems that Fandango has changed its rating system, and it currently shows Rotten Tomatoes scores:
This could actually end our project here, as we have found that Fandango has changed its whole system to show Rotten Tomatoes scores. However, for the sake of learning, we'll continue as if the movies were popular enough.
We'll continue by keeping only movies released in 2015 and 2016 from the 2015 and 2016/17 datasets, respectively. The former still needs a "year" column, while the latter already has one, so filtering it is more straightforward:
# Creating a "year" column for the dataset from 2015
fandango15["year"] = fandango15["FILM"].str.extract(r"\(([0-9]{4})\)", expand=False).astype(int)
# Keeping only movies from 2015
fandango15 = fandango15[fandango15["year"] == 2015].copy().reset_index()
# Display results with new column
display(fandango15.head())
# Keeping only movies from 2016
fandango16 = fandango1617[fandango1617["year"] == 2016].copy().reset_index()
| | index | FILM | Fandango_Stars | Fandango_Ratingvalue | Fandango_votes | Fandango_Difference | year |
|---|---|---|---|---|---|---|---|
| 0 | 0 | Avengers: Age of Ultron (2015) | 5.0 | 4.5 | 14846 | 0.5 | 2015 |
| 1 | 1 | Cinderella (2015) | 5.0 | 4.5 | 12640 | 0.5 | 2015 |
| 2 | 2 | Ant-Man (2015) | 5.0 | 4.5 | 12055 | 0.5 | 2015 |
| 3 | 3 | Do You Believe? (2015) | 5.0 | 4.5 | 1793 | 0.5 | 2015 |
| 4 | 4 | Hot Tub Time Machine 2 (2015) | 3.5 | 3.0 | 1021 | 0.5 | 2015 |
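The year extraction relies on a regular expression that captures four digits between parentheses. The sketch below applies the same pattern with Python's standard `re` module, outside pandas, to show what it matches. The helper name `extract_year` and the last title in the list are made up for illustration.

```python
import re

# Same pattern as in the str.extract call: four digits inside parentheses
YEAR_PATTERN = re.compile(r"\(([0-9]{4})\)")


def extract_year(title):
    """Return the release year found in a title, or None if there is none."""
    match = YEAR_PATTERN.search(title)
    return int(match.group(1)) if match else None


# The last title is a made-up example without a year
titles = ["Cinderella (2015)", "Ant-Man (2015)", "Untitled Film"]
years = [extract_year(title) for title in titles]
print(years)
```

In the pandas version, a title without a year would produce a missing value, and the subsequent `.astype(int)` would raise an error, so the approach assumes every title in the 2015 dataset ends with its year in parentheses.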
Now that we have both datasets prepared as we need them, we'll continue by analyzing their differences, so we can conclude whether Fandango solved their rating issue.
Let's begin by comparing the distribution of the presented scores in 2015 and 2016:
# Importing the FiveThirtyEight style
plt.style.use('fivethirtyeight')
# Size of the plot
plt.figure(figsize=(8,5))
# Data to be plotted
fandango15["Fandango_Stars"].plot.kde(ind=np.arange(0,5.5,0.5))
fandango16["fandango"].plot.kde(ind=np.arange(0,5.5,0.5))
# Labels and titles adjustments
plt.xticks(ticks=np.arange(0,5.5,0.5), labels=np.arange(0,5.5,0.5))
plt.title("Fandango reviews comparison", fontsize=20)
plt.xlabel("Stars")
plt.ylabel("Density proportion")
plt.legend(labels=[2015, 2016])
plt.show()
The plot above shows two interesting points:
Therefore, as the distribution of the 2016 scores is shifted to the left, much as the computed scores from 2015 were relative to the displayed ones, we can conclude that Fandango corrected the main issue discussed throughout this project.
We'll look at the frequency tables to confirm this hypothesis:
freq_dist15 = round(fandango15["Fandango_Stars"].value_counts(normalize=True).sort_index() * 100, 1)
freq_dist16 = round(fandango16["fandango"].value_counts(normalize=True).sort_index() * 100, 1)
display("Frequency distribution table for 2015 dataset:", freq_dist15)
display("Frequency distribution table for 2016 dataset:", freq_dist16)
'Frequency distribution table for 2015 dataset:'
3.0     8.5
3.5    17.8
4.0    28.7
4.5    38.0
5.0     7.0
Name: Fandango_Stars, dtype: float64
'Frequency distribution table for 2016 dataset:'
2.5     3.1
3.0     7.3
3.5    24.1
4.0    40.3
4.5    24.6
5.0     0.5
Name: fandango, dtype: float64
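One way to summarize the two frequency tables is to add up the share of movies rated 4.5 stars or higher in each year. The sketch below copies the percentages from the outputs above into plain dictionaries; the helper `share_at_or_above` and the 4.5-star threshold are assumptions chosen for illustration, not part of the original analysis.

```python
# Percentages copied from the frequency tables above (stars -> % of movies)
freq_2015 = {3.0: 8.5, 3.5: 17.8, 4.0: 28.7, 4.5: 38.0, 5.0: 7.0}
freq_2016 = {2.5: 3.1, 3.0: 7.3, 3.5: 24.1, 4.0: 40.3, 4.5: 24.6, 5.0: 0.5}


def share_at_or_above(freq, threshold):
    """Sum the percentages of all ratings at or above the threshold."""
    total = sum(pct for stars, pct in freq.items() if stars >= threshold)
    return round(total, 1)


high_2015 = share_at_or_above(freq_2015, 4.5)
high_2016 = share_at_or_above(freq_2016, 4.5)
print(f"4.5 stars or more: {high_2015}% in 2015 vs {high_2016}% in 2016")
```

By this measure, 45.0% of the 2015 movies were shown with 4.5 stars or more, against 25.1% of the 2016 movies.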
Again, the frequency distribution tables support our hypothesis that Fandango corrected the way it presents scores:
We'll continue the analysis of this difference by comparing the mean, median and mode of each dataset:
# Generate a dataframe with the statistics we need
statistics = {
    "year": [2015, 2015, 2015, 2016, 2016, 2016],
    "statistic": ["mean", "median", "mode", "mean", "median", "mode"],
    "value": [
        fandango15["Fandango_Stars"].mean(),
        fandango15["Fandango_Stars"].median(),
        fandango15["Fandango_Stars"].mode()[0],
        fandango16["fandango"].mean(),
        fandango16["fandango"].median(),
        fandango16["fandango"].mode()[0],
    ],
}
stats = pd.DataFrame(data=statistics)
display(stats)
# Plot in a grouped bar plot these statistics
sns.catplot(kind="bar", x="statistic", hue="year", y="value", data=stats)
plt.title("Comparing summary statistics: 2015 vs 2016")
plt.yticks(ticks=np.arange(0,5.5,0.5), labels=np.arange(0,5.5,0.5))
plt.ylabel("Stars")
plt.xlabel("")
plt.show()
| | year | statistic | value |
|---|---|---|---|
| 0 | 2015 | mean | 4.085271 |
| 1 | 2015 | median | 4.000000 |
| 2 | 2015 | mode | 4.500000 |
| 3 | 2016 | mean | 3.887435 |
| 4 | 2016 | median | 4.000000 |
| 5 | 2016 | mode | 4.000000 |
The summary statistics above show:
Therefore, these statistics support our hypothesis that Fandango corrected its problem with the movie scores.
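To put a number on the shift in the mean, the following sketch computes the absolute and relative drop from the two mean values reported in the table above. The variable names are illustrative, and the values are copied from the output rather than recomputed from the data.

```python
# Mean displayed ratings copied from the summary table above
mean_2015 = 4.085271
mean_2016 = 3.887435

# Absolute drop in stars and drop relative to the 2015 mean
absolute_drop = mean_2015 - mean_2016
relative_drop_pct = absolute_drop / mean_2015 * 100

print(f"Absolute drop: {absolute_drop:.3f} stars")
print(f"Relative drop: {relative_drop_pct:.1f}%")
```

The mean falls by about 0.2 stars, which is just under 5% of the 2015 mean.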
The analysis conducted in this project showed that Fandango's movie ratings in 2016 were slightly but consistently lower than in 2015.
Therefore, we can conclude that Fandango properly addressed its issue with the scores and made them fairer.