Investigating Fandango Movie Ratings¶

In October 2015, a data journalist named Walt Hickey analyzed movie ratings on Fandango, a website that sells movie tickets and lets users rate movies. Hickey found that the ratings Fandango shows to users are far from fair (the details of his analysis can be found in this article):

Fandango's rating scale runs from 0 to 5 stars in half-star steps: users rate a movie on this scale, and they are shown the movie's score when they look at its page. However, Walt Hickey found that the displayed score is always rounded up to the next half-star rather than to the nearest one. A movie with an average score of 4.1 would therefore be shown with 4.5 stars instead of the fairer 4.0 stars.
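The two rounding rules contrasted above can be sketched in a few lines. This is only an illustration of the arithmetic, not Fandango's actual code; the function names `round_up_half_star` and `round_nearest_half_star` are hypothetical.

```python
import math

def round_up_half_star(score):
    """Always round up to the next half-star (the behavior described above)."""
    return math.ceil(score * 2) / 2

def round_nearest_half_star(score):
    """Round to the nearest half-star (the fairer alternative).

    Note: Python's round() sends exact ties to the even neighbor.
    """
    return round(score * 2) / 2

for score in (4.1, 3.8, 4.5):
    print(score, round_up_half_star(score), round_nearest_half_star(score))
```

For the 4.1 example from the text, the first rule gives 4.5 stars and the second gives 4.0 stars.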

The following plot shows how the displayed score distribution is clearly displaced to the right compared to the actual score distribution:

The final response from Fandango was that this issue was caused by a bug, and that they would fix it as soon as possible.

The goal of this project is to evaluate whether Fandango has corrected its system for presenting movie scores.

1) Getting the data¶

Walt Hickey made the data he analyzed publicly available on GitHub. We'll use the data he collected to analyze the characteristics of Fandango's rating system prior to his analysis.
One of Dataquest's team members collected movie ratings data for movies released in 2016 and 2017. The data is publicly available on GitHub and we'll use it to analyze the rating system's characteristics after Hickey's analysis.

In [1]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [2]:

# Reading the data
scores15 = pd.read_csv("fandango_score_comparison.csv")
scores1617 = pd.read_csv("movie_ratings_16_17.csv")

# Showing basic info
display("Scores from 2015 shape and descriptives:", scores15.shape, scores15.columns)
display("Scores from 2016/17 shape and descriptives:", scores1617.shape, scores1617.columns)

'Scores from 2015 shape and descriptives:'

(146, 22)

Index(['FILM', 'RottenTomatoes', 'RottenTomatoes_User', 'Metacritic',
       'Metacritic_User', 'IMDB', 'Fandango_Stars', 'Fandango_Ratingvalue',
       'RT_norm', 'RT_user_norm', 'Metacritic_norm', 'Metacritic_user_nom',
       'IMDB_norm', 'RT_norm_round', 'RT_user_norm_round',
       'Metacritic_norm_round', 'Metacritic_user_norm_round',
       'IMDB_norm_round', 'Metacritic_user_vote_count', 'IMDB_user_vote_count',
       'Fandango_votes', 'Fandango_Difference'],
      dtype='object')

'Scores from 2016/17 shape and descriptives:'

(214, 15)

Index(['movie', 'year', 'metascore', 'imdb', 'tmeter', 'audience', 'fandango',
       'n_metascore', 'n_imdb', 'n_tmeter', 'n_audience', 'nr_metascore',
       'nr_imdb', 'nr_tmeter', 'nr_audience'],
      dtype='object')

2) Data cleaning¶

2.1) Columns of interest¶

As we're not interested in every column from these datasets, we'll keep only those related to Fandango's website. That is, we'll keep the following columns from the original 2015 dataset:

FILM: The film in question
Fandango_Stars: The number of stars the film had on its Fandango movie page
Fandango_Ratingvalue: The Fandango ratingValue for the film, as pulled from the HTML of each page. This is the actual average score the movie obtained.
Fandango_votes: The number of user votes the film had on Fandango
Fandango_Difference: The difference between the presented Fandango_Stars and the actual Fandango_Ratingvalue

Then, from the 2016/17 dataset we'll keep the following columns:

movie: The name of the movie
year: The release year of the movie
fandango: The Fandango rating of the movie (user score)
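The definition of Fandango_Difference given above can be checked directly. The sketch below uses two rows copied from the first rows of the 2015 dataset (displayed further below) rather than the full CSV, and the names `sample`, `recomputed` and `matches` are only illustrative.

```python
import numpy as np
import pandas as pd

# Two rows copied from the 2015 dataset preview in this notebook
sample = pd.DataFrame({
    "FILM": ["Avengers: Age of Ultron (2015)", "Hot Tub Time Machine 2 (2015)"],
    "Fandango_Stars": [5.0, 3.5],
    "Fandango_Ratingvalue": [4.5, 3.0],
    "Fandango_Difference": [0.5, 0.5],
})

# Displayed stars minus the actual average rating
recomputed = sample["Fandango_Stars"] - sample["Fandango_Ratingvalue"]

# Compare with a tolerance, since these are floating-point values
matches = np.isclose(recomputed, sample["Fandango_Difference"])
print(matches.all())
```

The same comparison could be run on the full `fandango15` dataframe once it has been created.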

In [3]:

# Keeping the columns we need
fandango15 = scores15[["FILM", "Fandango_Stars", "Fandango_Ratingvalue", "Fandango_votes", "Fandango_Difference"]].copy()
fandango1617 = scores1617[["movie", "year", "fandango"]].copy()

# Showing basic info
display("Scores from Fandango 2015 shape and descriptives:", fandango15.shape, fandango15.describe(), fandango15.head())
display("Scores from Fandango 2016/17 shape and descriptives:", fandango1617.shape, fandango1617.describe(), fandango1617.head())

'Scores from Fandango 2015 shape and descriptives:'

(146, 5)

	Fandango_Stars	Fandango_Ratingvalue	Fandango_votes	Fandango_Difference
count	146.000000	146.000000	146.000000	146.000000
mean	4.089041	3.845205	3848.787671	0.243836
std	0.540386	0.502831	6357.778617	0.152665
min	3.000000	2.700000	35.000000	0.000000
25%	3.500000	3.500000	222.250000	0.100000
50%	4.000000	3.900000	1446.000000	0.200000
75%	4.500000	4.200000	4439.500000	0.400000
max	5.000000	4.800000	34846.000000	0.500000

	FILM	Fandango_Stars	Fandango_Ratingvalue	Fandango_votes	Fandango_Difference
0	Avengers: Age of Ultron (2015)	5.0	4.5	14846	0.5
1	Cinderella (2015)	5.0	4.5	12640	0.5
2	Ant-Man (2015)	5.0	4.5	12055	0.5
3	Do You Believe? (2015)	5.0	4.5	1793	0.5
4	Hot Tub Time Machine 2 (2015)	3.5	3.0	1021	0.5

'Scores from Fandango 2016/17 shape and descriptives:'

(214, 3)

	year	fandango
count	214.000000	214.000000
mean	2016.107477	3.894860
std	0.310444	0.516781
min	2016.000000	2.500000
25%	2016.000000	3.500000
50%	2016.000000	4.000000
75%	2016.000000	4.500000
max	2017.000000	5.000000

	movie	year	fandango
0	10 Cloverfield Lane	2016	3.5
1	13 Hours	2016	4.5
2	A Cure for Wellness	2016	3.0
3	A Dog's Purpose	2017	4.5
4	A Hologram for the King	2016	3.0

* Updating the project goal¶

Reading the README.md files from each dataset, we see how each dataset was obtained:

2015 dataset: fandango_score_comparison.csv contains every film that has a Rotten Tomatoes rating, a RT User rating, a Metacritic score, a Metacritic User score, an IMDb score, and at least 30 fan reviews on Fandango. The data from Fandango was pulled on Aug. 24, 2015.
2016/17 dataset: movie_ratings_16_17.csv contains movie ratings data for 214 of the most popular movies (with a significant number of votes) released in 2016 and 2017. As of March 22, 2017, the ratings were up to date. Significant changes should be expected mostly for movies released in 2017.

That is, the 2016/17 dataset is not a random sample of all movies: it includes only the most popular ones, and the ratings of the movies released in 2017 could still change significantly after the collection date.

Therefore, we'll update our goal: we'll check whether Fandango's rating system improved by comparing popular movies from 2015 with popular movies from 2016.

2.2) Popularity of the movies¶

The term "popular" is vague and we need to define it with precision before continuing. We'll use Hickey's benchmark of 30 fan ratings and consider a movie as "popular" only if it has 30 fan ratings or more on Fandango's website.

We know that the 2015 dataset contains 146 rows, so we can check how many of those movies are non-popular:

In [4]:

# Checking for popularity in the 2015 dataset: count the movies with fewer than 30 votes
display("Number of non-popular movies in 2015 dataset: ", int((fandango15["Fandango_votes"] < 30).sum()))

'Number of non-popular movies in 2015 dataset: '

0

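Every movie in the 2015 dataset meets the threshold (the minimum number of votes in the descriptive statistics above is 35). However, the 2016/17 dataset doesn't contain any information on the number of votes. Therefore, we'll draw a random sample of those movies and check their number of reviews on Fandango's website: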

In [5]:

display(fandango1617["movie"].sample(10, random_state=0))

197      The Take (Bastille Day)
37              Come and Find Me
89                     Kickboxer
176                  The Founder
170                 The Darkness
75     Ice Age: Collision Course
96                          Lion
137                 Ride Along 2
5                A Monster Calls
83                Jane Got a Gun
Name: movie, dtype: object

We would have looked up each of these sampled films on Fandango's website to check its number of user reviews. However, it seems that Fandango has changed its rating system, and it currently shows Rotten Tomatoes scores:

This could actually end our project here, as we have found that Fandango has changed its whole system to show Rotten Tomatoes scores. However, for the sake of learning, we'll continue as if the movies were popular enough.

2.3) Release years¶

We'll continue by keeping only movies released in 2015 and 2016 from the 2015 and 2016/17 datasets, respectively. The former still needs a "year" column, while the latter already has one:

In [6]:

# Creating a "year" column for the dataset from 2015
fandango15["year"] = fandango15["FILM"].str.extract(r"\(([0-9]{4})\)", expand=False).astype(int)

# Keeping only movies from 2015
fandango15 = fandango15[fandango15["year"] == 2015].copy().reset_index()

# Display results with new column
display(fandango15.head())



# Keeping only movies from 2016
fandango16 = fandango1617[fandango1617["year"] == 2016].copy().reset_index()

	index	FILM	Fandango_Stars	Fandango_Ratingvalue	Fandango_votes	Fandango_Difference	year
0	0	Avengers: Age of Ultron (2015)	5.0	4.5	14846	0.5	2015
1	1	Cinderella (2015)	5.0	4.5	12640	0.5	2015
2	2	Ant-Man (2015)	5.0	4.5	12055	0.5	2015
3	3	Do You Believe? (2015)	5.0	4.5	1793	0.5	2015
4	4	Hot Tub Time Machine 2 (2015)	3.5	3.0	1021	0.5	2015
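To make the year extraction above easier to follow, here is a minimal sketch on a toy Series. The first two titles come from the 2015 dataset, while "Hypothetical Title (2014)" is made up for illustration and is not claimed to be in the data.

```python
import pandas as pd

titles = pd.Series([
    "Cinderella (2015)",
    "Ant-Man (2015)",
    "Hypothetical Title (2014)",  # made-up title, for illustration only
])

# Capture the four digits between parentheses; expand=False returns a Series
years = titles.str.extract(r"\((\d{4})\)", expand=False).astype(int)

# Keep only the titles released in 2015
kept = titles[years == 2015]
print(years.tolist())
print(kept.tolist())
```

Only the two titles tagged with 2015 survive the filter, which is the same logic applied to `fandango15` above.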

3) Data analysis¶

Now that we have both datasets prepared as we need them, we'll continue by analyzing their differences, so we can assess whether Fandango solved its rating issue.

3.1) Proportions distributions comparison¶

Let's begin by comparing the distribution of the presented scores in 2015 and 2016:

In [7]:

# Importing the FiveThirtyEight style
plt.style.use('fivethirtyeight')

# Size of the plot
plt.figure(figsize=(8,5))

# Data to be plotted
fandango15["Fandango_Stars"].plot.kde(ind=np.arange(0,5.5,0.5))
fandango16["fandango"].plot.kde(ind=np.arange(0,5.5,0.5))

# Labels and titles adjustments
plt.xticks(ticks=np.arange(0,5.5,0.5), labels=np.arange(0,5.5,0.5))
plt.title("Fandango reviews comparison", fontsize=20)
plt.xlabel("Stars")
plt.ylabel("Density proportion")
plt.legend(labels=[2015, 2016])

plt.show()

The plot above shows two interesting points:

Both distributions are single-peaked and left-skewed, with the peak at their most frequent value (4.5 for 2015, and 4.0 for 2016).
The distribution from the 2016 dataset is shifted to the left of the 2015 one, resembling what Walt Hickey's original analysis showed using only data from 2015 (presented vs. actual computed score), shown again below:

Since the distribution of the 2016 scores is shifted to the left, much as the computed scores from 2015 were, we have a first indication that Fandango corrected the main issue discussed in this project.

3.2) Frequency distributions comparison¶

We'll look at the frequency tables to confirm this hypothesis:

In [8]:

freq_dist15 = round(fandango15["Fandango_Stars"].value_counts(normalize=True).sort_index() * 100, 1)
freq_dist16 = round(fandango16["fandango"].value_counts(normalize=True).sort_index() * 100, 1)

display("Frequency distribution table for 2015 dataset:", freq_dist15)
display("Frequency distribution table for 2016 dataset:", freq_dist16)

'Frequency distribution table for 2015 dataset:'

3.0     8.5
3.5    17.8
4.0    28.7
4.5    38.0
5.0     7.0
Name: Fandango_Stars, dtype: float64

'Frequency distribution table for 2016 dataset:'

2.5     3.1
3.0     7.3
3.5    24.1
4.0    40.3
4.5    24.6
5.0     0.5
Name: fandango, dtype: float64

Again, the frequency distribution tables support our hypothesis that Fandango corrected its system for presenting scores:

3.1% of the movies from 2016 had a 2.5 score, while none had this score in 2015.
Percentages for scores of 4.0 or lower are mostly higher in the 2016 dataset (3.0 is the exception: 7.3% vs 8.5%), while percentages for scores above 4.0 are lower in the 2016 dataset.
Only 0.5% of the movies from 2016 had a 5.0 score, while 7.0% had this score in 2015.
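The comparison in the list above can also be read off as a single table of percentage-point changes. The sketch below re-enters the two frequency tables by hand (values copied from the outputs above) and aligns them on a common index, filling the score missing from 2015 with 0; the variable names are illustrative.

```python
import pandas as pd

# Percentages copied from the frequency tables above
freq_2015 = pd.Series({3.0: 8.5, 3.5: 17.8, 4.0: 28.7, 4.5: 38.0, 5.0: 7.0})
freq_2016 = pd.Series({2.5: 3.1, 3.0: 7.3, 3.5: 24.1, 4.0: 40.3, 4.5: 24.6, 5.0: 0.5})

# Align both tables on every score that appears in either year
all_scores = freq_2015.index.union(freq_2016.index)
aligned_2015 = freq_2015.reindex(all_scores, fill_value=0)
aligned_2016 = freq_2016.reindex(all_scores, fill_value=0)

# Positive values mean the score was more common in 2016
change = (aligned_2016 - aligned_2015).round(1)
print(change)
```

The largest shifts are at 4.0 stars (up 11.6 percentage points) and 4.5 stars (down 13.4 percentage points).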

3.3) Summary statistics comparison¶

We'll complete the analysis of this difference by comparing the mean, median and mode of each dataset:

In [9]:

# Generate a dataframe with the statistics we need
statistics = {
    "year": [2015, 2015, 2015, 2016, 2016, 2016],
    "statistic": ["mean", "median", "mode", "mean", "median", "mode"],
    "value": [
        fandango15["Fandango_Stars"].mean(),
        fandango15["Fandango_Stars"].median(),
        fandango15["Fandango_Stars"].mode()[0],
        fandango16["fandango"].mean(),
        fandango16["fandango"].median(),
        fandango16["fandango"].mode()[0],
    ],
}

stats = pd.DataFrame(data=statistics)

display(stats)

# Plot in a grouped bar plot these statistics
sns.catplot(kind="bar", x="statistic", hue="year", y="value", data=stats)
plt.title("Comparing summary statistics: 2015 vs 2016")
plt.yticks(ticks=np.arange(0,5.5,0.5), labels=np.arange(0,5.5,0.5))
plt.ylabel("Stars")
plt.xlabel("")
plt.show()

	year	statistic	value
0	2015	mean	4.085271
1	2015	median	4.000000
2	2015	mode	4.500000
3	2016	mean	3.887435
4	2016	median	4.000000
5	2016	mode	4.000000

The summary statistics above show:

The mean stars score for 2015 was slightly higher than for 2016: 4.1 vs 3.9
The median stars score for both years was identical: 4.0
The mode stars score for 2015 was higher than for 2016: 4.5 vs 4.0

Therefore, these statistics support our hypothesis that Fandango corrected its problem with the movie scores.
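To put the change in the mean into perspective, the short calculation below expresses it both in stars and relative to the 2015 mean. The two means are copied from the summary table above; the variable names are illustrative.

```python
# Means copied from the summary statistics table above
mean_2015 = 4.085271
mean_2016 = 3.887435

# Absolute drop in stars and drop relative to the 2015 mean
drop = mean_2015 - mean_2016
relative_drop = drop / mean_2015 * 100

print(f"Absolute drop: {drop:.2f} stars")
print(f"Relative drop: {relative_drop:.1f}%")
```

The drop is roughly 0.2 stars, or just under 5% of the 2015 mean, which matches the description of the difference as slight.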

4) Conclusions¶

The analysis conducted in this project showed that Fandango ratings for popular movies released in 2016 were slightly lower than those for popular movies from 2015.

Therefore, we can conclude that Fandango most likely addressed its issue with the scores and made them fairer.