Investigating Fandango Movie Ratings

In October 2015, data journalist Walt Hickey analyzed movie ratings on Fandango, a website that sells movie tickets and lets users rate and review movies. Hickey found that the ratings Fandango displays to users are far from fair (the details of his analysis can be found in this article):

Fandango's rating scale runs from 0 to 5 stars in half-star steps: users rate a movie on this scale, and they are shown the movie's score when they look at its page. However, Hickey found that the displayed score is always rounded up to the next half-star. For example, a movie with an average score of 4.1 would be displayed with 4.5 stars instead of the fairer 4.0 stars.
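
The difference between the two rounding rules can be sketched in a few lines of plain Python. This is only an illustration of the arithmetic described above, not Fandango's actual code; the function names are hypothetical.

```python
import math


def round_up_to_half(score: float) -> float:
    """Round up to the next half-star (the behaviour Hickey reported)."""
    return math.ceil(score * 2) / 2


def round_to_nearest_half(score: float) -> float:
    """Round to the nearest half-star, with ties going up."""
    return math.floor(score * 2 + 0.5) / 2


average = 4.1
shown = round_up_to_half(average)      # what the site would display
fair = round_to_nearest_half(average)  # what a neutral rounding would give
print(shown, fair)  # 4.5 4.0
```

Under the first rule, the displayed score is never lower than the actual average, and it can be up to half a star higher.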

The following plot shows how the distribution of displayed scores is clearly shifted to the right compared to the distribution of actual scores:

hickey-datalab-fandango-3.webp

The final response from Fandango was that this issue was caused by a bug, and that they would fix it as soon as possible.

The goal of this project is to evaluate whether Fandango has corrected the system it uses to present movie scores.

1) Getting the data

  • Walt Hickey made the data he analyzed publicly available on GitHub. We'll use the data he collected to analyze the characteristics of Fandango's rating system prior to his analysis.
  • One of Dataquest's team members collected movie ratings data for movies released in 2016 and 2017. The data is publicly available on GitHub and we'll use it to analyze the rating system's characteristics after Hickey's analysis.
In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
In [2]:
# Reading the data
scores15 = pd.read_csv("fandango_score_comparison.csv")
scores1617 = pd.read_csv("movie_ratings_16_17.csv")

# Showing basic info
display("Scores from 2015 shape and descriptives:", scores15.shape, scores15.columns)
display("Scores from 2016/17 shape and descriptives:", scores1617.shape, scores1617.columns)
'Scores from 2015 shape and descriptives:'
(146, 22)
Index(['FILM', 'RottenTomatoes', 'RottenTomatoes_User', 'Metacritic',
       'Metacritic_User', 'IMDB', 'Fandango_Stars', 'Fandango_Ratingvalue',
       'RT_norm', 'RT_user_norm', 'Metacritic_norm', 'Metacritic_user_nom',
       'IMDB_norm', 'RT_norm_round', 'RT_user_norm_round',
       'Metacritic_norm_round', 'Metacritic_user_norm_round',
       'IMDB_norm_round', 'Metacritic_user_vote_count', 'IMDB_user_vote_count',
       'Fandango_votes', 'Fandango_Difference'],
      dtype='object')
'Scores from 2016/17 shape and descriptives:'
(214, 15)
Index(['movie', 'year', 'metascore', 'imdb', 'tmeter', 'audience', 'fandango',
       'n_metascore', 'n_imdb', 'n_tmeter', 'n_audience', 'nr_metascore',
       'nr_imdb', 'nr_tmeter', 'nr_audience'],
      dtype='object')

2) Data cleaning

2.1) Columns of interest

As we're not interested in every column from these datasets, we'll keep only those related to Fandango's website. That is, from the original 2015 dataset we'll keep the following columns:

  • FILM: The film in question
  • Fandango_Stars: The number of stars the film had on its Fandango movie page
  • Fandango_Ratingvalue: The Fandango ratingValue for the film, as pulled from the HTML of each page. This is the actual average score the movie obtained.
  • Fandango_votes: The number of user votes the film had on Fandango
  • Fandango_Difference: The difference between the presented Fandango_Stars and the actual Fandango_Ratingvalue

Then, from the 2016/17 dataset we'll keep the following columns:

  • movie: The name of the movie
  • year: The release year of the movie
  • fandango: The Fandango rating of the movie (user score)
In [3]:
# Keeping the columns we need
fandango15 = scores15[["FILM", "Fandango_Stars", "Fandango_Ratingvalue", "Fandango_votes", "Fandango_Difference"]].copy()
fandango1617 = scores1617[["movie", "year", "fandango"]]

# Showing basic info
display("Scores from Fandango 2015 shape and descriptives:", fandango15.shape, fandango15.describe(), fandango15.head())
display("Scores from Fandango 2016/17 shape and descriptives:", fandango1617.shape, fandango1617.describe(), fandango1617.head())
'Scores from Fandango 2015 shape and descriptives:'
(146, 5)
Fandango_Stars Fandango_Ratingvalue Fandango_votes Fandango_Difference
count 146.000000 146.000000 146.000000 146.000000
mean 4.089041 3.845205 3848.787671 0.243836
std 0.540386 0.502831 6357.778617 0.152665
min 3.000000 2.700000 35.000000 0.000000
25% 3.500000 3.500000 222.250000 0.100000
50% 4.000000 3.900000 1446.000000 0.200000
75% 4.500000 4.200000 4439.500000 0.400000
max 5.000000 4.800000 34846.000000 0.500000
FILM Fandango_Stars Fandango_Ratingvalue Fandango_votes Fandango_Difference
0 Avengers: Age of Ultron (2015) 5.0 4.5 14846 0.5
1 Cinderella (2015) 5.0 4.5 12640 0.5
2 Ant-Man (2015) 5.0 4.5 12055 0.5
3 Do You Believe? (2015) 5.0 4.5 1793 0.5
4 Hot Tub Time Machine 2 (2015) 3.5 3.0 1021 0.5
'Scores from Fandango 2016/17 shape and descriptives:'
(214, 3)
year fandango
count 214.000000 214.000000
mean 2016.107477 3.894860
std 0.310444 0.516781
min 2016.000000 2.500000
25% 2016.000000 3.500000
50% 2016.000000 4.000000
75% 2016.000000 4.500000
max 2017.000000 5.000000
movie year fandango
0 10 Cloverfield Lane 2016 3.5
1 13 Hours 2016 4.5
2 A Cure for Wellness 2016 3.0
3 A Dog's Purpose 2017 4.5
4 A Hologram for the King 2016 3.0
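
The Fandango_Difference column can be sanity-checked against its definition (displayed stars minus actual rating). The sketch below uses only the first five 2015 rows displayed above, copied by hand, so it illustrates the check rather than validating the whole dataset.

```python
import math

# (FILM, Fandango_Stars, Fandango_Ratingvalue, Fandango_Difference),
# copied from the first five rows of the 2015 output above.
rows = [
    ("Avengers: Age of Ultron (2015)", 5.0, 4.5, 0.5),
    ("Cinderella (2015)", 5.0, 4.5, 0.5),
    ("Ant-Man (2015)", 5.0, 4.5, 0.5),
    ("Do You Believe? (2015)", 5.0, 4.5, 0.5),
    ("Hot Tub Time Machine 2 (2015)", 3.5, 3.0, 0.5),
]

# A row is a mismatch if stars - rating does not equal the stored difference.
mismatches = [
    film
    for film, stars, rating, difference in rows
    if not math.isclose(stars - rating, difference, abs_tol=1e-9)
]
print(mismatches)  # []
```

On the full dataframe, the same comparison could be done column-wise with a tolerance, since the values are floats.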

* Updating the project goal

Reading the README.md files from each dataset, we see how each dataset was obtained:

  • 2015 dataset: fandango_score_comparison.csv contains every film that has a Rotten Tomatoes rating, a RT User rating, a Metacritic score, a Metacritic User score, and IMDb score, and at least 30 fan reviews on Fandango. The data from Fandango was pulled on Aug. 24, 2015.
  • 2016/17 dataset: movie_ratings_16_17.csv contains movie ratings data for 214 of the most popular movies (with a significant number of votes) released in 2016 and 2017. As of March 22, 2017, the ratings were up to date. Significant changes should be expected mostly for movies released in 2017.

In other words, the 2016/17 dataset is not ideal for our original purpose: it includes only the most popular movies, and the ratings of the movies released in 2017 could still change significantly.

Therefore, we'll narrow our goal: we'll check whether Fandango's rating system improved by comparing popular movies released in 2015 with popular movies released in 2016.

2.2) Popularity of the movies

The term "popular" is vague and we need to define it with precision before continuing. We'll use Hickey's benchmark of 30 fan ratings and consider a movie as "popular" only if it has 30 fan ratings or more on Fandango's website.

We know that the 2015 dataset contains 146 rows, so we can check how many of those movies are non-popular:

In [4]:
# Checking for popularity in 2015 dataset
display("Number of non-popular movies in 2015 dataset: ", int((fandango15["Fandango_votes"] < 30).sum()))
'Number of non-popular movies in 2015 dataset: '
0
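
As a minimal sketch of the popularity criterion, the threshold can be kept in a named constant and applied as a boolean mask. The first five vote counts below come from the 2015 dataset; the last row is hypothetical and was added only to show what a non-popular movie looks like. The names POPULARITY_THRESHOLD and is_popular are illustrative.

```python
import pandas as pd

POPULARITY_THRESHOLD = 30  # Hickey's benchmark of 30 fan ratings

votes = pd.DataFrame({
    "FILM": [
        "Avengers: Age of Ultron (2015)",
        "Cinderella (2015)",
        "Ant-Man (2015)",
        "Do You Believe? (2015)",
        "Hot Tub Time Machine 2 (2015)",
        "Hypothetical Indie Film (2015)",  # hypothetical row, not in the data
    ],
    "Fandango_votes": [14846, 12640, 12055, 1793, 1021, 12],
})

# True for movies that meet the popularity threshold
is_popular = votes["Fandango_votes"] >= POPULARITY_THRESHOLD

# Summing a boolean mask counts its True values
n_non_popular = int((~is_popular).sum())
print(n_non_popular)  # 1
```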

However, the 2016/17 dataset doesn't contain any information on the number of votes. Therefore, we'll draw a random sample of those movies and check their number of fan ratings on Fandango's website:

In [5]:
display(fandango1617["movie"].sample(10, random_state=0))
197      The Take (Bastille Day)
37              Come and Find Me
89                     Kickboxer
176                  The Founder
170                 The Darkness
75     Ice Age: Collision Course
96                          Lion
137                 Ride Along 2
5                A Monster Calls
83                Jane Got a Gun
Name: movie, dtype: object

We would have looked up each of these sampled films on Fandango's website to check its number of user ratings. However, it seems that Fandango has changed its rating system, and it now shows Rotten Tomatoes scores instead:

Fandango_rotten.png

This could end our project here, as we have found that Fandango has changed its whole system to show Rotten Tomatoes scores. However, for the sake of learning, we'll continue as if the sampled movies were popular enough.

2.3) Release years

We'll continue by keeping only the movies released in 2015 and 2016 from the 2015 and 2016/17 datasets, respectively. The former still needs a "year" column, while the latter already has one:

In [6]:
# Creating a "year" column for the dataset from 2015
fandango15["year"] = fandango15["FILM"].str.extract(r"\(([0-9]{4})\)", expand=False).astype(int)

# Keeping only movies from 2015
fandango15 = fandango15[fandango15["year"] == 2015].copy().reset_index()

# Display results with new column
display(fandango15.head())



# Keeping only movies from 2016
fandango16 = fandango1617[fandango1617["year"] == 2016].copy().reset_index()
index FILM Fandango_Stars Fandango_Ratingvalue Fandango_votes Fandango_Difference year
0 0 Avengers: Age of Ultron (2015) 5.0 4.5 14846 0.5 2015
1 1 Cinderella (2015) 5.0 4.5 12640 0.5 2015
2 2 Ant-Man (2015) 5.0 4.5 12055 0.5 2015
3 3 Do You Believe? (2015) 5.0 4.5 1793 0.5 2015
4 4 Hot Tub Time Machine 2 (2015) 3.5 3.0 1021 0.5 2015
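
The regular expression used above to build the "year" column can be checked in isolation with Python's re module. The helper name extract_year is hypothetical. Note that a title without a parenthesized year yields no match; in the pandas version that becomes a missing value, which would make the astype(int) conversion fail.

```python
import re

# Same pattern as in the cell above: four digits inside parentheses
YEAR_PATTERN = re.compile(r"\(([0-9]{4})\)")


def extract_year(title):
    """Return the release year embedded in a title, or None if absent."""
    match = YEAR_PATTERN.search(title)
    return int(match.group(1)) if match else None


print(extract_year("Avengers: Age of Ultron (2015)"))  # 2015
print(extract_year("10 Cloverfield Lane"))             # None
```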

3) Data analysis

Now that we have both datasets prepared as we need them, we'll continue by analyzing their differences, so we can conclude whether Fandango solved their rating issue.

3.1) Proportions distributions comparison

Let's begin by comparing the distribution of the presented scores in 2015 and 2016:

In [7]:
# Importing the FiveThirtyEight style
plt.style.use('fivethirtyeight')

# Size of the plot
plt.figure(figsize=(8,5))

# Data to be plotted
fandango15["Fandango_Stars"].plot.kde(ind=np.arange(0,5.5,0.5))
fandango16["fandango"].plot.kde(ind=np.arange(0,5.5,0.5))

# Labels and titles adjustments
plt.xticks(ticks=np.arange(0,5.5,0.5), labels=np.arange(0,5.5,0.5))
plt.title("Fandango reviews comparison", fontsize=20)
plt.xlabel("Stars")
plt.ylabel("Density proportion")
plt.legend(labels=[2015, 2016])

plt.show()

The plot above shows two interesting points:

  1. Both distributions are unimodal, each peaking at its most frequent value (4.5 for 2015 and 4.0 for 2016).
  2. The 2016 distribution is shifted to the left relative to 2015. This mirrors Walt Hickey's original analysis of the 2015 data, where the actual scores sat to the left of the displayed scores (his plot is shown again below):

hickey-datalab-fandango-3.webp

Since the 2016 distribution is shifted to the left in much the same way as the actual scores from 2015 were, the plot suggests that Fandango corrected the main issue discussed throughout this project.
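
The density curves compared above are kernel density estimates: each observed score contributes a small bell curve, and the curves are averaged. The sketch below is a hand-rolled Gaussian version for illustration only. pandas delegates the computation to SciPy's gaussian_kde, which chooses the bandwidth automatically, whereas the bandwidth of 0.5 used here is an arbitrary assumption, so the numbers will not match the plot.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that evaluates a Gaussian KDE of the samples."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Sum of Gaussian bumps centred on each sample
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# First five Fandango_Stars values from the 2015 dataset
stars = [5.0, 5.0, 5.0, 5.0, 3.5]
density = gaussian_kde(stars, bandwidth=0.5)  # bandwidth is an assumption

# Evaluate on the same half-star grid as np.arange(0, 5.5, 0.5)
grid = [i * 0.5 for i in range(11)]
for x in grid:
    print(f"{x:.1f} stars -> density {density(x):.3f}")
```

Because the curve is smoothed, it can assign some density to scores that never occur in the data, which is why the frequency tables are a useful complement.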

3.2) Frequency distributions comparison

We'll look at the frequency tables to confirm this hypothesis:

In [8]:
freq_dist15 = round(fandango15["Fandango_Stars"].value_counts(normalize=True).sort_index() * 100, 1)
freq_dist16 = round(fandango16["fandango"].value_counts(normalize=True).sort_index() * 100, 1)

display("Frequency distribution table for 2015 dataset:", freq_dist15)
display("Frequency distribution table for 2016 dataset:", freq_dist16)
'Frequency distribution table for 2015 dataset:'
3.0     8.5
3.5    17.8
4.0    28.7
4.5    38.0
5.0     7.0
Name: Fandango_Stars, dtype: float64
'Frequency distribution table for 2016 dataset:'
2.5     3.1
3.0     7.3
3.5    24.1
4.0    40.3
4.5    24.6
5.0     0.5
Name: fandango, dtype: float64

Again, the frequency distribution tables support our hypothesis that Fandango corrected the way it presents scores:

  • 3.1% of the movies from 2016 had a 2.5 score, while none had this score in 2015.
  • The share of movies with a score of 3.5 or 4.0 is higher in 2016 (24.1% and 40.3%, vs 17.8% and 28.7% in 2015), while the share with a score above 4.0 is lower.
  • Only 0.5% of the movies from 2016 had a 5.0 score, while 7.0% had this score in 2015.
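
One way to summarize the shift is to add up the share of movies at or above a given score in each year. The sketch below copies the percentages from the two frequency tables above; the helper name share_at_least is hypothetical.

```python
# Percentages copied from the frequency tables above
freq_2015 = {3.0: 8.5, 3.5: 17.8, 4.0: 28.7, 4.5: 38.0, 5.0: 7.0}
freq_2016 = {2.5: 3.1, 3.0: 7.3, 3.5: 24.1, 4.0: 40.3, 4.5: 24.6, 5.0: 0.5}


def share_at_least(freq, stars):
    """Percentage of movies whose displayed score is at least `stars`."""
    return round(sum(pct for score, pct in freq.items() if score >= stars), 1)


print(share_at_least(freq_2015, 4.5))  # 45.0
print(share_at_least(freq_2016, 4.5))  # 25.1
```

Read this way, 45.0% of the 2015 movies were displayed with 4.5 stars or more, compared with 25.1% in 2016.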

3.3) Summary statistics comparison

We'll complete the analysis of this difference by comparing the mean, median and mode of each dataset:

In [9]:
# Generate a dataframe with the statistics we need
statistics = {
    "year": [2015, 2015, 2015, 2016, 2016, 2016],
    "statistic": ["mean", "median", "mode", "mean", "median", "mode"],
    "value": [fandango15["Fandango_Stars"].mean(), fandango15["Fandango_Stars"].median(), fandango15["Fandango_Stars"].mode()[0], fandango16["fandango"].mean(), fandango16["fandango"].median(), fandango16["fandango"].mode()[0]]
}

stats = pd.DataFrame(data=statistics)

display(stats)

# Plot in a grouped bar plot these statistics
sns.catplot(kind="bar", x="statistic", hue="year", y="value", data=stats)
plt.title("Comparing summary statistics: 2015 vs 2016")
plt.yticks(ticks=np.arange(0,5.5,0.5), labels=np.arange(0,5.5,0.5))
plt.ylabel("Stars")
plt.xlabel("")
plt.show()
year statistic value
0 2015 mean 4.085271
1 2015 median 4.000000
2 2015 mode 4.500000
3 2016 mean 3.887435
4 2016 median 4.000000
5 2016 mode 4.000000

The summary statistics above show:

  1. The mean stars score for 2015 was slightly higher than for 2016: 4.1 vs 3.9
  2. The median stars score for both years was identical: 4.0
  3. The mode stars score for 2015 was higher than for 2016: 4.5 vs 4.0

Therefore, these statistics support our hypothesis that Fandango corrected its problem with the movie scores.
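
To put the change in the mean into perspective, it can be expressed both in stars and as a percentage of the 2015 mean. The two values below are copied from the summary table above.

```python
# Means copied from the summary statistics table above
mean_2015 = 4.085271
mean_2016 = 3.887435

drop = mean_2015 - mean_2016            # absolute drop, in stars
relative_drop = drop / mean_2015 * 100  # drop relative to 2015, in percent

print(f"{drop:.2f} stars, {relative_drop:.1f}%")  # 0.20 stars, 4.8%
```

A drop of about 0.2 stars is smaller than the half-star step of the scale, which is consistent with the median staying at 4.0 in both years.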

4) Conclusions

The analysis conducted in the present project showed that Fandango movie reviews in 2016 were **slightly but consistently lower** compared with 2015.

Therefore, we can conclude that Fandango addressed the issue with its scores and made them fairer.