My First Data Science Project

Helicopter Escapes!

We begin by importing some helper functions.

In [3]:
from helper import *

Get the Data

Now, let's get the data from the List of helicopter prison escapes Wikipedia article.

In [4]:
url = 'https://en.wikipedia.org/wiki/List_of_helicopter_prison_escapes'
In [5]:
data = data_from_url(url)

Let's clean each row (replace the first field with its year and drop the last column), then print the first three rows.

In [6]:
for index, row in enumerate(data):
    row[0] = fetch_year(row[0])  # replace the first field with its year
    data[index] = row[:-1]       # drop the last column

print(data[:3])
[[1971, 'Santa Martha Acatitla', 'Mexico', 'Yes', 'Joel David Kaplan Carlos Antonio Contreras Castro'], [1973, 'Mountjoy Jail', 'Ireland', 'Yes', "JB O'Hagan Seamus TwomeyKevin Mallon"], [1978, 'United States Penitentiary, Marion', 'United States', 'No', 'Garrett Brock TrapnellMartin Joseph McNallyJames Kenneth Johnson']]
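The helper's fetch_year is not shown in this notebook. As a rough sketch of what such a function might do, assuming the raw first column is a date string such as 'August 19, 1971' (an assumed format), a hypothetical extract_year could pull out the four-digit year with a regular expression:

```python
import re


def extract_year(date_string):
    """Hypothetical stand-in for helper.fetch_year: return the first 4-digit year as an int."""
    match = re.search(r"\d{4}", date_string)
    if match is None:
        raise ValueError(f"no year found in {date_string!r}")
    return int(match.group())


# The date format below is an assumption made for illustration.
print(extract_year("August 19, 1971"))
```

Raising an error when no year is found makes a malformed row visible right away, rather than letting it slip into the counts later.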
In [11]:
min_year = min(data, key=lambda x: x[0])[0]
max_year = max(data, key=lambda x: x[0])[0]
print(min_year)
print(max_year)
1971
2020
In [12]:
years = list(range(min_year, max_year + 1))
print(years)
[1971, 1972, 1973, 1974, 1975, 1976, 1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020]
In [13]:
attempts_per_year = [[y, 0] for y in years]
print(attempts_per_year)
[[1971, 0], [1972, 0], [1973, 0], [1974, 0], [1975, 0], [1976, 0], [1977, 0], [1978, 0], [1979, 0], [1980, 0], [1981, 0], [1982, 0], [1983, 0], [1984, 0], [1985, 0], [1986, 0], [1987, 0], [1988, 0], [1989, 0], [1990, 0], [1991, 0], [1992, 0], [1993, 0], [1994, 0], [1995, 0], [1996, 0], [1997, 0], [1998, 0], [1999, 0], [2000, 0], [2001, 0], [2002, 0], [2003, 0], [2004, 0], [2005, 0], [2006, 0], [2007, 0], [2008, 0], [2009, 0], [2010, 0], [2011, 0], [2012, 0], [2013, 0], [2014, 0], [2015, 0], [2016, 0], [2017, 0], [2018, 0], [2019, 0], [2020, 0]]
In [20]:
for row in data:
    for ya in attempts_per_year:
        # ya is a [year, count] pair; bump the count when the years match
        if row[0] == ya[0]:
            ya[1] += 1

print(attempts_per_year)
[[1971, 2], [1972, 1], [1973, 2], [1974, 1], [1975, 1], [1976, 1], [1977, 1], [1978, 2], [1979, 1], [1980, 1], [1981, 3], [1982, 1], [1983, 2], [1984, 1], [1985, 3], [1986, 4], [1987, 2], [1988, 2], [1989, 3], [1990, 2], [1991, 2], [1992, 3], [1993, 2], [1994, 1], [1995, 1], [1996, 2], [1997, 2], [1998, 1], [1999, 2], [2000, 3], [2001, 4], [2002, 3], [2003, 2], [2004, 1], [2005, 3], [2006, 2], [2007, 4], [2008, 1], [2009, 4], [2010, 2], [2011, 1], [2012, 2], [2013, 3], [2014, 2], [2015, 1], [2016, 2], [2017, 1], [2018, 2], [2019, 1], [2020, 2]]
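The nested loop above compares every row with every year. A possible alternative, sketched here with a dictionary keyed by year, counts in a single pass while still keeping years with no attempts at zero. The sample_rows list is a small stand-in for data (the three rows printed earlier, names omitted), not the full dataset:

```python
# Stand-in for `data`: only the year (first field) matters for counting.
sample_rows = [
    [1971, "Santa Martha Acatitla", "Mexico", "Yes"],
    [1973, "Mountjoy Jail", "Ireland", "Yes"],
    [1978, "United States Penitentiary, Marion", "United States", "No"],
]

min_year = min(row[0] for row in sample_rows)
max_year = max(row[0] for row in sample_rows)

# Start every year in the range at zero so gaps remain visible.
counts = {year: 0 for year in range(min_year, max_year + 1)}
for row in sample_rows:
    counts[row[0]] += 1

# Convert back to the [year, count] layout used by attempts_per_year.
attempts = [[year, n] for year, n in counts.items()]
print(attempts)
```

Because counts is rebuilt from scratch on every run, re-running the cell does not keep adding to earlier totals, which can happen when a cell increments an existing list in place.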
In [21]:
%matplotlib inline
barplot(attempts_per_year)

The years in which the most helicopter prison break attempts occurred were 1986, 2001, 2007 and 2009, with four attempts each.
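The peak years can also be read off programmatically rather than from the chart. A minimal sketch, using a handful of [year, count] pairs copied from the output above as stand-in data:

```python
# A subset of the [year, count] pairs printed above.
attempts_subset = [[1985, 3], [1986, 4], [1987, 2], [2001, 4], [2007, 4], [2009, 4]]

# Find the highest count, then collect every year that reaches it.
highest = max(count for _, count in attempts_subset)
peak_years = [year for year, count in attempts_subset if count == highest]
print(highest, peak_years)
```

Collecting every year that matches the maximum, instead of calling max once, keeps ties from being hidden.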

In [22]:
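# df is not defined in this notebook; it is assumed to be a pandas DataFrame
# made available by `from helper import *`.
countries_frequency = df["Country"].value_counts()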
print_pretty_table(countries_frequency)
Country Number of Occurrences
France 15
United States 8
Belgium 4
Greece 4
Canada 4
Brazil 2
United Kingdom 2
Australia 2
Ireland 1
Italy 1
Chile 1
Mexico 1
Puerto Rico 1
Russia 1
Netherlands 1

By far, the country with the most helicopter prison escape attempts is France, with 15.
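For readers without the helper's df, a similar tally could be sketched with collections.Counter from the standard library. The rows below are the three sample rows printed earlier (names omitted), so the counts are illustrative only and do not reproduce the table above:

```python
from collections import Counter

# Stand-in for `data`; the country is the third field (index 2).
sample_rows = [
    [1971, "Santa Martha Acatitla", "Mexico", "Yes"],
    [1973, "Mountjoy Jail", "Ireland", "Yes"],
    [1978, "United States Penitentiary, Marion", "United States", "No"],
]

country_counts = Counter(row[2] for row in sample_rows)

# most_common() yields (country, count) pairs, highest count first.
for country, n in country_counts.most_common():
    print(country, n)
```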