Welcome to my first Dataquest Guided Project, a project about prison breaks! In this project, I analyze helicopter prison escape attempts by year and by country.
First, we import some helper functions:
from helper import *
We will obtain the data for this analysis from the Wikipedia article "List of helicopter prison escapes":
url = 'https://en.wikipedia.org/wiki/List_of_helicopter_prison_escapes'
data = data_from_url(url)
Let's print the first three rows of the imported data:
for row in data[:3]:
    print(row)
['August 19, 1971', 'Santa Martha Acatitla', 'Mexico', 'Yes', 'Joel David Kaplan Carlos Antonio Contreras Castro', "Joel David Kaplan was a New York businessman who had been arrested for murder in 1962 in Mexico City and was incarcerated at the Santa Martha Acatitla prison in the Iztapalapa borough of Mexico City. Joel's sister, Judy Kaplan, arranged the means to help Kaplan escape, and on August 19, 1971, a helicopter landed in the prison yard. The guards mistakenly thought this was an official visit. In two minutes, Kaplan and his cellmate Carlos Antonio Contreras, a Venezuelan counterfeiter, were able to board the craft and were piloted away, before any shots were fired.[9] Both men were flown to Texas and then different planes flew Kaplan to California and Castro to Guatemala.[3] The Mexican government never initiated extradition proceedings against Kaplan.[9] The escape is told in a book, The 10-Second Jailbreak: The Helicopter Escape of Joel David Kaplan.[4] It also inspired the 1975 action movie Breakout, which starred Charles Bronson and Robert Duvall.[9]"] ['October 31, 1973', 'Mountjoy Jail', 'Ireland', 'Yes', "JB O'Hagan Seamus TwomeyKevin Mallon", 'On October 31, 1973 an IRA member hijacked a helicopter and forced the pilot to land in the exercise yard of Dublin\'s Mountjoy Jail\'s D Wing at 3:40\xa0p.m., October 31, 1973. Three members of the IRA were able to escape: JB O\'Hagan, Seamus Twomey and Kevin Mallon. Another prisoner who also was in the prison was quoted as saying, "One shamefaced screw apologised to the governor and said he thought it was the new Minister for Defence (Paddy Donegan) arriving. I told him it was our Minister of Defence leaving." The Mountjoy helicopter escape became Republican lore and was immortalized by "The Helicopter Song", which contains the lines "It\'s up like a bird and over the city. 
There\'s three men a\'missing I heard the warder say".[1]'] ['May 24, 1978', 'United States Penitentiary, Marion', 'United States', 'No', 'Garrett Brock TrapnellMartin Joseph McNallyJames Kenneth Johnson', "43-year-old Barbara Ann Oswald hijacked a Saint Louis-based charter helicopter and forced the pilot to land in the yard at USP Marion. While landing the aircraft, the pilot, Allen Barklage, who was a Vietnam War veteran, struggled with Oswald and managed to wrestle the gun away from her. Barklage then shot and killed Oswald, thwarting the escape.[10] A few months later Oswald's daughter hijacked TWA Flight 541 in an effort to free Trapnell."]
The details column makes the rows hard to read, so I will remove it for now:
for index, row in enumerate(data):
    data[index] = row[:-1]  # drop the last column (the details)
print(data[:3])
[['August 19, 1971', 'Santa Martha Acatitla', 'Mexico', 'Yes', 'Joel David Kaplan Carlos Antonio Contreras Castro'], ['October 31, 1973', 'Mountjoy Jail', 'Ireland', 'Yes', "JB O'Hagan Seamus TwomeyKevin Mallon"], ['May 24, 1978', 'United States Penitentiary, Marion', 'United States', 'No', 'Garrett Brock TrapnellMartin Joseph McNallyJames Kenneth Johnson']]
In which year did the most attempts at breaking out of prison with a helicopter occur?
In which countries do the most attempted helicopter prison escapes occur?
Which countries have the highest chance of a successful helicopter escape?
From each row in the dataset, we will extract only the year from the date:
for row in data:
    row[0] = fetch_year(row[0])
print(data[:3])
[[1971, 'Santa Martha Acatitla', 'Mexico', 'Yes', 'Joel David Kaplan Carlos Antonio Contreras Castro'], [1973, 'Mountjoy Jail', 'Ireland', 'Yes', "JB O'Hagan Seamus TwomeyKevin Mallon"], [1978, 'United States Penitentiary, Marion', 'United States', 'No', 'Garrett Brock TrapnellMartin Joseph McNallyJames Kenneth Johnson']]
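`fetch_year` comes from the `helper` module, whose source is not shown here. As a rough illustration only, a function like it could be written as follows, assuming it simply pulls the first four-digit number out of the date string. The name `fetch_year_sketch` is hypothetical and is not part of the helper module.

```python
import re

def fetch_year_sketch(date_string):
    """Return the first four-digit number in date_string as an int.

    Hypothetical stand-in for the helper's fetch_year; the real
    implementation may differ.
    """
    match = re.search(r"\d{4}", date_string)
    if match is None:
        raise ValueError(f"no four-digit year found in {date_string!r}")
    return int(match.group())

print(fetch_year_sketch("August 19, 1971"))
```

Raising an error when no year is found makes malformed dates visible early instead of silently producing bad rows.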
Let's identify the earliest and latest years in the dataset, then store the entire range of years in a variable called 'years':
min_year = min(data, key=lambda x: x[0])[0]
max_year = max(data, key=lambda x: x[0])[0]
years = list(range(min_year, max_year + 1))
Let's initialize a list of lists that will record the number of attempts per year:
attempts_per_year = []
for year in years:
    attempts_per_year.append([year, 0])
print(attempts_per_year)
[[1971, 0], [1972, 0], [1973, 0], [1974, 0], [1975, 0], [1976, 0], [1977, 0], [1978, 0], [1979, 0], [1980, 0], [1981, 0], [1982, 0], [1983, 0], [1984, 0], [1985, 0], [1986, 0], [1987, 0], [1988, 0], [1989, 0], [1990, 0], [1991, 0], [1992, 0], [1993, 0], [1994, 0], [1995, 0], [1996, 0], [1997, 0], [1998, 0], [1999, 0], [2000, 0], [2001, 0], [2002, 0], [2003, 0], [2004, 0], [2005, 0], [2006, 0], [2007, 0], [2008, 0], [2009, 0], [2010, 0], [2011, 0], [2012, 0], [2013, 0], [2014, 0], [2015, 0], [2016, 0], [2017, 0], [2018, 0], [2019, 0], [2020, 0]]
Count the number of escape attempts per year:
for year in attempts_per_year:
    for row in data:
        if year[0] == row[0]:
            year[1] += 1
print(attempts_per_year)
[[1971, 1], [1972, 0], [1973, 1], [1974, 0], [1975, 0], [1976, 0], [1977, 0], [1978, 1], [1979, 0], [1980, 0], [1981, 2], [1982, 0], [1983, 1], [1984, 0], [1985, 2], [1986, 3], [1987, 1], [1988, 1], [1989, 2], [1990, 1], [1991, 1], [1992, 2], [1993, 1], [1994, 0], [1995, 0], [1996, 1], [1997, 1], [1998, 0], [1999, 1], [2000, 2], [2001, 3], [2002, 2], [2003, 1], [2004, 0], [2005, 2], [2006, 1], [2007, 3], [2008, 0], [2009, 3], [2010, 1], [2011, 0], [2012, 1], [2013, 2], [2014, 1], [2015, 0], [2016, 1], [2017, 0], [2018, 1], [2019, 0], [2020, 1]]
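The nested loop above scans every row once for every year. An alternative sketch that makes a single pass over the rows uses `collections.Counter`. The list `sample_years` is a small illustrative sample, not the full dataset, and `attempts_sketch` is a hypothetical name.

```python
from collections import Counter

# Illustrative sample of attempt years (not the full dataset).
sample_years = [1971, 1973, 1978, 1981, 1981, 1983]

counts = Counter(sample_years)  # one pass over the rows

# A Counter returns 0 for missing keys, so years without attempts are kept.
attempts_sketch = [
    [year, counts[year]]
    for year in range(min(sample_years), max(sample_years) + 1)
]
print(attempts_sketch)
```

Both approaches give the same kind of `[year, count]` list; the counter version only matters for speed on much larger datasets.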
In which year did the most attempts at breaking out of prison with a helicopter occur?
%matplotlib inline
barplot(attempts_per_year)
Comments: The years with the most helicopter prison break attempts were 1986, 2001, 2007 and 2009, with three attempts each.
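The peak years can also be read off programmatically rather than from the chart. The sketch below uses only the 2000-2009 entries copied from the `attempts_per_year` output, so it finds the peak years within that excerpt; the variable names are hypothetical.

```python
# Excerpt of attempts_per_year covering 2000-2009.
attempts_excerpt = [[2000, 2], [2001, 3], [2002, 2], [2003, 1], [2004, 0],
                    [2005, 2], [2006, 1], [2007, 3], [2008, 0], [2009, 3]]

# Highest count in the excerpt, then every year that reaches it.
max_attempts = max(count for _, count in attempts_excerpt)
busiest_years = [year for year, count in attempts_excerpt if count == max_attempts]
print(max_attempts, busiest_years)
```

Collecting every year that ties for the maximum avoids reporting only the first peak, which is what a plain `max` over the rows would do.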
countries_frequency = df["Country"].value_counts()
print_pretty_table(countries_frequency)
Country | Number of Occurrences |
---|---|
France | 15 |
United States | 8 |
Greece | 4 |
Belgium | 4 |
Canada | 4 |
United Kingdom | 2 |
Australia | 2 |
Brazil | 2 |
Chile | 1 |
Italy | 1 |
Puerto Rico | 1 |
Ireland | 1 |
Mexico | 1 |
Russia | 1 |
Netherlands | 1 |
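Neither `df` nor `print_pretty_table` is defined in this notebook; both presumably come from the `helper` module. For readers without that module, a plain-Python sketch of the same frequency count is shown here. It uses only the countries from the first ten rows of the dataset, so its counts are smaller than those in the table, and `sample_countries` is a hypothetical name.

```python
from collections import Counter

# Countries from the first ten rows of the dataset (illustrative subset).
sample_countries = ['Mexico', 'Ireland', 'United States', 'France', 'Canada',
                    'Australia', 'United States', 'Brazil', 'France',
                    'United States']

country_counts = Counter(sample_countries)

# most_common() sorts from the highest count to the lowest.
print("Country | Number of Occurrences |")
for country, occurrences in country_counts.most_common():
    print(f"{country} | {occurrences} |")
```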
In which countries do the most attempted helicopter prison escapes occur?
import plotly.express as px
fig = px.bar(countries_frequency, x=countries_frequency.index,
             y=countries_frequency.values,
             title='Helicopter prison escape attempts by country (1971 - 2020)',
             labels={
                 'y': 'Number of Occurrences',
                 'index': 'Countries'
             },
             text=countries_frequency.values,
             template='none')
fig.update_yaxes(showticklabels=False)
fig
Comments: France recorded the highest number of helicopter escape attempts (15), followed by the United States with 8. Greece, Belgium and Canada recorded 4 attempts each.
Collate each country and its success information from 'data':
success_info = []
for row in data:
    success_info.append([row[2], row[3]])
print(success_info)
[['Mexico', 'Yes'], ['Ireland', 'Yes'], ['United States', 'No'], ['France', 'Yes'], ['Canada', 'No'], ['Australia', 'No'], ['United States', 'Yes'], ['Brazil', 'Yes'], ['France', 'Yes'], ['United States', 'Yes'], ['Italy', 'Yes'], ['United Kingdom', 'Yes'], ['United States', 'Yes'], ['United States', 'No'], ['United States', 'Yes'], ['Canada', 'Yes'], ['Puerto Rico', 'Yes'], ['France', 'Yes'], ['France', 'No'], ['France', 'No'], ['Chile', 'Yes'], ['Netherlands', 'No'], ['Australia', 'Yes'], ['United States', 'Yes'], ['France', 'Yes'], ['France', 'Yes'], ['France', 'Yes'], ['France', 'No'], ['Brazil', 'Yes'], ['United States', 'Yes'], ['France', 'Yes'], ['France', 'No'], ['France', 'Yes'], ['Greece', 'Yes'], ['Belgium', 'Yes'], ['France', 'Yes'], ['Belgium', 'No'], ['Greece', 'Yes'], ['France', 'Yes'], ['Belgium', 'Yes'], ['United Kingdom', 'No'], ['Russia', 'Yes'], ['Greece', 'No'], ['Canada', 'Yes'], ['Canada', 'Yes'], ['Greece', 'No'], ['France', 'Yes'], ['Belgium', 'No']]
Identifying each unique country using information from the countries frequency table
countries = list(countries_frequency.index)
Computing the success-to-failure ratio for each country and storing it in a new list of lists called 'success_chances':
success_chances = []
for country in countries:
    ratio = [0, 0]  # [successes, failures]
    # increment success and failure for every count of yes and no respectively
    for row in success_info:
        if row[1].lower() == 'yes' and row[0] == country:
            ratio[0] += 1
        elif row[1].lower() == 'no' and row[0] == country:
            ratio[-1] += 1
    # compute the success to failure ratio (divide by 1 if there are no failures)
    if ratio[-1] == 0:
        ratio = ratio[0] / 1
    else:
        ratio = ratio[0] / ratio[-1]
    success_chances.append([country, ratio])
success_chances
[['France', 2.75], ['United States', 3.0], ['Greece', 1.0], ['Belgium', 1.0], ['Canada', 3.0], ['United Kingdom', 1.0], ['Australia', 1.0], ['Brazil', 2.0], ['Chile', 1.0], ['Italy', 1.0], ['Puerto Rico', 1.0], ['Ireland', 1.0], ['Mexico', 1.0], ['Russia', 1.0], ['Netherlands', 0.0]]
Based on the records compiled for success and failures, In which countries do helicopter prison breaks have a higher chance of success?
barplot(success_chances)
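Comments: Canada and the United States have the highest success-to-failure ratios (3.0, i.e. three successful escapes for every failed one), with France (2.75) and Brazil (2.0) close behind. Note that the code divides by 1 when a country has no failed attempts, so Brazil's 2.0 reflects two successes and no failures.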
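Because the ratio falls back to a divisor of 1 when there are no failures, it cannot tell "one success, one failure" apart from "one success, no failures". One possible alternative, which is not the method used in this notebook, is the success rate: successes divided by total attempts. The sketch below applies it to four countries whose tallies were counted by hand from the `success_info` output; `tallies` and `success_rates` are hypothetical names.

```python
# [successes, failures] tallied by hand from the success_info output.
tallies = {
    "France": [11, 4],
    "United States": [6, 2],
    "Brazil": [2, 0],
    "Netherlands": [0, 1],
}

# Success rate = successes / total attempts; every listed country has at
# least one attempt, so the denominator is never zero.
success_rates = {
    country: wins / (wins + losses)
    for country, (wins, losses) in tallies.items()
}

for country, rate in sorted(success_rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{country}: {rate:.2f}")
```

With this measure, a country with very few attempts can still top the ranking, so the number of attempts should be read alongside the rate.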
From analysing this dataset, we have been able to observe that:
The highest number of attempted helicopter prison breaks occurred in the years 1986, 2001, 2007 and 2009.
Within the period 1971 - 2020, France recorded the highest number of attempted helicopter prison breaks (15).
Although France recorded the highest number of helicopter prison break attempts, the success-to-failure ratio is higher in the US and Canada than in France.
Prompts for future exploration: