Ethan C. Campbell, for Central Seattle Greenways / Helmet Law Working Group

For questions, contact me at ethanchenbell@gmail.com.

Import packages and set file system

In [80]:

%matplotlib inline
from numpy import *
import pandas as pd
pd.set_option('display.max_columns',100)
pd.set_option('display.max_colwidth',50)
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib.ticker as mtick
import matplotlib.colors as mcolors
import matplotlib.patches as mpatches
import matplotlib as mpl
mpl.rcParams['figure.dpi'] = 300     # turn on for higher-quality figure export
from datetime import datetime, timedelta
from pytz import timezone
import glob
import platform
import warnings
import sys
import progressbar
import textwrap
import censusdata

from IPython.display import display, HTML
# display(HTML("<style>.container { width:100% !important; }</style>"))

# choose root directory for data files
if platform.system() == 'Darwin':
    data_dir = '/Users/Ethan/Documents/Finances and records/2020-06-30 - Helmet Law Working Group/Data/'
    results_dir = '/Users/Ethan/Documents/Finances and records/2020-06-30 - Helmet Law Working Group/Figures/'
elif platform.system() == 'Linux':
    data_dir = '/dat1/ethancc/CSG/'
    results_dir = data_dir
    
# set directory paths
current_results_dir = results_dir + '2022-01-12 - Seattle helmet citation racial disparities by year/'

Load Washington Office of Financial Management (OFM) population estimates for April 2021

Data source: WA OFM (see "April 1, 2021 population of cities, towns, and counties used for the allocation of selected state revenues").

In [404]:

ofm = pd.read_excel(data_dir + '2021-04-01 - Washington Office of Financial Management (OFM) population estimates for April 2021.xlsx',
                    sheet_name='King County',index_col=1)

Load compiled bike citation records

Note: file created previously in Jupyter notebook *csg_compile_king_county_bike_citations.ipynb*, which is available on my GitHub.

In [2]:

kc_citations = pd.read_excel(data_dir + '2021-10-18 - compiled King County bike citation records.xlsx').drop(columns=['Unnamed: 0'])
kc_citations.head(3)

Out[2]:

	Court Name	Case Number	Case Key (KCDC) or Token (MCs)	Case Type	Case File Date	Law Enforcement Agency	Violation Date	Law Code	Law Description	Disposition	Disposition Date	Defendant Gender	Defendant Race	Defendant Ethnicity	Officer Badge Number	AR Ordered Amount	AR Adjustment Amount	AR Paid Amount	AR Due Amount	Originating Query	City	Violation Datetime	Officer First Name	Officer Middle Name	Officer Last Name	AR Adjusted Amount
0	Black Diamond Municipal Court	2Z0597503	875AEBBFB026042B	Infraction Traffic	2012-09-08	Black Diamond Police Department	2012-09-07	RCW 46.61.755	Violating Laws While Riding Bicycle	Committed	2012-09-26	Male	White	Unknown	1521.0	155.0	0.0	0.0	155.0	All bike violations (requested KCHC 9.10.010 a...	Black Diamond	NaT	Scott	J	Oak	NaN
1	Black Diamond Municipal Court	BD0026522	FCEE4348BC248022	Infraction Non-Traffic	2007-08-14	Black Diamond Police Department	2007-08-13	Black Diamond Municipal Code 8.28.040	No Bicycles on Skate Park	Committed	2007-09-20	Male	White	Unknown	78693.0	75.0	-75.0	0.0	0.0	All bike violations (requested KCHC 9.10.010 a...	Black Diamond	NaT	Timothy	NaN	Macdonald	NaN
2	Black Diamond Municipal Court	BD0029627	613305D233608025	Infraction Traffic	2009-06-22	Black Diamond Police Department	2009-06-19	RCW 46.61.780	Bicycle Defective Equipment	Committed	2009-08-07	Male	White	Unknown	49330.0	155.0	0.0	155.0	0.0	All bike violations (requested KCHC 9.10.010 a...	Black Diamond	NaT	Edward	NaN	Volpone	NaN

Load census demographic data

Using US Census Bureau 2015-2019 ACS 5-year estimates, accessed via Python package CensusData. Documentation here: https://jtleider.github.io/censusdata/. For an example of a DP05 table for Seattle, see here.

Note that the grouping of multiracial categories applied here represents a best guess as to how police officers most likely perceive and categorize multiracial subjects:

'Black' includes those listed in 'Two or more races' as 'Black and White' and 'Black and Native American'
'Asian or Pacific Islander' includes those listed in 'Two or more races' as 'Asian and White'
'Native American or Alaskan Native' includes those listed in 'Two or more races' as 'Native American and White'
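The grouping rule listed above can be illustrated with a small, self-contained sketch. The two places and all percentages below are made up for illustration, and the shortened column names are assumptions rather than the actual ACS variable labels used in this notebook.

```python
import pandas as pd

# Hypothetical ACS-style percentages for two made-up places (not real data)
acs = pd.DataFrame(
    {
        "One Race/White": [60.0, 80.0],
        "One Race/Black": [10.0, 2.0],
        "One Race/American Indian and Alaska Native": [1.0, 0.5],
        "One Race/Asian": [15.0, 8.0],
        "One Race/Native Hawaiian and Other Pacific Islander": [1.0, 0.5],
        "Two+ Races/White and Black": [2.0, 1.0],
        "Two+ Races/White and American Indian and Alaska Native": [1.0, 0.5],
        "Two+ Races/White and Asian": [3.0, 2.0],
        "Two+ Races/Black and American Indian and Alaska Native": [0.5, 0.0],
    },
    index=["Place A", "Place B"],
)

grouped = pd.DataFrame(index=acs.index)
grouped["White"] = acs["One Race/White"]
# 'Black' absorbs the two multiracial categories that include Black
grouped["Black"] = (acs["One Race/Black"]
                    + acs["Two+ Races/White and Black"]
                    + acs["Two+ Races/Black and American Indian and Alaska Native"])
# 'Asian or Pacific Islander' absorbs 'White and Asian'
grouped["Asian or Pacific Islander"] = (acs["One Race/Asian"]
                                        + acs["One Race/Native Hawaiian and Other Pacific Islander"]
                                        + acs["Two+ Races/White and Asian"])
# 'Native American or Alaskan Native' absorbs 'White and American Indian and Alaska Native'
grouped["Native American or Alaskan Native"] = (acs["One Race/American Indian and Alaska Native"]
                                                + acs["Two+ Races/White and American Indian and Alaska Native"])
# everything not captured above is treated as 'Other'
grouped["Other"] = 100.0 - grouped.sum(axis=1)

print(grouped)
```

With this rule each row sums to 100%, so any population not assigned to one of the four named groups ends up in 'Other'.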

In [406]:

# load index of Washington state places
places = censusdata.geographies(censusdata.censusgeo([('state','53'),('place','*')]),'acs5',2019)

# extract place names for towns/cities in King County
acs_cities = pd.DataFrame(index=ofm.index).drop('Unincorporated King County')
place_keys = pd.Series(places.keys())
for city_name in acs_cities.index:
    place_name = place_keys[logical_and(place_keys.str.startswith(city_name),
                                        ~place_keys.str.contains('CDP'))].values[0]
    acs_cities.loc[city_name,'ACS place name'] = place_name

In [407]:

# preview available metrics
# censusdata.censustable('acs5',2019,'DP05')
# censusdata.printtable(censusdata.censustable('acs5',2019,'DP05'))

In [408]:

# metrics needed for demographic calculations 
metrics = {'DP05_0033E':'Total population',
           'DP05_0037PE':'Percent/One Race/White',
           'DP05_0038PE':'Percent/One Race/Black',
           'DP05_0039PE':'Percent/One Race/American Indian and Alaska Native',
           'DP05_0044PE':'Percent/One Race/Asian',
           'DP05_0052PE':'Percent/One Race/Native Hawaiian and Other Pacific Islander',
           'DP05_0059PE':'Percent/Two+ Races/White and Black',
           'DP05_0060PE':'Percent/Two+ Races/White and American Indian and Alaska Native',
           'DP05_0061PE':'Percent/Two+ Races/White and Asian',
           'DP05_0062PE':'Percent/Two+ Races/Black and American Indian and Alaska Native',
           'DP05_0071PE':'Percent/Hispanic or Latino (of any race)'}

# download ACS data for each town/city
for city in acs_cities.index:
    place_key = places[acs_cities.loc[city,'ACS place name']]
    place_data = censusdata.download('acs5/profile',2019,place_key,list(metrics.keys()))
    place_data.index = [city]
    for col_name in place_data.keys(): acs_cities.loc[city,col_name] = place_data[col_name].values

# replace metric tokens with descriptive names
acs_cities.rename(columns=metrics,inplace=True)

In [409]:

# aggregate mono-racial and multiracial categories
api_percent = acs_cities['Percent/One Race/Asian'] + acs_cities['Percent/One Race/Native Hawaiian and Other Pacific Islander'] \
              + acs_cities['Percent/Two+ Races/White and Asian']
black_percent = acs_cities['Percent/One Race/Black'] + acs_cities['Percent/Two+ Races/White and Black'] \
                + acs_cities['Percent/Two+ Races/Black and American Indian and Alaska Native']
natam_percent = acs_cities['Percent/One Race/American Indian and Alaska Native'] \
                + acs_cities['Percent/Two+ Races/White and American Indian and Alaska Native']
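white_percent = acs_cities['Percent/One Race/White']
# NOTE: DP05_0071PE counts Hispanic or Latino residents of any race (not only White),
#       so the Hispanic / non-Hispanic split of the White population below is approximate
white_lat_percent = acs_cities['Percent/Hispanic or Latino (of any race)']
white_nonlat_percent = white_percent - white_lat_percent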
other_percent = 100.0 - api_percent - black_percent - natam_percent - white_percent

# save top-level demographic statistics
acs_cities['White'] = white_percent
acs_cities['Black'] = black_percent
acs_cities['Asian or Pacific Islander'] = api_percent
acs_cities['Native American or Alaskan Native'] = natam_percent
acs_cities['Other'] = other_percent
acs_cities['White (Hispanic/Latino)'] = white_lat_percent
acs_cities['White (Non-Hispanic/Latino)'] = white_nonlat_percent

Compute helmet law enforcement statistics by city

In [478]:

# to store stats by city
cities = pd.DataFrame(index=ofm.index)
cities['Population (OFM, April 2021)'] = ofm['2021 Population Estimate']

# boolean mask for all helmet citations (including those few that have been issued to guardians of a minor)
helmet_mask = kc_citations['Law Description'].str.contains('Bicycle Helmet Required')

# years enforced (according to available records)
years_printed = kc_citations.loc[helmet_mask].groupby('City')['Violation Date'].agg(lambda x: ', '.join(str(yr) for yr in unique(x.dt.year)))
years_printed = years_printed.str.rstrip(',')
cities['Years with citation records'] = years_printed
cities['Years with citation records'] = cities['Years with citation records'].fillna('-')

# total citations in records
cities['Total helmet citations in records'] = kc_citations.loc[helmet_mask].groupby('City')['Case Number'].count()
cities['Total helmet citations in records'] = cities['Total helmet citations in records'].fillna(0).astype(int)

# total citations in recent years (2015-2019 only)
def total_recent_citations(dates):
    all_years = dates.dt.year
    recent_years = all_years[logical_and(all_years >= 2015,all_years <= 2019)]
    return len(recent_years)
recent_total = kc_citations.loc[helmet_mask].groupby('City')['Violation Date'].agg(lambda x: total_recent_citations(x))
cities['Recent helmet citations (2015-2019 only)'] = recent_total
cities['Recent helmet citations (2015-2019 only)'] = cities['Recent helmet citations (2015-2019 only)'].fillna(0).astype(int)

# average annual citations
# NOTE: average only includes years in which citations were issued in a given jurisdiction
num_years_enforced = kc_citations.loc[helmet_mask].groupby('City')['Violation Date'].agg(lambda x: len(unique(x.dt.year)))
cities['Average annual citations (over all years that citations were issued)'] = \
    round(cities['Total helmet citations in records'] / num_years_enforced,1)
cities['Average annual citations (over all years that citations were issued)'] = \
    cities['Average annual citations (over all years that citations were issued)'].fillna(0)

# average annual citations (2015-2019 only)
# NOTE: average assumes available data for all 5 years, and averages over 5 years regardless of whether citations were issued in a given year
def average_recent_years(dates):
    all_years = dates.dt.year
    recent_years = all_years[logical_and(all_years >= 2015,all_years <= 2019)]
    # num_recent_years = len(unique(recent_years))
    num_recent_years = 5
    if num_recent_years == 0: return 0
    else:                     return len(recent_years) / num_recent_years
recent_average = kc_citations.loc[helmet_mask].groupby('City')['Violation Date'].agg(lambda x: average_recent_years(x))
cities['Average annual citations (2015-2019 only)'] = round(recent_average,1)
cities['Average annual citations (2015-2019 only)'] = cities['Average annual citations (2015-2019 only)'].fillna(0)

# citation rate per capita
cities['Annual citation rate per 100k people (all years)'] = \
    round(100000 * cities['Average annual citations (over all years that citations were issued)'] / \
          cities['Population (OFM, April 2021)'],1)
cities['Annual citation rate per 100k people (2015-2019 only)'] = \
    round(100000 * cities['Average annual citations (2015-2019 only)'] / cities['Population (OFM, April 2021)'],1)

# which helmet law(s) (local and/or county) are cited?
def cited_county(law_codes):
    return round(100 * (sum(law_codes.str.contains('King County Health Code')) / len(law_codes)),1) 
cities['King County helmet law citations (%)'] = kc_citations.loc[helmet_mask].groupby('City')['Law Code'].agg(lambda x: cited_county(x))
cities['Municipal helmet law citations (%)'] = 100.0 - cities['King County helmet law citations (%)']
cities['King County helmet law citations (%)'] = cities['King County helmet law citations (%)'].fillna('-')
cities['Municipal helmet law citations (%)'] = cities['Municipal helmet law citations (%)'].fillna('-')

# racial demographics of citations by jurisdiction
def percent_race(all_race,abbrev):
    if abbrev == 'W':   return 100 * sum(all_race == 'White') / len(all_race)
    elif abbrev == 'B': return 100 * sum(all_race == 'Black') / len(all_race)
    elif abbrev == 'A': return 100 * sum(all_race == 'Asian or Pacific Islander') / len(all_race)
    elif abbrev == 'I': return 100 * sum(all_race == 'American Indian or Alaskan Native') / len(all_race)
    elif abbrev == 'U': return 100 * sum(all_race == 'Unknown/Other') / len(all_race)
cities['Citations - White (%)'] = kc_citations.loc[helmet_mask].groupby('City')['Defendant Race'].agg(lambda x: round(percent_race(x,'W'),1))
cities['Citations - Black (%)'] = kc_citations.loc[helmet_mask].groupby('City')['Defendant Race'].agg(lambda x: round(percent_race(x,'B'),1))
cities['Citations - Asian or Pacific Islander (%)'] = kc_citations.loc[helmet_mask].groupby('City')['Defendant Race'].agg(lambda x: round(percent_race(x,'A'),1))
cities['Citations - Native American or Alaskan Native (%)'] = kc_citations.loc[helmet_mask].groupby('City')['Defendant Race'].agg(lambda x: round(percent_race(x,'I'),1))
cities['Citations - Unknown/other (%)'] = kc_citations.loc[helmet_mask].groupby('City')['Defendant Race'].agg(lambda x: round(percent_race(x,'U'),1))
for col in ['Citations - White (%)','Citations - Black (%)','Citations - Asian or Pacific Islander (%)',
            'Citations - Native American or Alaskan Native (%)','Citations - Unknown/other (%)']:
    cities[col] = cities[col].fillna('-')

# add racial demographics from US Census (ACS 5-year survey, 2015-2019)
cities['Census (2015-2019) - White (%)'] = acs_cities['White']
cities['Census (2015-2019) - Black (%)'] = acs_cities['Black']
cities['Census (2015-2019) - Asian or Pacific Islander (%)'] = acs_cities['Asian or Pacific Islander']
cities['Census (2015-2019) - Native American or Alaskan Native (%)'] = acs_cities['Native American or Alaskan Native']
cities['Census (2015-2019) - Other (%)'] = acs_cities['Other']

# fines ordered, assessed, and paid for citations
kc_citations['AR Adjusted Amount'] = kc_citations['AR Adjusted Amount'].fillna(
    kc_citations['AR Ordered Amount'] + kc_citations['AR Adjustment Amount'])
paid_fraction = (kc_citations['AR Paid Amount'] / kc_citations['AR Adjusted Amount'])
kc_citations['AR Unpaid'] = (paid_fraction == 0.0)
non_dup_rows = ~kc_citations['Case Number'].duplicated(keep=False)
helmet_non_dup_mask = logical_and(helmet_mask,non_dup_rows)
helmet_ordered_grouped = kc_citations.loc[helmet_non_dup_mask].groupby('City')['AR Ordered Amount']
cities['Fines charged - number of data points'] \
    = helmet_ordered_grouped.count().astype(int)
cities['Fines charged - average ($)'] \
    = round(helmet_ordered_grouped.mean().dropna(),2)
cities['Fines charged - standard deviation ($)'] \
    = round(helmet_ordered_grouped.std().dropna(),2)
cities['Fines charged - median ($)'] \
    = round(helmet_ordered_grouped.median().dropna(),2)
cities['Fines charged - IQR25% ($)'] \
    = round(helmet_ordered_grouped.quantile(q=0.25).dropna(),2)
cities['Fines charged - IQR75% ($)'] \
    = round(helmet_ordered_grouped.quantile(q=0.75).dropna(),2)
cities['Fines charged - most common value (mode) ($)'] \
    = round(helmet_ordered_grouped.apply(pd.Series.mode).groupby('City').max(),2)
helmet_adjusted_grouped = kc_citations.loc[helmet_non_dup_mask].groupby('City')['AR Adjusted Amount']
cities['Fines assessed after adjustments - number of data points'] \
    = helmet_adjusted_grouped.count().astype(int)
cities['Fines assessed after adjustments - average ($)'] \
    = round(helmet_adjusted_grouped.mean().dropna(),2)
cities['Fines assessed after adjustments - standard deviation ($)'] \
    = round(helmet_adjusted_grouped.std().dropna(),2)
cities['Fines assessed after adjustments - median ($)'] \
    = round(helmet_adjusted_grouped.median().dropna(),2)
cities['Fines assessed after adjustments - IQR25% ($)'] \
    = round(helmet_adjusted_grouped.quantile(q=0.25).dropna(),2)
cities['Fines assessed after adjustments - IQR75% ($)'] \
    = round(helmet_adjusted_grouped.quantile(q=0.75).dropna(),2)
cities['Fines assessed after adjustments - most common value (mode) ($)'] \
    = round(helmet_adjusted_grouped.apply(pd.Series.mode).groupby('City').max(),2)
unpaid_grouped = kc_citations.loc[helmet_non_dup_mask].groupby('City')['AR Unpaid']
# NOTE: despite the column name, values are stored as percentages (0-100)
cities['Fraction of citations unpaid'] = round(100 * unpaid_grouped.sum() / helmet_adjusted_grouped.count(),1)
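
The two averaging conventions noted in the comments above (averaging only over years in which citations were issued, versus averaging over a fixed five-year 2015-2019 window) can give quite different numbers. Here is a minimal sketch of the difference; the violation dates are hypothetical and the variable names are chosen only for this example.

```python
import pandas as pd

# Hypothetical violation dates for one made-up jurisdiction (not real records)
dates = pd.Series(pd.to_datetime([
    "2012-05-01", "2016-03-10", "2016-07-04", "2018-09-15",
]))
years = dates.dt.year

# Convention 1: average over only the years in which at least one citation was issued
avg_enforced_years = len(years) / years.nunique()

# Convention 2: average over a fixed 2015-2019 window (5 years),
# counting years with zero citations toward the denominator
recent = years[(years >= 2015) & (years <= 2019)]
avg_fixed_window = len(recent) / 5

print(round(avg_enforced_years, 1), round(avg_fixed_window, 1))
```

The first convention ignores gaps in enforcement (or in record retention), so it tends to be higher for jurisdictions that issue citations only sporadically.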

In [479]:

# export compiled statistics by city
cities.to_excel(data_dir + '2021-10-18 - King County helmet citation statistics.xlsx')

In [480]:

display(cities)

	Population (OFM, April 2021)	Years with citation records	Total helmet citations in records	Recent helmet citations (2015-2019 only)	Average annual citations (over all years that citations were issued)	Average annual citations (2015-2019 only)	Annual citation rate per 100k people (all years)	Annual citation rate per 100k people (2015-2019 only)	King County helmet law citations (%)	Municipal helmet law citations (%)	Citations - White (%)	Citations - Black (%)	Citations - Asian or Pacific Islander (%)	Citations - Native American or Alaskan Native (%)	Citations - Unknown/other (%)	Census (2015-2019) - White (%)	Census (2015-2019) - Black (%)	Census (2015-2019) - Asian or Pacific Islander (%)	Census (2015-2019) - Native American or Alaskan Native (%)	Census (2015-2019) - Other (%)	Fines charged - number of data points	Fines charged - average ($)	Fines charged - standard deviation ($)	Fines charged - median ($)	Fines charged - IQR25% ($)	Fines charged - IQR75% ($)	Fines charged - most common value (mode) ($)	Fines assessed after adjustments - number of data points	Fines assessed after adjustments - average ($)	Fines assessed after adjustments - standard deviation ($)	Fines assessed after adjustments - median ($)	Fines assessed after adjustments - IQR25% ($)	Fines assessed after adjustments - IQR75% ($)	Fines assessed after adjustments - most common value (mode) ($)	Fraction of citations unpaid
Jurisdiction
Unincorporated King County	251220	2005, 2006, 2007, 2010, 2014, 2015, 2016, 2017...	40	26	4.4	5.2	1.8	2.1	97.5	2.5	85	12.5	0	0	2.5	NaN	NaN	NaN	NaN	NaN	0.0	NaN	NaN	NaN	NaN	NaN	NaN	0.0	NaN	NaN	NaN	NaN	NaN	NaN	NaN
Algona	3265	-	0	0	0.0	0.0	0.0	0.0	-	-	-	-	-	-	-	55.8	7.7	22.1	2.6	11.8	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
Auburn	73900	2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012...	244	146	14.4	29.2	19.5	39.5	0.8	99.2	74.2	13.5	4.9	2.9	4.5	62.3	8.0	16.0	3.6	10.1	0.0	NaN	NaN	NaN	NaN	NaN	NaN	0.0	NaN	NaN	NaN	NaN	NaN	NaN	NaN
Beaux Arts Village	300	-	0	0	0.0	0.0	0.0	0.0	-	-	-	-	-	-	-	92.6	0.0	6.4	0.0	1.0	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
Bellevue	149900	2005, 2006, 2007, 2008, 2012, 2013, 2015, 2016...	23	9	2.1	1.8	1.4	1.2	0	100	65.2	17.4	4.3	0	13	54.5	3.1	38.6	0.8	3.0	0.0	NaN	NaN	NaN	NaN	NaN	NaN	0.0	NaN	NaN	NaN	NaN	NaN	NaN	NaN
Black Diamond	5990	-	0	0	0.0	0.0	0.0	0.0	-	-	-	-	-	-	-	92.6	0.6	4.4	0.8	1.6	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
Bothell	30000	-	0	0	0.0	0.0	0.0	0.0	-	-	-	-	-	-	-	69.3	3.5	20.3	0.9	6.0	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
Burien	53290	2009, 2016, 2017, 2018	8	7	2.0	1.4	3.8	2.6	37.5	62.5	75	0	0	0	25	54.8	9.7	16.2	2.4	16.9	0.0	NaN	NaN	NaN	NaN	NaN	NaN	0.0	NaN	NaN	NaN	NaN	NaN	NaN	NaN
Carnation	2285	-	0	0	0.0	0.0	0.0	0.0	-	-	-	-	-	-	-	93.2	0.7	5.2	0.0	0.9	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
Clyde Hill	3055	-	0	0	0.0	0.0	0.0	0.0	-	-	-	-	-	-	-	67.1	1.1	29.8	0.7	1.3	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
Covington	20890	2011, 2012, 2015	6	2	2.0	0.4	9.6	1.9	100	0	33.3	33.3	0	0	33.3	71.1	7.6	15.3	0.9	5.1	0.0	NaN	NaN	NaN	NaN	NaN	NaN	0.0	NaN	NaN	NaN	NaN	NaN	NaN	NaN
Des Moines	32820	2005, 2011, 2012, 2013, 2018	6	1	1.2	0.2	3.7	0.6	16.7	83.3	50	33.3	0	0	16.7	56.9	8.5	16.9	2.5	15.2	3.0	171.67	21.94	179.0	163.0	184.0	189.0	3.0	63.00	109.12	0.0	0.00	94.50	0.0	33.3
Duvall	8090	-	0	0	0.0	0.0	0.0	0.0	-	-	-	-	-	-	-	87.7	1.7	6.1	0.0	4.5	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
Enumclaw	13030	2000, 2003, 2008	3	0	1.0	0.0	7.7	0.0	0	100	100	0	0	0	0	87.0	1.6	3.1	1.0	7.3	1.0	288.00	NaN	288.0	288.0	288.0	288.0	1.0	0.00	NaN	0.0	0.00	0.00	0.0	0.0
Federal Way	99590	2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016...	127	56	10.6	11.2	10.6	11.2	0	100	59.8	21.3	2.4	0.8	15.7	51.5	16.9	16.9	1.6	13.1	104.0	176.68	34.70	186.0	186.0	186.0	186.0	104.0	161.83	61.11	186.0	149.75	186.00	186.0	88.5
Hunts Point	425	-	0	0	0.0	0.0	0.0	0.0	-	-	-	-	-	-	-	84.1	0.0	15.4	0.5	0.0	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
Issaquah	39840	2009, 2011, 2014, 2017	5	1	1.2	0.2	3.0	0.5	0	100	100	0	0	0	0	65.7	2.3	26.3	0.5	5.2	5.0	45.00	11.18	50.0	50.0	50.0	50.0	5.0	30.00	27.39	50.0	0.00	50.00	50.0	60.0
Kenmore	23770	2007, 2009, 2010, 2011, 2012, 2013, 2014, 2015...	88	57	8.0	11.4	33.7	48.0	79.5	20.5	86.4	4.5	3.4	0	3.4	77.7	1.6	16.7	0.8	3.2	0.0	NaN	NaN	NaN	NaN	NaN	NaN	0.0	NaN	NaN	NaN	NaN	NaN	NaN	NaN
Kent	132400	2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010...	642	133	35.7	26.6	27.0	20.1	0	100	80.1	14.2	1.6	0.8	3.4	46.9	14.5	24.7	1.6	12.3	583.0	131.40	35.80	141.0	133.0	141.0	141.0	583.0	67.70	74.52	0.0	0.00	141.00	0.0	44.4
Kirkland	92110	-	0	0	0.0	0.0	0.0	0.0	-	-	-	-	-	-	-	75.7	2.0	17.2	1.0	4.1	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
Lake Forest Park	13370	2009, 2016, 2017, 2019, 2020	12	9	2.4	1.8	18.0	13.5	0	100	58.3	8.3	25	0	8.3	81.6	2.3	11.1	1.0	4.0	8.0	87.38	34.10	72.5	60.0	112.0	60.0	8.0	79.25	22.97	72.5	60.00	91.75	60.0	37.5
Maple Valley	27570	2004, 2006, 2007, 2010, 2012, 2013, 2014, 2015...	36	26	2.8	5.2	10.2	18.9	2.8	97.2	91.7	2.8	0	0	5.6	82.3	2.0	10.0	2.1	3.6	18.0	89.94	53.91	82.0	82.0	86.5	82.0	18.0	84.56	60.61	82.0	81.37	82.00	82.0	72.2
Medina	3335	-	0	0	0.0	0.0	0.0	0.0	-	-	-	-	-	-	-	70.0	1.1	26.3	0.9	1.7	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
Mercer Island	24990	-	0	0	0.0	0.0	0.0	0.0	-	-	-	-	-	-	-	74.2	1.3	22.6	0.3	1.6	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
Milton	1615	-	0	0	0.0	0.0	0.0	0.0	-	-	-	-	-	-	-	86.5	2.3	7.4	0.9	2.9	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
Newcastle	13410	2018, 2019	7	7	3.5	1.4	26.1	10.4	85.7	14.3	0	0	0	0	100	57.6	4.1	35.7	0.6	2.0	0.0	NaN	NaN	NaN	NaN	NaN	NaN	0.0	NaN	NaN	NaN	NaN	NaN	NaN	NaN
Normandy Park	6740	-	0	0	0.0	0.0	0.0	0.0	-	-	-	-	-	-	-	86.9	0.6	8.8	0.3	3.4	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
North Bend	7700	2006, 2010, 2011, 2012, 2016	6	1	1.2	0.2	15.6	2.6	100	0	100	0	0	0	0	84.6	1.5	4.6	0.0	9.3	0.0	NaN	NaN	NaN	NaN	NaN	NaN	0.0	NaN	NaN	NaN	NaN	NaN	NaN	NaN
Pacific	6960	2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017...	67	24	6.7	4.8	96.3	69.0	0	100	91	3	1.5	0	4.5	57.6	6.6	22.1	3.2	10.5	67.0	74.87	28.74	55.0	52.0	113.0	113.0	67.0	73.22	31.32	55.0	52.00	113.00	113.0	88.1
Redmond	71180	2018	1	1	1.0	0.2	1.4	0.3	100	0	100	0	0	0	0	55.0	2.7	38.4	0.6	3.3	0.0	NaN	NaN	NaN	NaN	NaN	NaN	0.0	NaN	NaN	NaN	NaN	NaN	NaN	NaN
Renton	106500	2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014...	79	21	6.1	4.2	5.7	3.9	0	100	79.7	6.3	0	0	13.9	50.1	11.0	27.2	1.6	10.1	67.0	75.34	42.20	103.0	25.0	104.0	25.0	67.0	69.58	49.76	103.0	25.00	104.00	103.0	73.1
Sammamish	66130	2016, 2021	2	1	1.0	0.2	1.5	0.3	100	0	50	0	0	0	50	62.0	1.9	34.4	0.5	1.2	0.0	NaN	NaN	NaN	NaN	NaN	NaN	0.0	NaN	NaN	NaN	NaN	NaN	NaN	NaN
SeaTac	29890	2016, 2017	3	3	1.5	0.6	5.0	2.0	0	100	0	0	0	0	100	38.2	26.8	20.6	1.7	12.7	0.0	NaN	NaN	NaN	NaN	NaN	NaN	0.0	NaN	NaN	NaN	NaN	NaN	NaN	NaN
Seattle	769500	2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010...	1694	215	94.1	43.0	12.2	5.6	99.9	0.1	72.8	17.4	1.6	1	7.1	67.3	8.6	18.9	1.3	3.9	176.0	103.14	18.33	102.0	102.0	102.0	102.0	176.0	88.18	38.14	102.0	102.00	102.00	102.0	76.7
Shoreline	57860	2005, 2006, 2011, 2012, 2013, 2014, 2018, 2020	11	1	1.4	0.2	2.4	0.3	90.9	9.1	81.8	9.1	0	0	9.1	69.1	7.2	18.7	1.6	3.4	0.0	NaN	NaN	NaN	NaN	NaN	NaN	0.0	NaN	NaN	NaN	NaN	NaN	NaN	NaN
Skykomish	210	-	0	0	0.0	0.0	0.0	0.0	-	-	-	-	-	-	-	89.3	10.7	0.0	0.0	0.0	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
Snoqualmie	14370	2020	1	0	1.0	0.0	7.0	0.0	0	100	100	0	0	0	0	82.3	1.5	13.6	1.0	1.6	1.0	87.00	NaN	87.0	87.0	87.0	87.0	1.0	87.00	NaN	87.0	87.00	87.00	87.0	0.0
Tukwila	21970	-	0	0	0.0	0.0	0.0	0.0	-	-	-	-	-	-	-	36.1	19.9	29.1	2.0	12.9	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
Woodinville	12800	2010, 2011, 2015	3	1	1.0	0.2	7.8	1.6	100	0	100	0	0	0	0	82.5	1.6	12.4	1.4	2.1	0.0	NaN	NaN	NaN	NaN	NaN	NaN	0.0	NaN	NaN	NaN	NaN	NaN	NaN	NaN
Yarrow Point	1030	-	0	0	0.0	0.0	0.0	0.0	-	-	-	-	-	-	-	76.3	0.0	23.0	0.0	0.7	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
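
As a quick check on the per-capita columns, the rate per 100k people can be recomputed by hand from the average annual citations and the OFM population shown in the table above. The helper name `rate_per_100k` is introduced here only for illustration.

```python
def rate_per_100k(avg_annual_citations, population):
    """Annual citations per 100,000 residents, rounded to one decimal place."""
    return round(100000 * avg_annual_citations / population, 1)

# Values copied from the Seattle and Kent rows of the table above
seattle_all_years = rate_per_100k(94.1, 769500)
seattle_recent = rate_per_100k(43.0, 769500)
kent_all_years = rate_per_100k(35.7, 132400)

print(seattle_all_years, seattle_recent, kent_all_years)
```

These reproduce the tabulated values of 12.2, 5.6, and 27.0, respectively. Note that the same April 2021 population is used as the denominator for every year, so rates for earlier years are only approximate.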

Summary plots

In [481]:

# recent average annual citations by city
city_contributions = cities['Average annual citations (2015-2019 only)'].sort_values().copy()
city_contributions = city_contributions[city_contributions > 0]
labels = [name if 100 * city_contributions[name] / city_contributions.sum() > 0.8 else '' for name in city_contributions.index]
def autopct(percent): return ('%.1f%%' % percent) if percent > 0.8 else ''

plt.figure(figsize=(7,5),facecolor='w')
with plt.style.context({"axes.prop_cycle":plt.cycler('color',roll(plt.cm.tab20c.colors[2:],4,axis=0)**0.8)}):
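    # use distinct names for the returned text objects so they do not shadow the `labels` list and `autopct` function
    wedges, label_texts, pct_texts = plt.pie(city_contributions,explode=tile(0.02,len(city_contributions)),
                                             labels=labels,labeldistance=1.10,autopct=autopct,pctdistance=0.75)
for lab in label_texts: lab.set_fontsize(9)
for pct in pct_texts: pct.set_fontsize(7)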
plt.title('Contributions to total King County helmet citations\nby jurisdiction (2015-2019)');
plt.savefig(current_results_dir + 'king_co_helmet_citations_by_jurisdiction.pdf')

In [482]:

# helmet vs. non-helmet citations in King County outside of Seattle
kc_subset = kc_citations[logical_and(kc_citations['City'] != 'Seattle',kc_citations['Violation Date'].dt.year >= 2003)]
# use a separate name so the county-wide helmet_mask from the statistics cell is not overwritten
subset_helmet_mask = kc_subset['Law Description'].str.contains('Bicycle Helmet Required')
helmet_percent = 100 * sum(subset_helmet_mask) / len(subset_helmet_mask)
nonhelmet_percent = 100 - helmet_percent
def autopct(percent): return ('%.1f%%' % percent)

plt.figure(figsize=(7,5),facecolor='w')
wedges, label_texts, pct_texts = plt.pie([helmet_percent,nonhelmet_percent],explode=tile(0.01,2),startangle=45,
                                         labels=['Helmet\ncitations','Non-helmet\ncitations'],
                                         colors=['steelblue','navy'],
                                         labeldistance=1.10,autopct=autopct,pctdistance=0.75)
pct_texts[1].set_color('w')
for lab in label_texts: lab.set_fontsize(11)
for pct in pct_texts: pct.set_fontsize(10)
plt.title('Citations issued to bicyclists by law enforcement\nin King County excluding Seattle (2003-present)');
plt.text(0.5,0.84,'$n$ = {0} citations'.format(len(subset_helmet_mask)),fontsize=10,
         horizontalalignment='center',transform=plt.gcf().transFigure);
plt.savefig(current_results_dir + 'king_co_bike_citations_type.pdf')

In [483]:

# years of available data
plt.figure(figsize=(7,5),facecolor='w')
plt.hist(kc_citations['Violation Date'].dt.year,bins=arange(1999.5,2022.5),color='k',rwidth=0.85,zorder=1,
         label='All bike-related citations')
plt.hist(kc_citations.loc[kc_citations['Law Description'].str.contains('Bicycle Helmet Required'),'Violation Date'].dt.year,
         bins=arange(1999.5,2022.5),color='0.7',rwidth=0.65,zorder=2,label='Helmet citations only')
plt.legend(loc='upper left',frameon=False)
plt.text(0.5,0.9,'Available bike citation records for King County (including Seattle)',size=14,
         horizontalalignment='center',transform=plt.gcf().transFigure)
plt.title('(Note: data availability is affected by both citation rates and data retention practices)',
          style='italic',size=8,y=1.025)
plt.ylabel('Count');
plt.tight_layout(rect=[0,0.03,1,0.90])
plt.savefig(current_results_dir + 'king_co_bike_citations_by_year.pdf')

In [484]:

# per-capita helmet citation rates by jurisdiction
cities_subset = cities[['Total helmet citations in records',
                        'Annual citation rate per 100k people (all years)',
                        'Annual citation rate per 100k people (2015-2019 only)']].copy()
cities_without_enforcement = cities_subset[cities_subset['Total helmet citations in records'] == 0].index.values
cities_subset = cities_subset[cities_subset['Total helmet citations in records'] != 0]
cities_subset.sort_values(by='Annual citation rate per 100k people (all years)',inplace=True)

plt.figure(figsize=(11,7.5),facecolor='w')
plt.scatter(cities_subset['Annual citation rate per 100k people (all years)'],range(len(cities_subset)),
            marker='o',s=75,c='k',zorder=3,alpha=0.8,label='All years with citation records')
plt.scatter(cities_subset['Annual citation rate per 100k people (2015-2019 only)'],range(len(cities_subset)),
            marker='x',s=50,c='k',zorder=3,alpha=0.6,label='2015-2019 only')
plt.legend(bbox_to_anchor=(1.0,0.88),loc='upper right',frameon=True,labelspacing=1.0)
plt.yticks(ticks=range(len(cities_subset)),
           labels=[city + ' ($n$ = {0})'.format(cities_subset.loc[city,'Total helmet citations in records']) \
                   for city in cities_subset.index])
plt.xlim([-5,None])
plt.ylim([-0.5,len(cities_subset)-0.5])
plt.grid(alpha=0.5,zorder=1)
plt.gca().spines['left'].set_position(('outward',5))
plt.gca().spines['top'].set_position(('outward',5))
plt.gca().spines['left'].set_visible(False)
plt.gca().spines['bottom'].set_visible(False)
plt.gca().spines['right'].set_visible(False)
plt.gca().tick_params(axis='y',length=0)
plt.gca().xaxis.set_ticks_position("top")
plt.gca().xaxis.set_label_position("top")
plt.xticks(ticks=arange(0,100,10))
plt.xlabel('Annual citation rate per 100k people (2021 WA OFM population estimates)',labelpad=10)
plt.title('Per capita rates of helmet tickets issued in King County jurisdictions             ',size=14,y=1.10)
cities_without_enforcement[2] = '\n' + cities_without_enforcement[2]
cities_without_enforcement[10] = '\n' + cities_without_enforcement[10]
plt.text(0.4,0.085,'$^{*}$ ' + 'Cities/towns with no records of bicycle helmet enforcement: {0}.'\
         .format(', '.join(cities_without_enforcement).rstrip(',')),
         size=9,transform=plt.gcf().transFigure,style='italic')
plt.tight_layout()
plt.savefig(current_results_dir + 'king_co_helmet_citation_rates_by_jurisdiction.pdf')

In [485]:

# helmet citation disparities for Black defendants
cities_subset = cities[['Total helmet citations in records',
                        'Citations - White (%)','Census (2015-2019) - White (%)',
                        'Citations - Black (%)','Census (2015-2019) - Black (%)']].copy().drop('Unincorporated King County')
cities_subset = cities_subset[cities_subset['Total helmet citations in records'] != 0]
cities_subset['Disparity in citations issued to Black cyclists'] \
    = (cities_subset['Citations - Black (%)'] / cities_subset['Census (2015-2019) - Black (%)'])
disparities = cities_subset['Disparity in citations issued to Black cyclists']\
    [cities_subset['Total helmet citations in records'] > 10]
disparities = disparities.sort_values()

plt.figure(figsize=(7.5,6.5),facecolor='w')
plt.scatter(cities_subset['Census (2015-2019) - Black (%)'][disparities.index],range(len(disparities)),
            marker='o',s=75,c='k',zorder=3,label='Percent of local population\n(U.S. Census ACS, 2015-2019)')
plt.scatter(cities_subset['Citations - Black (%)'][disparities.index],range(len(disparities)),
            marker='d',edgecolors='k',s=75,zorder=3,
            c=disparities,cmap='RdBu_r',norm=mcolors.LogNorm(vmin=1.35e-1,vmax=0.7e1))
plt.scatter(NaN,NaN,marker='d',c='k',edgecolors='k',s=75,label='Percent of helmet citations')
for c_idx, city in enumerate(disparities.index):
    if cities_subset.loc[city,'Census (2015-2019) - Black (%)'] > cities_subset.loc[city,'Citations - Black (%)']:
        d = 0.1
    else:
        d = -0.1
    plt.annotate(text=None,xy=(cities_subset.loc[city,'Census (2015-2019) - Black (%)'],c_idx),
                 xytext=(cities_subset.loc[city,'Citations - Black (%)']+d,c_idx),
                 arrowprops=dict(arrowstyle='<-, head_width=0.3',linewidth=1.5))
    plt.text(0.5*(cities_subset.loc[city,'Census (2015-2019) - Black (%)'] + 
                  cities_subset.loc[city,'Citations - Black (%)']),c_idx + 0.23,
             '{0:.1f}x '.format(disparities.loc[city]),weight='bold',horizontalalignment='center')
plt.legend(bbox_to_anchor=(0.05,1.20),loc='upper left',ncol=2,handletextpad=0.4,frameon=False)
plt.yticks(ticks=range(len(disparities)),
           labels=[city + '\n($n$ = {0})'.format(cities_subset.loc[city,'Total helmet citations in records']) \
                   for city in disparities.index])
plt.xlim([0,None])
plt.ylim([-0.5,len(disparities)-0.25])
plt.grid(alpha=0.5,zorder=1)
plt.gca().spines['left'].set_position(('outward',5))
plt.gca().spines['top'].set_position(('outward',5))
plt.gca().spines['left'].set_visible(False)
plt.gca().spines['bottom'].set_visible(False)
plt.gca().spines['right'].set_visible(False)
plt.gca().tick_params(axis='y',length=0)
plt.gca().xaxis.set_ticks_position("top")
plt.gca().xaxis.set_label_position("top")
plt.xticks(ticks=arange(0,25,5))
plt.gca().xaxis.set_major_formatter(mtick.PercentFormatter(decimals=0))
# plt.colorbar()
plt.title('Disparities in helmet citations issued to Black bicyclists         ',size=14,y=1.2)
plt.text(0.09,0.055,'$^{*}$ Note: only cities/towns with more than 10 helmet citation records are shown.',
         size=9,transform=plt.gcf().transFigure,style='italic');
plt.tight_layout()
plt.savefig(current_results_dir + 'king_co_helmet_citation_disparities.pdf')
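
Note: the disparity plotted above is a group's share of helmet citations divided by its share of the local population, so a value above 1 indicates over-representation among citations. The following is a minimal sketch of that calculation only; the city names, percentages, and citation counts are hypothetical and are not values from the records.

```python
import pandas as pd

# Hypothetical shares (in percent) and counts; NOT taken from the citation records
example = pd.DataFrame(
    {'Citations - Black (%)': [12.0, 3.0, 50.0],
     'Census - Black (%)': [6.0, 4.0, 5.0],
     'Total helmet citations': [40, 25, 2]},
    index=['City A', 'City B', 'City C'])

# ratio > 1: group is over-represented among citations relative to its population share
example['Disparity'] = example['Citations - Black (%)'] / example['Census - Black (%)']

# drop cities with too few records for a stable ratio, then sort for plotting
enough_records = example['Total helmet citations'] > 10
disparity_sorted = example.loc[enough_records, 'Disparity'].sort_values()
print(disparity_sorted)
```

City C is excluded here because a ratio built from two citations would be dominated by noise, which is the reason for the record-count threshold in the figure.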

In [496]:

# fines ordered, assessed, and paid for citations
fig = plt.figure(figsize=(7.5,6.5),facecolor='w')
cities_with_data = (cities['Fines charged - number of data points'] >= 5)
cities_with_data = list(cities_with_data[cities_with_data].index)
helmet_mask = kc_citations['Law Description'].str.contains('Bicycle Helmet Required')
non_dup_rows = ~kc_citations['Case Number'].duplicated(keep=False)
helmet_non_dup_mask = logical_and(helmet_mask,non_dup_rows)
helmet_ordered_grouped = kc_citations.loc[helmet_non_dup_mask].groupby('City')['AR Ordered Amount']
helmet_adjusted_grouped = kc_citations.loc[helmet_non_dup_mask].groupby('City')['AR Adjusted Amount']

gs = fig.add_gridspec(3,1)
ax1 = fig.add_subplot(gs[0:2])
x_off = 0.15
labels = []
def add_label(violin,label):
    color = violin['bodies'][0].get_facecolor().flatten()
    labels.append((mpatches.Patch(color=color),label))
for c_idx, city in enumerate(cities_with_data):
    med = cities.loc[city,'Fines charged - median ($)']
    vp1 = plt.gca().violinplot(helmet_ordered_grouped.get_group(city).dropna(),positions=[c_idx-x_off],
                               vert=True,widths=0.25,showmeans=True,showextrema=False,
                               showmedians=False,quantiles=None)
    for vp_element in vp1['bodies']:
        vp_element.set_color('darkolivegreen')
        vp_element.set_alpha(0.5)
        vp_element.set_zorder(2)
    vp1['cmeans'].set_color('darkolivegreen')
    vp1['cmeans'].set_zorder(3)
    if c_idx == 0: add_label(vp1,'Fines charged')
    charged_mean = cities.loc[city,'Fines charged - average ($)']
    plt.text(c_idx-x_off*1.5,charged_mean,f'${int(round(charged_mean,0))}',fontsize=7,
             horizontalalignment='right',verticalalignment='center')
        
    vp2 = plt.gca().violinplot(helmet_adjusted_grouped.get_group(city).dropna(),positions=[c_idx+x_off],
                               vert=True,widths=0.25,showmeans=True,showextrema=False,
                               showmedians=False,quantiles=None)
    for vp_element in vp2['bodies']:
        vp_element.set_color('steelblue')
        vp_element.set_alpha(0.5)
        vp_element.set_zorder(2)
    vp2['cmeans'].set_color('steelblue')
    vp2['cmeans'].set_zorder(3)
    if c_idx == 0: add_label(vp2,'Fines assessed after adjustments')
    assessed_mean = cities.loc[city,'Fines assessed after adjustments - average ($)']
    plt.text(c_idx+x_off*1.5,assessed_mean,f'${int(round(assessed_mean,0))}',fontsize=7,
             horizontalalignment='left',verticalalignment='center',zorder=3)
# build one x-tick label per city and pin the labels to explicit tick positions
xticklabels = ['\n{0}\n($n$ = {1})'.format(city,int(cities.loc[city,'Fines charged - number of data points'])) \
               for city in cities_with_data]
for l_idx, label in enumerate(xticklabels):
    if 'Seattle' in label:
        xticklabels[l_idx] = xticklabels[l_idx].replace('Seattle','Seattle**')
    elif 'Lake Forest Park' in label:
        xticklabels[l_idx] = xticklabels[l_idx].replace('Lake Forest Park\n(','Lake Forest\nPark\n(')
plt.gca().set_xticks(range(len(cities_with_data)))
plt.gca().set_xticklabels(xticklabels)

plt.xticks(fontsize=8)
plt.yticks(fontsize=8)
labels.append((plt.plot(NaN,NaN,c='k')[0],'Average value'))
plt.legend(*zip(*labels),ncol=3,frameon=False,fontsize=8,
           bbox_to_anchor=(0,1.1),loc='upper left')
plt.ylim([0,260])
plt.xlim([-0.6,len(cities_with_data)-1+0.6])
plt.grid(alpha=0.5,lw=0.5,zorder=1)
plt.gca().yaxis.set_major_formatter('${x:1.0f}')
plt.ylabel('Helmet violation fine amount')
plt.text(0.53,0.97,'Helmet law fines charged/assessed and fraction unpaid',size=14,
         horizontalalignment='center',transform=plt.gcf().transFigure)
plt.title('(Note: fines charged include base violation and court fees, including late penalties; adjustments\n' \
          + 'may include court-issued reductions or waivers, such as for proof of helmet purchase)',
          style='italic',size=8,y=1.10)

ax2 = fig.add_subplot(gs[2])
plt.bar(x=range(len(cities_with_data)),
        height=cities['Fraction of citations unpaid'][cities_with_data],
        width=0.25,color='0.5',edgecolor='0.3',alpha=0.6,zorder=3,
        label='Percent of helmet citations unpaid')
for c_idx, city in enumerate(cities_with_data):
    frac_unpaid = cities['Fraction of citations unpaid'][city]
    plt.text(c_idx,frac_unpaid,f'{int(round(frac_unpaid,0))}%',fontsize=7,
             horizontalalignment='center',verticalalignment='bottom',zorder=3)
plt.xlim([-0.6,len(cities_with_data)-1+0.6])
plt.ylim([0,100])
plt.gca().yaxis.set_major_formatter(mtick.PercentFormatter())
plt.tick_params(axis='x',which='both',top=True,bottom=False,labelbottom=False)
plt.tick_params(axis='both',labelsize=8)
plt.grid(alpha=0.5,lw=0.5,zorder=1)
plt.legend(fontsize=8,frameon=False,bbox_to_anchor=(0,-0.02),loc='upper left')
plt.ylabel('Citations unpaid')

# note: '\\$' yields a literal backslash-dollar, which Matplotlib renders as a plain dollar sign
note_text = ('$^{*}$ Note: only cities/towns with at least 5 helmet citation financial records are shown, and only helmet citations issued as the sole\n'
             '            violation are included due to the inability to disaggregate multiple charges issued together. Instances of \\$0 in fines\n'
             '            assessed after adjustments are considered paid citations in the bottom panel.\n'
             '$^{**}$ Fine records for Seattle shown here do not include late/default penalties, commonly \\$52 when a defendant fails to respond\n'
             '            within 19 days of receiving a citation.')
plt.text(0.10,0.004,note_text,
         size=7,transform=plt.gcf().transFigure,style='italic');

plt.tight_layout()
plt.savefig(current_results_dir + 'king_co_helmet_citation_fines.pdf')

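
Note: the fines figure only uses helmet citations issued as the sole violation, selected by discarding every case number that appears more than once. The following is a minimal sketch of that filter; the case numbers, second law description, and amounts are invented for illustration and do not come from the records.

```python
import pandas as pd

# Hypothetical records; case numbers, amounts, and the non-helmet law are invented
records = pd.DataFrame({
    'Case Number': ['A1', 'A2', 'A2', 'A3'],
    'Law Description': ['Bicycle Helmet Required', 'Bicycle Helmet Required',
                        'Example Other Violation', 'Bicycle Helmet Required'],
    'AR Ordered Amount': [103.0, 154.0, 154.0, 30.0]})

is_helmet = records['Law Description'].str.contains('Bicycle Helmet Required')

# keep=False flags EVERY row of a repeated case number, not only the later repeats,
# so a helmet charge bundled with another charge is excluded entirely
is_sole_violation = ~records['Case Number'].duplicated(keep=False)

sole_helmet = records[is_helmet & is_sole_violation]
print(sole_helmet['Case Number'].tolist())
```

With the default `keep='first'`, the helmet row of case A2 would be kept even though its fine amount cannot be separated from the other charge.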

In [111]:

# helmet citation records per year in Seattle
seattle_helmet_mask = logical_and(kc_citations['City'] == 'Seattle', kc_citations['Law Description'] == 'Bicycle Helmet Required')
seattle_helmet_citations = kc_citations[seattle_helmet_mask]
seattle_helmet_citations_per_year = seattle_helmet_citations.groupby(seattle_helmet_citations['Violation Date'].dt.year).count().max(axis=1)

# Seattle cyclist demographics, from notebook 'csg_bike_infraction_analysis.ipynb'
#     stats copied from displayed chart: 'Seattle cyclist demographics (frequency-weighted using SDOT/EMC results) inferred from PRR survey'
percent_white_nonhispanic_cyclists = 70.726539
percent_white_hispanic_cyclists = 5.166718
percent_white_cyclists = percent_white_nonhispanic_cyclists + percent_white_hispanic_cyclists
percent_black_cyclists = 4.714178

# helmet citation racial disparities over time in Seattle
def disparity_function(input_dataframe):
    citation_count_total = input_dataframe['Defendant Race'].value_counts().sum()
    percent_white_citations = 100 * sum(input_dataframe['Defendant Race'] == 'White') / citation_count_total
    percent_black_citations = 100 * sum(input_dataframe['Defendant Race'] == 'Black') / citation_count_total
    # percent_white_census = cities.loc['Seattle']['Census (2015-2019) - White (%)']
    # percent_black_census = cities.loc['Seattle']['Census (2015-2019) - Black (%)']
    black_white_disparity = (percent_black_citations / percent_black_cyclists) / (percent_white_citations / percent_white_cyclists)
    return black_white_disparity
bw_disparity_by_year = seattle_helmet_citations.groupby(seattle_helmet_citations['Violation Date'].dt.year).apply(disparity_function)
average_bw_disparity = disparity_function(seattle_helmet_citations)

# plot
plt.figure(figsize=(7.5,3.5),facecolor='w')
plt.plot(bw_disparity_by_year,c='k',zorder=2)
plt.scatter(bw_disparity_by_year.index,bw_disparity_by_year.values,s=seattle_helmet_citations_per_year.values,c='k',zorder=2)
# legend marker areas match the plotted scaling above (s = number of tickets per year)
leg1 = plt.scatter(NaN,NaN,s=10,c='k',label='10 tickets')
leg2 = plt.scatter(NaN,NaN,s=50,c='k',label='50 tickets')
leg3 = plt.scatter(NaN,NaN,s=200,c='k',label='200 tickets per year')
plt.plot([2002,2021],[average_bw_disparity,average_bw_disparity],c='b',ls='--',zorder=1)
plt.text(2002.5,2.75,'Average ({0:.1f}x disparity)'.format(average_bw_disparity),color='b')
plt.xticks(arange(2003,2021,2))
plt.yticks(arange(0,16,3))
plt.xlim([2002,2021])
plt.xlabel('Year')
plt.ylabel('Rate of citations to Black cyclists\ncompared to white cyclists')
plt.title('Black-white disparities in Seattle helmet citation records')
plt.grid(lw=0.5,alpha=0.5,zorder=1)
plt.legend(loc='upper right',ncol=3,frameon=False)
plt.tight_layout()
plt.savefig(current_results_dir + 'seattle_helmet_citation_disparities_by_year.pdf')

# compare 2008-2010 with 2018-2020 average disparities (weighted by # tickets in records)
def weighted_average_disparity(first_year,last_year):
    # label-based .loc slices include both endpoint years
    weights = seattle_helmet_citations_per_year.loc[first_year:last_year]
    return (bw_disparity_by_year.loc[first_year:last_year] * weights).sum() / weights.sum()
average_0 = weighted_average_disparity(2008,2010)
print('2008-2010 average disparity, weighted by # annual recorded tickets: {0:.2f}x'.format(average_0))
average_1 = weighted_average_disparity(2018,2020)
print('2018-2020 average disparity, weighted by # annual recorded tickets: {0:.2f}x'.format(average_1))

2008-2010 average disparity, weighted by # annual recorded tickets: 3.52x
2018-2020 average disparity, weighted by # annual recorded tickets: 3.95x
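
Note: the Black-white disparity above divides each group's share of citations by its share of cyclists and then takes the ratio of the two, and the multi-year figures are averages weighted by annual ticket counts. The following is a minimal sketch of both steps; the function name `relative_citation_rate`, the ridership percentages, and all counts are hypothetical and are not the values used in this notebook.

```python
import numpy as np
import pandas as pd

def relative_citation_rate(race, pct_black_riders, pct_white_riders):
    """Ratio of (Black citation share / Black rider share) to the same quantity for white riders."""
    n_known = race.notna().sum()                       # ignore records with no race recorded
    pct_black = 100 * (race == 'Black').sum() / n_known
    pct_white = 100 * (race == 'White').sum() / n_known
    return (pct_black / pct_black_riders) / (pct_white / pct_white_riders)

# Hypothetical citations: 20 Black, 70 white, 10 other; hypothetical riders: 5% Black, 75% white
race = pd.Series(['Black'] * 20 + ['White'] * 70 + ['Other'] * 10)
ratio = relative_citation_rate(race, pct_black_riders=5.0, pct_white_riders=75.0)
print('{0:.2f}x'.format(ratio))                        # (20/5) / (70/75)

# Ticket-weighted average of yearly ratios (hypothetical years, ratios, and counts)
yearly_ratio = pd.Series([3.0, 5.0], index=[2019, 2020])
yearly_count = pd.Series([90, 10], index=[2019, 2020])
weighted = np.average(yearly_ratio, weights=yearly_count)
print('{0:.2f}x'.format(weighted))                     # (3*90 + 5*10) / 100
```

Weighting by ticket counts keeps a year with very few records, whose ratio is noisy, from dominating the multi-year average.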

Analysis of King County bike-related citation records