The Biden-Harris Administration was the first admin to publish White House visitor logs from their first year in office. It contains data on every public visitor to the Executive Branch over the year 2021. This exciting dataset can help us get a pulse on the nation's concerns.For example, one of President Biden's promises was to prioritize climate change. He placed National Climate Advisors throughout the Executive Branch. Will those advisors prove to be one of the nation's most frequently visited people? This dataset will answer that question as well as provide other interesting details, such as whether the President, First Lady, or the Vice President received more visitors during their first yeaar in office.
Are you someone who might like to plan a future visit to the Capitol? Precise time logs in this dataset allow us to determine the route which provides the shortest wait times between booking an appointment date and getting that appointment. We can also see which building is associated with the shortest wait times and which received the most total visitors.
This notebook uses a merged file of ten csv files downloaded from https://www.whitehouse.gov/disclosures/visitor-logs/. The original data files on the White House website are arranged by month and give the total number of visitor entries per month. Can you guess which month received the most entries? That was July, 2021, which had 5,617 visitor log entries! Please also note that as of February, 2022, not all of the data from 2021 is available. We have data from January to October, 2021. This notebook will be updated when more data from 2021 becomes publicly available, and this last sentence will be changed to reflect that.
Now let's dig into this dataset! First we will import the csv file. CSV is short for "comma separated values" and represents how the data is stored. It's a little chunk of code to read in the file:
from csv import reader
opened_file = open('combined.csv')
read_file = reader(opened_file)
whdf = list(read_file)
import pandas as pd
import numpy as np
whdf = pd.read_csv('combined.csv')
from csv import *
whdf.dtypes
whdf.shape
(21334, 28)
This shows that our dataframe has 21,334 rows and 28 columns. Now let's look at what those column headers are. Below we will see the column header on the left side and the first entry of data down the right side.
columns = whdf.iloc[0,:]
print(columns)
NAMELAST AIYER NAMEFIRST KRISHNAN NAMEMID V UIN U20190 BDGNBR 135620 ACCESS_TYPE VA TOA 1/26/2021 19:58 POA B0401 TOD NaN POD NaN APPT_MADE_DATE 1/26/2021 19:51 APPT_START_DATE 1/26/2021 20:00 APPT_END_DATE 1/26/2021 23:59 APPT_CANCEL_DATE NaN TOTAL_PEOPLE 1 LAST_UPDATEDBY AL POST WIN LASTENTRYDATE 1/26/2021 19:51 TERMINAL_SUFFIX AL VISITEE_NAMELAST McCarthy VISITEE_NAMEFIRST John MEETING_LOC WH MEETING_ROOM EW 121 CALLER_NAME_LAST LIN CALLER_NAME_FIRST AUSTIN CALLER_ROOM NaN DESCRIPTION NaN RELEASEDATE 7-May-21 Name: 0, dtype: object
Here is a brief description of the column headers which are relevant to this data exploration:
meeting_location = whdf.iloc[:,21]
print(meeting_location.unique())
['WH' 'NEOB' 'EEOB' 'MEETING_LOC' 'OEOB' 'VPR' nan]
This little code snippet above referenced the MEETING_LOC column header by index number 21 and then extracted all of the unique values from that column. As you can see from the result, there are 7 unique values, include nan (Not A Number) which indicates 'no value' or a blank/missing field in Python Pandas library. It also includes the header name once. A little snooping on Google provides an explanation for the other abbreviations:
The Old Executive Office Building and the Eisenhower Executive Office Building are actually two names for the same building. Looking in Google maps, here are the relative locations for the White House, the Eisenhower Exeuctive Office Building, and the New Executive Office Building:
Because the Eisenhower Executive Office Building is located closer to the White House than the New Executive Office Building, I predict that there will be a longer wait time to get an appointment because of added security. This is just a guess. Running a bit of code next will tell us whether I am right or not. We can determine an aggregate turnaround time for each building by subtracting the time between the APPT_MADE_DATE column and the APPT_START_DATE column. The truncated table below shows the portion of the table which shows the relevant columns.
import datetime as dt
time_per_building = whdf.iloc[:,[10,11,14,21]]
print(time_per_building)
APPT_MADE_DATE APPT_START_DATE TOTAL_PEOPLE MEETING_LOC 0 1/26/2021 19:51 1/26/2021 20:00 1 WH 1 1/26/2021 9:04 1/26/2021 11:00 6 WH 2 1/26/2021 9:04 1/26/2021 11:00 6 WH 3 1/13/2021 9:11 1/25/2021 10:00 1 NEOB 4 1/22/2021 8:56 1/25/2021 11:00 1 NEOB ... ... ... ... ... 21329 10/5/21 11:37 10/5/21 0:00 3 OEOB 21330 9/30/21 3:52 10/12/21 0:00 2 OEOB 21331 10/25/21 9:02 10/26/21 0:00 2 OEOB 21332 10/6/21 10:00 10/6/21 0:00 2 OEOB 21333 10/19/21 11:24 10/19/21 0:00 3 OEOB [21334 rows x 4 columns]
import datetime as datetime
import re
time_per_building = whdf.iloc[:,[10,11,14,21]]
whdf['duration'] = [0] * whdf.count()
def sanitizeDate(dateString):
if len(dateString) < 10:
dateString = dateString + " 00:00"
lastIndexOfslash = dateString.rfind('/')+1
lastIndexOfspace = dateString.rfind(' ')
yearValue = dateString[lastIndexOfslash:lastIndexOfspace]
if len(yearValue) == 2:
return datetime.datetime.strptime(dateString, '%m/%d/%y %H:%M')
else:
return datetime.datetime.strptime(dateString, '%m/%d/%Y %H:%M')
#def time_duration():
datetime_obj2 = datetime_obj = None
for row in time_per_building.index:
datetime_string = str(time_per_building["APPT_MADE_DATE"][row])
isInRightFormat = re.match('\d+/\d+/\d+', datetime_string)
if isInRightFormat is not None:
datetime_obj = sanitizeDate(datetime_string)
datetime_string2= str(time_per_building["APPT_START_DATE"][row])
isInRightFormat = re.match('\d+/\d+/\d+', datetime_string2)
if isInRightFormat is not None:
datetime_obj2 = sanitizeDate(datetime_string2)
diff = datetime_obj2 - datetime_obj
whdf['duration'][row] = abs(diff)
print( whdf['duration'])
/var/folders/01/8qkcw5d910nfzr664dx7b5j80000gn/T/ipykernel_993/7429548.py:30: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy whdf['duration'][row] = abs(diff)
0 0:09:00 1 1:56:00 2 1:56:00 3 12 days, 0:49:00 4 3 days, 2:04:00 ... 21329 11:37:00 21330 11 days, 20:08:00 21331 14:58:00 21332 10:00:00 21333 11:24:00 Name: duration, Length: 21334, dtype: object
This code shows the index of the row number and difference in time between the date the appointment was scheduled and then attended. As you can see from comparing the two truncated tables above, the first row shows an appointment that was scheduled and then attended just 9 minutes later! The second row shows an appointment which was made and then was able to be attended just 1 hour and 56 minutes later. However, scanning down this second table, we can see that some appointments were booked days or weeks in advance. Which building do you think had the longest wait time, overall?
whdf_sliced = whdf.loc[:, ["MEETING_LOC", "duration"]]
whdf_sorted = whdf_sliced.sort_values(['duration'], ascending=True)
#print(whdf_sorted)
import numpy as np
building_stats = whdf.pivot_table(index ="MEETING_LOC", values=["duration"], aggfunc=np.sum)
print(building_stats)
duration MEETING_LOC EEOB 1138 days, 18:47:00 MEETING_LOC 8 days, 1:36:00 NEOB 3668 days, 9:22:00 OEOB 16967 days, 6:31:00 VPR 5453 days, 16:20:00 WH 16007 days, 17:01:00
The building with the shortest wait time is actually the New Executive Office Building (NEOB) with a total wait time of 3,668 days, 9 hours, and 22 minutes. The longest turnaround time in 2021 was for the Eisenhower Executive Office Building, which is also the Old Executive Office Building. Adding their two numbers together gives a whopping 18, 105 days! The White House had the second highest wait time of 16,007 days. The Vice President's Office, in comparison, was a little over three times as fast. That's quite a difference between visiting the President or the Vice President!
Now I want to know which building logged the most visitor entries to get a sense of how many parties attended each. The following code snippet produces a dictionary with total number of visitor log entries per building.
mostpopular_locations = {"WH":0, "NEOB":0, "EEOB":0, "OEOB":0, "VPR":0}
for m in meeting_location:
if m in mostpopular_locations:
mostpopular_locations[m] +=1
else:
mostpopular_locations[m] =1
print(mostpopular_locations)
{'WH': 8064, 'NEOB': 1137, 'EEOB': 837, 'OEOB': 10326, 'VPR': 960, 'MEETING_LOC': 9, nan: 1}
Not surprisingly, the New Executive Office Building (NEOB) got about 1/10th the number of visitors as the Eisenhower Executive Office Building (EEOB and OEOB added together). No wonder it has the shortest turnaround time for getting an appointment! However, note that this does not mean the NEOB is very efficient in processing visitors. For only 1/10th the number of visitors, it is only about 5 times as fast to get an appointment, not 10 times. That's half as efficient as you would expect. Which was the least efficient Executive Office? Dividing the total number of days waiting by the total number of logged visits per building, we get:
Interesting! Now that we know an average wait time to visit the Vice President is about 6 days, let's see who was the most popular with the public in 2021, measured by the number of visitor logs received: the Prisident, the Vice President, or the First Lady?
print("President's number of visitor logs:", whdf["VISITEE_NAMEFIRST"].value_counts().loc["POTUS"])
print("Vice President's number of visitor logs:", whdf["VISITEE_NAMEFIRST"].value_counts().loc["VPOTUS"])
print("First Lady's number of visitor logs:", whdf["VISITEE_NAMEFIRST"].value_counts().loc["FLOTUS"])
President's number of visitor logs: 3406 Vice President's number of visitor logs: 115 First Lady's number of visitor logs: 131
Even more interesting, the Vice President recorded the least amount of visitor logs. The First Lady actually received more visitor log entries than the Vice President! That was definitely not expected! First Lady Jill Biden is certainly more popular than I was expecting! Now I am curious to find out exactly who was the most frequently visited official from across all of the visitor logs. Will it be a Climate National Advisor, as previously mentioned? Or does the public meet with Transportation, Education, National Security, or other officials more? Let's find out!
visitee_lastname = whdf.iloc[:,19]
lastname_counts ={}
for item in visitee_lastname:
if item not in lastname_counts:
lastname_counts[item] = 1
else:
lastname_counts[item]+= 1
#print(lastname_counts)
for key, value in sorted(lastname_counts.items(), key=lambda kv: kv[1], reverse=True):
print("%s: %s" % (key, value))
nan: 3708 Via: 862 Nelson: 583 Teleky: 549 Vallejo: 437 Butler: 301 Taylor: 271 Silva: 263 Sullivan: 243 Rastegar: 242 Guyton: 218 Conley: 216 McGurk: 199 Schaffer: 190 Chhabra: 178 Lang: 175 Neuberger: 175 Donilon: 171 Gonzalez: 162 Isen: 161 Newton: 154 Panzarella-Tarr: 153 Dirksen: 152 Tobin: 140 Tolliver: 140 Carrasquillo: 136 Stanwich: 134 Richmond: 132 Evans: 124 Leaf: 122 Hart: 108 NELSON: 107 Gelfond: 93 LUDWIG: 91 Sanders: 88 Macklin: 86 Joyce: 81 Finer: 80 Abercrombie: 80 Dietrich: 80 Collins: 79 McEldowney: 78 Satkowiak: 76 Stambaugh: 75 Lander: 74 Doyle: 71 Reilly: 71 Rodriguez: 71 MICHEL: 67 Stewart: 67 Green: 67 Garner: 66 Harley: 66 Dien: 66 M.: 65 DeCesaro: 64 R.: 63 Campbell: 61 Berry: 61 Li: 61 Nyirandutiye: 60 Page: 60 St.: 58 Browne: 57 Michel: 55 Velz: 54 Qureshi: 54 Starfield: 54 McCarthy: 52 Finney: 52 Solomon: 51 Burton: 51 Brennan: 51 Grove: 51 Strasser: 50 Cole: 48 Endicott: 48 Neuman: 48 Cerrato: 47 Singh: 46 Etim: 46 REED: 46 Ricchetti: 45 Shull: 44 Venkatraj: 43 Moritsugu: 43 Edgerton: 42 Lamb: 42 Mellul: 42 Smith: 41 Beck: 40 Williams: 39 Walker: 39 Guha: 39 Crowder: 39 Colton: 38 Bair: 38 Carmon: 38 Biden: 38 Labitzky: 37 Print: 37 Durkovich: 37 Lowe: 37 Schildroth: 36 Michael: 36 Kiel: 36 Vela: 36 Geltzer: 36 Laurent-Mantey: 35 McLeod: 35 Lewis: 34 Burnett: 34 Miller: 34 Shaw: 34 Fasano: 34 Herwig: 34 Quaranto: 34 Gustafson: 33 Sherwood-Randall: 33 Hebner: 33 Young: 32 Khan: 32 Matheny: 32 Braese: 32 Levin: 32 Brown: 31 Vaingankar: 31 De: 31 Gordon: 31 Ou: 31 teleky: 30 Cooper: 30 Morse: 30 Scott: 30 Harrell: 29 Kling: 29 Security: 28 Reed: 28 Remus: 28 Bagia: 28 Jacob: 28 Ludwig: 28 Chafin: 27 Greene: 27 Rapp-Hooper: 27 Law: 26 Kenon: 26 Lindsey: 26 Welch: 26 Miles: 26 Nally: 26 Estrada: 25 Eisemann: 25 Wagman: 25 Hennessey: 25 Gentry: 25 Larsen: 25 Wood: 25 Duran: 24 May: 24 Bitar: 24 Grant: 23 Hoch: 23 Elizabeth: 22 Baez: 22 Coburn: 22 Uberoi: 22 Deese: 22 Klinkhardt: 22 Matza: 22 Clark: 22 Dela-Cruz: 22 Botello: 22 Castro: 22 Gary: 21 HARTUNG: 21 Kendall: 21 Kerrigan: 21 Skillman: 21 Kalina: 21 Wolf: 21 Rosenberg: 21 LaBelle: 21 Abraham: 21 GARCIA: 21 Alexander: 20 Garcia: 20 Sloat: 20 Forrest: 20 Seymour: 20 Hallett: 20 McKelvey: 20 Zients: 20 Office: 19 Morrison: 19 Schmoyer: 19 McCready: 19 Rault: 19 Dinkel-Smith: 19 Sacks: 19 Bristol: 19 McGuire: 19 Powell: 19 Sermeno: 18 Rabens: 18 McMullan: 18 Pantangco: 18 Inglis: 18 Banks: 17 OFFICE: 17 Hinson: 17 Cameron: 17 gelfond: 17 Christopher: 17 Swaner: 17 Ried: 17 Diaz: 17 Dickson: 17 Alcorn: 17 Coari: 17 Pyle: 17 Kahangama: 17 Deming: 16 Schultz: 16 Pickitt: 16 Koizumi: 16 Corbisiero: 16 Jon: 16 Austin: 16 Singla: 16 Puri: 16 Neyman: 15 Toves: 15 Ackerman: 15 Tederick: 15 Naranjo: 15 Levy: 15 Martin: 15 Sulmeyer: 15 Bridge: 15 Scully: 15 Bain: 15 Levin/Edgard: 15 Wham: 15 Fabrycky: 15 Psaki: 15 Wessel: 15 Nightengale: 14 Wexler: 14 McKay: 14 Mulugheta: 14 Abdelnabi: 14 Bagenstos: 14 Black: 14 Bumb: 14 Parikh: 14 Sanchez: 14 Saba: 14 Wilson: 13 Sweeney: 13 Armstrong: 13 Jones: 13 John: 13 Murray: 13 Bernstein: 13 SMITH: 13 Swan: 13 Bunn: 13 McBroom: 13 Denney: 13 Vidinghoff: 13 Ngugi: 13 Linkins: 13 James: 12 Jain: 12 Geyer: 12 Meloski: 12 Cannizzaro: 12 Teshome: 12 Finkelstein: 12 Wenger: 12 Purcell: 12 Brister: 12 Harris: 12 Condon: 12 Mital: 12 Rioja-Scott: 12 Vargas: 12 Klein: 12 Poethig: 12 Johnstone: 12 Charlesworth: 12 vankeuren: 11 Carroll: 11 Stires: 11 Berger: 11 Bender: 11 Kwak: 11 Hill: 11 Thomas: 11 Baker: 11 Herman: 11 Denaro: 11 Rouse: 11 Hartje: 11 Belkind: 11 Wolbert: 11 Sierra-Zorita: 11 Mallory: 11 Redmon: 11 William: 10 Mills: 10 Askins: 10 Osburn: 10 Valdes-Recio: 10 Fakory: 10 Rudenshiold: 10 White: 10 Pena: 10 Dimmick: 10 Rozo: 10 Corvin: 10 Pohl: 10 Hahn: 10 Hanna: 10 Pedrotty: 10 Etienne: 10 Duarte: 10 Kitching: 10 House: 10 Starks: 10 Ververis: 10 Gensmer: 10 Finley: 10 Tam: 10 Weather: 10 Tarre: 10 Sharma: 10 Cronin: 10 A.: 10 Dennison: 9 VISITEE_NAMELAST: 9 Knott: 9 Dayrone: 9 Raymond: 9 Sperling: 9 Tibayan: 9 Tillson: 9 Pope: 9 Simmons: 9 Davis: 9 Kosoglu: 9 OKane: 9 Paik: 9 Gleason: 9 McIntee: 9 Apreza: 9 Parker: 9 Collins/: 9 Amo: 9 Rosier: 8 West: 8 Cravaritis: 8 Lissner: 8 Frick: 8 Klain: 8 Eng: 8 Rivera: 8 Nakagawa: 8 Mazzeo: 8 Wittenberg: 8 Cedarbaum: 8 GILLILAND: 8 Deutsch: 8 Groum: 8 SWAN: 8 Donelson: 8 Lin: 8 LaBovick: 8 McMillan: 8 Webster: 7 Jake: 7 Johnson: 7 Tom: 7 Murphy: 7 Perez: 7 Beatty: 7 Orero: 7 Wallace: 7 Shaul: 7 Brian: 7 Ribeiro: 7 Bick: 7 Brigham: 7 Gingrich: 7 Clare: 7 Tillett: 7 Cahn: 7 BAKKEN: 7 Fromstein: 7 Hughes: 7 Eastham: 7 Elledge: 7 Qu: 7 Horton: 7 Figueroa: 7 L.: 6 Patrick: 6 Ramoutar: 6 Hussain: 6 Demers: 6 Spooner: 6 Franklin: 6 Sheonath: 6 Meier: 6 Tahan: 6 Withers: 6 McCormick: 6 Ubinas: 6 Velz/Vice: 6 Pedersen: 6 Oates: 6 MacWilliams: 6 Zachary: 6 Liz: 6 Raghavan: 6 Halton: 6 Dunderdale: 6 Sedlock: 6 Horn: 6 Restrepo: 6 Barber: 6 Bouri: 6 Grube: 6 Passerello: 6 Fuchs: 6 Clarke: 6 Shah: 6 Malm: 6 Monte: 6 Cotthoff: 6 Megna: 6 Gonzalez/OMB: 6 Tran: 6 Flournoy: 6 Zaidi: 6 Erb: 5 Herrera: 5 Rouleau: 5 Bonavina: 5 Moon: 5 Kagan: 5 Cipriani: 5 Po: 5 Saenz: 5 Director: 5 Good: 5 Liu: 5 Yagan: 5 Quam: 5 Tanielian: 5 A: 5 Weister: 5 Bhattacharya: 5 Rubini: 5 Vivirito: 5 Olavarria: 5 Sigmon: 5 Crumbley: 5 Gastfriend: 5 Duncan: 5 Gentleman: 5 Geis: 5 Bell: 5 Sawyer: 5 Pilch: 5 Palermo: 5 Seymour/Jeff: 5 KENON: 5 Kalbaugh: 5 Russell: 5 DEHLER: 5 Aguilar: 5 Rice: 5 Pangelinan: 5 SHUMAN: 5 Pelton: 5 Bazan: 5 Ellis: 5 Bibbs: 5 Keller: 5 Vu: 5 Frisk: 5 Backemeyer: 5 Cisneros: 5 Outman: 4 Lady: 4 Wright: 4 Perry: 4 Chasez: 4 Morris: 4 Gray: 4 Alex: 4 Brooks: 4 Deeks: 4 and: 4 Merriam: 4 Kelley: 4 Traver: 4 Liedel: 4 Varghese: 4 Houston: 4 Lopez: 4 Flaherty: 4 Dukovich: 4 ORTIZ: 4 Kinkle: 4 Doshi: 4 Primeggia: 4 Clausen: 4 Kalathil: 4 Peters: 4 Laws: 4 Yupari: 4 Pass: 4 Baxley: 4 Hasaj: 4 Sandy: 4 Creedon: 4 Mason: 4 Pearce: 4 Trainor: 4 Wu: 4 Tudose: 4 Chittenden: 4 Takeo: 4 Hayes: 4 Stanley: 4 Foltz: 4 Dunn: 4 Kuseski: 4 Pough: 4 Galer: 4 POWELL: 4 Egbewole: 4 Anim: 4 Banner: 4 Maurice: 4 Millones: 4 Amanda: 4 Hebbeler: 4 Shimer: 4 Binns: 4 COLLINS: 4 Hodges: 4 Trimble: 4 Hooks: 4 Bargzie: 4 Ostazeski: 4 Neelakantan: 4 Gruber: 3 Shambley: 3 Stubbs: 3 -: 3 ASKINS: 3 Chiu: 3 Hayashi: 3 kendall: 3 Holbrook: 3 Jamison: 3 Henderson: 3 Lebowitz: 3 Locke: 3 Eisenhauer: 3 Embree: 3 Toan: 3 Kaloudis: 3 Dahn: 3 Christensen: 3 Bullard: 3 Kanapathy: 3 Swenson: 3 Nicole: 3 Teets: 3 Sharbatoghlie: 3 Lazatin: 3 Hosier: 3 Aguilera: 3 Haney: 3 Vankeuren: 3 McClellan: 3 Lipin: 3 McGowan: 3 Machniak: 3 Holden: 3 Elmore: 3 Jonas: 3 Ravindranath: 3 Maykish: 3 Iwai: 3 Biagi: 3 Makaneole: 3 Bean: 3 Palomino: 3 Duax: 3 Recordon: 3 Sanderson: 3 Keeffe: 3 Nee: 3 Raisner: 3 Naulin: 3 HORNCASTLE: 3 U-O: 3 Bartee: 3 PPR: 3 Scheidegger: 3 Greenwood: 3 Wall: 3 Rajgopal: 3 Fabina: 3 Nancy: 3 Adury: 3 Rosanna: 3 Brody: 3 Alstott: 3 Rinkevich: 3 Teitelbaum: 3 Mayock: 3 McLaurin: 3 Horncastle: 3 Reynolds: 3 Ruffin: 3 Abenes: 3 Songer: 3 Moriarty: 3 Berengaut: 2 Security/USSS: 2 HENRY: 2 moon: 2 Riley: 2 Leon: 2 Grantz: 2 Kidd: 2 Giammarella: 2 D.: 2 Schonau: 2 Turner: 2 Asante: 2 Gerard: 2 Service: 2 Moses: 2 Tyler: 2 Asset: 2 Goldensohn: 2 Bovee: 2 Anne: 2 Dube: 2 Packer: 2 Putnam: 2 Rosenberger: 2 Scatliffe: 2 Lainez: 2 Bell-Carter: 2 Montoya: 2 S.: 2 Gene: 2 Hochla: 2 Morgan: 2 Kikukawa: 2 Vellajo: 2 Vid: 2 Szymanski: 2 Assets: 2 Billingslea: 2 Shot: 2 Moller: 2 Brann: 2 Haggard: 2 Nassif: 2 Jeff: 2 Burgess-Gregg: 2 Control: 2 Zuluaga: 2 Key: 2 Hwang: 2 Dotter: 2 Mendez: 2 Col: 2 F.: 2 Mensah: 2 Medina: 2 Holt: 2 SERMENO: 2 Davida: 2 Hoagland: 2 Lason: 2 Rissman: 2 Lucera: 2 McTear: 2 McMullen: 2 DUARTE: 2 Harper: 2 Secretary: 2 Gina: 2 Renner: 2 Stolt: 2 Sara: 2 Brant: 2 McFall: 2 Ehrick: 2 of: 2 Jackson: 2 Mallon: 2 Cahill: 2 Moylan: 2 Butterworth: 2 Nemeroff: 2 Citron: 2 Childs-Graham: 2 C.: 2 Freeman: 2 Jeter: 2 Diedrick: 2 Gilmore: 2 Hunt: 2 Crowe: 2 Joshua: 2 Narayanan: 2 Hinzman: 2 Juskys: 2 Czin: 2 Flores: 2 Su: 2 Petrelius: 2 Dinsmore: 2 GRAY: 2 Lucius: 2 Jordan: 2 Stalker-Lehoux: 2 Bakken: 2 Hebert: 2 Rottman: 2 Moore: 2 Callanan: 2 Reynoso: 2 Stewart/Morgan: 2 Choucair: 2 Pauwels: 2 Pidot: 2 Westbrook: 2 Haller: 2 Alvarado: 2 Weyand: 2 Meilicke: 2 VanLandingham: 2 Costello: 2 Berschinski: 2 Challacombe: 2 Skiles: 2 Hamlett/Edith: 2 Henry: 2 Mauss: 2 Barrett: 2 Stavish: 2 Freshwater: 2 Derfner: 2 Snopek: 2 Mielke: 2 Deveau: 2 Kates: 2 Donate: 2 VALLEJO: 2 Siddique: 2 Kash: 2 Shurleff: 2 VPOTUS: 2 Devermont: 2 Wiener: 2 Manning: 2 Crump: 2 Hansen: 2 Stalcup: 2 Tracey-Mooney: 2 Hardin: 2 Wixtead: 2 Domnitz: 2 Marcus: 1 Amin: 1 Haines: 1 Szestak: 1 Burgess: 1 Ramos: 1 Section: 1 Ambassador: 1 Manchin: 1 Conover: 1 Bofill: 1 Germ: 1 Bryan: 1 Weaver: 1 Stripling: 1 Rathbun: 1 Waldon: 1 Brockbank: 1 Byrd: 1 J.: 1 Valdez: 1 BUTLER: 1 Ford: 1 Iltebir: 1 Lo: 1 coward: 1 Iway: 1 Valdes-recio: 1 Kurt: 1 Linda: 1 Glen: 1 Malachowski: 1 Lane: 1 Rauch: 1 Coles: 1 Bressler: 1 MALM: 1 Somani: 1 Harley/WHMO: 1 KERRIGAN: 1 Rafi: 1 McAnaw: 1 Hartmann: 1 Ryan: 1 Burgess-Greg: 1 Ops: 1 Seth: 1 Chicarella: 1 Forni: 1 MBOMA: 1 Fairweather: 1 MCKAY: 1 McWilliams: 1 Paige: 1 Blinken: 1 Pauloski: 1 Matthew: 1 McClees: 1 SINGH: 1 Linke: 1 Panigrahi: 1 vanKeuren: 1 Fletcher: 1 askins: 1 Schlichtenmyer: 1 HARLEY: 1 Hewitt: 1 Maju: 1 Darryl: 1 Woolfolk: 1 Penninger: 1 Office-: 1 Beth: 1 Hoover: 1 Schumacher: 1 Fuller: 1 Meyer: 1 Long: 1 Nolan: 1 Skillman/Mike: 1 GUGGENHEIMER: 1 Onstott: 1 Westling: 1 Buttigieg: 1 Granholm: 1 Moran: 1 Elmergawi: 1 Rosenberg/COVID: 1 Pauls: 1 Eglovitch: 1 MUGGLEBURG: 1 McCubbin: 1 COVID: 1 vidinghoff: 1 DUNCAN: 1 Gorman: 1 Swarens: 1 Waterman: 1 Cranford: 1 P.: 1 Thompson: 1 Naeem: 1 Coblentz: 1 Hunnicutt: 1 Millison: 1 Robert: 1 Vice: 1 Carmichael: 1 Hurd: 1 Zamor: 1 Policy: 1 Eichner: 1 Anderson: 1 Feagans: 1 Mortelmans: 1 Lambinicio: 1 EMHOFF: 1 Gammache: 1 Adler: 1 Mcready: 1 coari: 1 Parent: 1 Mann: 1 Groth: 1 Asa: 1 Farrar: 1 Shawna: 1 GROVE: 1 Budzinski: 1 BROUSE: 1 Volin: 1 Zunker: 1 Boyd: 1 Rhodes: 1 Goldman: 1 Skinner: 1 Mceldowney: 1 Neff: 1 Phadke: 1 Vail: 1 Maynard: 1 SarahJane: 1 Webb: 1 South: 1 SOUTH: 1 Curtin: 1 BOVEE: 1 Yu: 1 Cooch: 1 Martorana: 1 Glymph: 1 Grosnicklaus: 1 bonavina: 1 GOODHUE: 1 Mathes: 1 Hays: 1 M: 1 Beem: 1 Darrin: 1 McElrath: 1 Bedingfield: 1 deForest: 1 Markell: 1 Kamin: 1 Bibs: 1 Venkatasubramanian: 1 Donaghey: 1 Blair: 1 Rabuck: 1 Sheth: 1 Hidalgo: 1 DeBlanc-Knowles: 1 Gins: 1 Christoher: 1 Dennis: 1 Dickerson: 1 Le: 1 lamb: 1 Caesar: 1 Lanchantin: 1 office: 1 Vinson: 1 Tirabasi: 1 Moten: 1 Voles: 1 Bucevicius: 1 BItar: 1 Griffin: 1 WINNING: 1 Savett: 1 Gao: 1 Barringer: 1 Hernandez: 1 FORRESTER: 1 Sultoon: 1 Daniels: 1 DeGuzman: 1 Mouton: 1 Lindenzweig: 1 Guy: 1 dirksen: 1 HQ: 1 Houghton: 1 Fuldauer: 1 kenon: 1 Goldstein: 1 STAFF: 1 Dubon: 1 Boushey: 1 Negron: 1 Bahran: 1 Calderone: 1 Princeton: 1 Maley: 1 Fischietto: 1 Philip: 1 Raviv: 1 Pasquantino: 1 Levitt: 1 LAmb: 1 Poddar: 1
Searching by last name of visitee and placing the results in descending order, we can see that Dan Via was THE most visited person in the Executive Branch, after the President. Dan Via is the Deputy Director of Ceremonies and Protocol for the Unites States Air Force. It looks like ceremonial functions for the military beat the climate! We just might need some climate ceremonies in the future! (joke).The second most visited person was David Nelson, a Deputy Social Secretary. Deputy Social Secretaries make guest lists, seating arrangements, and menus for official state functions. It appears again that people are mostly visiting about the set up of ceremonies. In the code snippet below, I am just checking that the values for last name of Nelson belong to the same first name because Nelson is a common last name. It looks like, below, that it checks out.
name_nelson = whdf["VISITEE_NAMELAST"] == "Nelson"
result = whdf.loc[name_nelson, "VISITEE_NAMEFIRST"]
print(result)
2995 David 4604 David 4611 David 6745 David 6746 David ... 20851 David 20854 David 20970 David 21150 David 21210 David Name: VISITEE_NAMEFIRST, Length: 583, dtype: object
The third most visited person was Ed Teleky. Similar to the former, a quick Google search reveals that he is the Director of Ceremonies for the White House. I have to say that I am a bit surprised by these results. I can't wait to see what the year 2022 will tell us in data.
And, finally, just for a little bit of fun, this last little table below tells us that if your first name was Michael or your last name was Smith, you were most likely a top visitor to the Executive Branch, by name frequency.
whdf.describe()
NAMELAST | NAMEFIRST | NAMEMID | UIN | BDGNBR | ACCESS_TYPE | TOA | POA | TOD | POD | ... | VISITEE_NAMELAST | VISITEE_NAMEFIRST | MEETING_LOC | MEETING_ROOM | CALLER_NAME_LAST | CALLER_NAME_FIRST | CALLER_ROOM | DESCRIPTION | RELEASEDATE | duration | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
count | 21334 | 21333 | 21333 | 21333 | 10675 | 21333 | 13972 | 4663 | 2239 | 6959 | ... | 17626 | 21332 | 21333 | 21333 | 21333 | 21333 | 9 | 17201 | 4191 | 21334 |
unique | 7839 | 3748 | 27 | 7811 | 1375 | 2 | 9250 | 20 | 1371 | 18 | ... | 1006 | 724 | 6 | 924 | 427 | 293 | 1 | 16 | 8 | 3661 |
top | SMITH | MICHAEL | N | U27154 | 178263 | VA | 7/4/2021 18:52 | B0401 | 3/31/2021 | B04 | ... | Via | POTUS | OEOB | SOUTH LAWN | BUTLER | NICOLE | CALLER_ROOM | 11/1/2021 | 7/29/2021 | 3 days 13:27:00 |
freq | 143 | 465 | 5058 | 1155 | 40 | 21324 | 59 | 2481 | 42 | 3761 | ... | 862 | 3406 | 10326 | 1238 | 2420 | 2415 | 9 | 5616 | 1508 | 133 |
4 rows × 29 columns
Was exploring this dataset fun for you? Do you want to start learning Python, too? I am learning online from https://www.dataquest.io/. The thing I like best about it is that there are no videos, so you can go at your speed reading pace (which is faster than watching videos for a lot of people) and easily navigate back to review topics (no rewinding videos to find that magic spot!). I highly recommend giving their learning platform a try! It is very beginner-friendly. See you in my next project!