Name/source | Description | Format |
---|---|---|
POI visitor counts (SafeGraph) | Contains counts of the number of people who stop at POIs across the US, downloadable by ZIP code on the SafeGraph website. Data are gathered through smartphone location services, which yields a sample of the whole population. | location_name = "Jordan Pond House" Placekey = "zzy-222@64v-dvx-v75" naics_code = "722511" latitude = 44.32057 longitude = -68.246385 date_range_start = 2020-06-01T00:00:00-04:00 visits_by_day = [0,2,3,5,6,3,5,19,38,4,5,6, …] |
Park visitor counts (NPS) | Contains quality-controlled data collected by park staff to produce an accurate count of the total number of visitors at each park per month, using the persons-per-vehicle factor. This single CSV file covers every park in the NPS. | ParkName = "Acadia NP" UnitCode = "ACAD" Year = 2019 Month = 1 RecreationVisits = "10,520" |
Park boundaries (NPS) | Shapefile provided by the NPS that contains the boundaries of all parks in the NPS as multipolygons. | UNIT_CODE = "ACAD" geometry = … |
Entrance locations (NPS) | CSV file containing the entrance locations of each park under investigation, created manually from the NPS website for each park and Google Maps. | unitCode = "ACAD" entrance = "hulls cove" lat_lon = 44.381534013388006, -68.06868817972637 |
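Note that `visits_by_day` arrives as a stringified list anchored at `date_range_start`. As a minimal sketch, one such record can be expanded into a daily series like this (the record below is fabricated to match the format column above):

```python
from ast import literal_eval

import pandas as pd

# Hypothetical single record in the SafeGraph format shown above
row = {
    "location_name": "Jordan Pond House",
    "date_range_start": "2020-06-01T00:00:00-04:00",
    "visits_by_day": "[0,2,3,5,6,3,5]",  # stored as a string in the CSV
}

# visits_by_day is a stringified list; parse it safely with literal_eval
visits = literal_eval(row["visits_by_day"])

# Map each element to a calendar date starting at date_range_start
start = pd.Timestamp(row["date_range_start"])
dates = pd.date_range(start, periods=len(visits), freq="D")
daily = pd.Series(visits, index=dates, name=row["location_name"])
```

This is the same `literal_eval` + date-offset pattern used in the cleaning code later in this section.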
First-difference correlation provides a more accurate assessment of correlation for non-stationary time series because it compares the change from one data point to the next rather than the raw levels. To determine how closely the SafeGraph cell-phone data capture National Park visitation, we will use these first-difference correlations.
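As a quick illustration with made-up numbers, first differencing in pandas is just `.diff()` before `.corr()`; a shared seasonal trend inflates the level correlation, while the differenced series compare month-to-month changes:

```python
import pandas as pd

# Hypothetical monthly series sharing a strong seasonal trend (made-up numbers)
sg = pd.Series([10, 50, 200, 180, 60, 12], name="safegraph")
nps = pd.Series([1000, 5200, 21000, 18500, 6100, 1300], name="nps")

level_corr = sg.corr(nps)             # Pearson correlation of raw levels
fd_corr = sg.diff().corr(nps.diff())  # correlation of month-to-month changes
```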
The SafeGraph data for Glacier are somewhat unstable.
With 99 POIs, Glacier has the most abundant POI coverage in our data, providing more accurate spatial information:
For Great Smoky NP: compare the interactive map for Aug. 1, 2018 with that for Aug. 1, 2021. Do we observe an increase or decrease in congestion on the road network after the pandemic?
Data files: MT.csv, nps_boundary.shp
# Load data and convert dataframe to geodataframe using lon and lat columns
boundary = gpd.read_file('../raw_data/nps_boundary/nps_boundary.shp') # boundary shapefile
boundary = boundary.set_index('UNIT_CODE')
df = pd.read_csv('../raw_data/SafeGraph/MT.csv') # safegraph data
gdf = gpd.GeoDataFrame(df,
crs = "EPSG:4269",
geometry = gpd.points_from_xy(df.sg_c__longitude, df.sg_c__latitude))
# Geocode POIs with missing lon-lat info
gdf_na = gdf[gdf['geometry'].is_empty]
gdf_na["address"] = gdf_na['sg_mp__street_address'] + ', ' + gdf_na['sg_mp__city'] + ', ' + gdf_na['sg_mp__region'] + ' ' + gdf_na['sg_mp__postal_code'].astype(int).astype(str)
na_poi = gdf_na[['placekey', 'address']].drop_duplicates()
geo = geocode(na_poi['address'], provider='nominatim', user_agent = 'autogis_xx')
geo['geometry'].intersects(boundary.at['GLAC', 'geometry']).sum() # verify that these POIs do not fall within the park boundary, so they can be disregarded
# Subset POIs located within the park boundary of Glacier NP
subset_gdf = gdf.loc[gdf['geometry'].intersects(boundary.at['GLAC', 'geometry'])]
subset_gdf = subset_gdf.dropna(subset = ['sg_mp__visits_by_day']) # drop NA in the daily visitation column due to data error
# Explode daily visitation column to multiple rows
subset_gdf.sg_mp__visits_by_day = subset_gdf.sg_mp__visits_by_day.apply(literal_eval) # convert from string to Python list
subset_gdf = subset_gdf.explode('sg_mp__visits_by_day') # explode it vertically and have each row representing a day
# Create a date column to indicate the exact calendar date
subset_gdf['n_of_day'] = subset_gdf.groupby(['date_range_start','placekey']).cumcount()
subset_gdf['date_range_start'] = pd.DatetimeIndex(subset_gdf.date_range_start) # convert from string to date
subset_gdf['date'] = pd.DatetimeIndex(subset_gdf.date_range_start) + pd.to_timedelta(subset_gdf['n_of_day'], unit = 'd')
# save a subset of POIs as GeoJSON file.
subset_gdf.to_file('../clean_data/POI_GLAC.geojson', driver = 'GeoJSON')
Data files: POI_GLAC.geojson, nps_visit_2015_2021.csv
# !pip install geoplot
import geopandas as gpd
import pandas as pd
from geopandas.tools import geocode
from ast import literal_eval
import matplotlib.pyplot as plt
import geoplot
import osmnx as ox
import networkx as nx
from tqdm import tqdm, trange
Load the cleaned GeoJSON file:
# Load saved GeoJSON
subset_gdf = gpd.read_file('./data/POI_GLAC.geojson')
# load park boundary
boundary = gpd.read_file('./data/nps_boundary.geojson')
boundary = boundary.set_index('UNIT_CODE')
# convert variables to proper formats
subset_gdf['date_range_start'] = pd.DatetimeIndex(subset_gdf.date_range_start) # convert to datetime
subset_gdf['sg_mp__visits_by_day'] = pd.to_numeric(subset_gdf['sg_mp__visits_by_day']) # convert to numeric
subset_gdf.head()
date_range_start | date_range_end | placekey | sg_c__parent_placekey | sg_c__location_name | sg_c__safegraph_brand_ids | sg_c__brands | sg_c__top_category | sg_c__sub_category | sg_c__naics_code | ... | sg_mp__popularity_by_day | sg_mp__device_type | sg_mp__normalized_visits_by_state_scaling | sg_mp__normalized_visits_by_total_visits | sg_mp__normalized_visits_by_total_visitors | sg_mp__normalized_visits_by_region_naics_visits | sg_mp__normalized_visits_by_region_naics_visitors | n_of_day | date | geometry | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2021-08-01 | 2021-09-01 | zzy-222@3wy-prx-xwk | None | Jammer Joe's Grill and Pizzeria | None | None | Restaurants and Other Eating Places | Full-Service Restaurants | 722511.0 | ... | {'Monday': 10, 'Tuesday': 16, 'Wednesday': 12,... | {'android': 29, 'ios': 22} | 1040.77628 | 0.000013 | 0.000037 | 0.000125 | 0.000194 | 0 | 2021-08-01T00:00:00 | POINT (-113.87642 48.61768) |
1 | 2021-08-01 | 2021-09-01 | zzy-222@3wy-prx-xwk | None | Jammer Joe's Grill and Pizzeria | None | None | Restaurants and Other Eating Places | Full-Service Restaurants | 722511.0 | ... | {'Monday': 10, 'Tuesday': 16, 'Wednesday': 12,... | {'android': 29, 'ios': 22} | 1040.77628 | 0.000013 | 0.000037 | 0.000125 | 0.000194 | 1 | 2021-08-02T00:00:00 | POINT (-113.87642 48.61768) |
2 | 2021-08-01 | 2021-09-01 | zzy-222@3wy-prx-xwk | None | Jammer Joe's Grill and Pizzeria | None | None | Restaurants and Other Eating Places | Full-Service Restaurants | 722511.0 | ... | {'Monday': 10, 'Tuesday': 16, 'Wednesday': 12,... | {'android': 29, 'ios': 22} | 1040.77628 | 0.000013 | 0.000037 | 0.000125 | 0.000194 | 2 | 2021-08-03T00:00:00 | POINT (-113.87642 48.61768) |
3 | 2021-08-01 | 2021-09-01 | zzy-222@3wy-prx-xwk | None | Jammer Joe's Grill and Pizzeria | None | None | Restaurants and Other Eating Places | Full-Service Restaurants | 722511.0 | ... | {'Monday': 10, 'Tuesday': 16, 'Wednesday': 12,... | {'android': 29, 'ios': 22} | 1040.77628 | 0.000013 | 0.000037 | 0.000125 | 0.000194 | 3 | 2021-08-04T00:00:00 | POINT (-113.87642 48.61768) |
4 | 2021-08-01 | 2021-09-01 | zzy-222@3wy-prx-xwk | None | Jammer Joe's Grill and Pizzeria | None | None | Restaurants and Other Eating Places | Full-Service Restaurants | 722511.0 | ... | {'Monday': 10, 'Tuesday': 16, 'Wednesday': 12,... | {'android': 29, 'ios': 22} | 1040.77628 | 0.000013 | 0.000037 | 0.000125 | 0.000194 | 4 | 2021-08-05T00:00:00 | POINT (-113.87642 48.61768) |
5 rows × 56 columns
Aggregate SafeGraph daily visitation to the park-month level, grouping by 'date_range_start' (the year-month column):
park_month = subset_gdf.groupby('date_range_start')['sg_mp__visits_by_day'].sum().to_frame()
Load NPS monthly visitor counts:
nps = pd.read_csv('./data/nps_visit_2015_2021.csv')
nps = nps[nps['UnitCode'] == 'GLAC'] # filter data for Glacier only
nps['RecreationVisits'] = nps['RecreationVisits'].str.replace(",", "").astype(float) # convert to numeric data
# Create a column to indicate year-month
nps['date'] = pd.to_datetime(nps[['Year', 'Month']].assign(DAY = 1))
Merge SafeGraph monthly data with NPS visitation:
month_df = nps.set_index('date').join(park_month, how = 'inner') # Inner join by year-month column
month_df['sg_mp__visits_by_day'] = pd.to_numeric(month_df['sg_mp__visits_by_day']) # convert to numeric
month_df.head()
ParkName | UnitCode | ParkType | Region | State | Year | Month | RecreationVisits | NonRecreationVisits | RecreationHours | ... | RecreationHoursTotal | NonRecreationHoursTotal | ConcessionerLodgingTotal | ConcessionerCampingTotal | TentCampersTotal | RVCampersTotal | BackcountryTotal | NonRecreationOvernightStaysTotal | MiscellaneousOvernightStaysTotal | sg_mp__visits_by_day | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2018-01-01 | Glacier NP | GLAC | National Park | Intermountain | MT | 2018 | 1 | 12222.0 | 14 | 49,913 | ... | 32,211,332 | 118,373 | 107,156 | 0 | 110,507 | 109,566 | 32,349 | 0 | 1,178 | 71 |
2018-02-01 | Glacier NP | GLAC | National Park | Intermountain | MT | 2018 | 2 | 11847.0 | 24 | 48,261 | ... | 32,211,332 | 118,373 | 107,156 | 0 | 110,507 | 109,566 | 32,349 | 0 | 1,178 | 61 |
2018-03-01 | Glacier NP | GLAC | National Park | Intermountain | MT | 2018 | 3 | 21758.0 | 18 | 90,075 | ... | 32,211,332 | 118,373 | 107,156 | 0 | 110,507 | 109,566 | 32,349 | 0 | 1,178 | 137 |
2018-04-01 | Glacier NP | GLAC | National Park | Intermountain | MT | 2018 | 4 | 28404.0 | 70 | 120,797 | ... | 32,211,332 | 118,373 | 107,156 | 0 | 110,507 | 109,566 | 32,349 | 0 | 1,178 | 168 |
2018-05-01 | Glacier NP | GLAC | National Park | Intermountain | MT | 2018 | 5 | 195116.0 | 375 | 1,012,270 | ... | 32,211,332 | 118,373 | 107,156 | 0 | 110,507 | 109,566 | 32,349 | 0 | 1,178 | 1072 |
5 rows × 36 columns
f, a = plt.subplots(figsize = (15, 8))
month_df['sg_mp__visits_by_day'].plot(ax = a, color = 'mediumpurple', label = "SafeGraph", lw = 3)
a.legend(loc = "upper left")
a2 = a.twinx()
month_df['RecreationVisits'].plot(ax = a2, linestyle = "--", alpha = .4, label = "NPS data", lw = 3)
a2.legend(loc = "upper right")
a.set(xlabel = "Date",
ylabel = "Monthly visitation estimated from SafeGraph",
title = "Monthly visitation pattern SafeGraph vs. NPS data: GLAC")
a2.set(ylabel = "NPS visitor use statistics");
In addition, calculate the Pearson correlation between the SafeGraph and NPS monthly time series:
# calculate correlation
corr = month_df['sg_mp__visits_by_day'].corr(month_df['RecreationVisits'])
# calculate the correlation between the first-differenced time series
corr_l1 = month_df[['sg_mp__visits_by_day', 'RecreationVisits']].diff().corr().iloc[0, 1]
print(f'''The Pearson correlation between SafeGraph and NPS data is {round(corr, 2)}.''')
print(f'''The Pearson correlation between first-differenced SafeGraph and NPS data is {round(corr_l1, 2)}.''')
The Pearson correlation between SafeGraph and NPS data is 0.91. The Pearson correlation between first-differenced SafeGraph and NPS data is 0.83.
Aggregate SafeGraph daily visitation to the POI-month level, grouping by 'placekey' (an indicator for each POI) and 'date_range_start' (the month column):
poi_month = subset_gdf.groupby(['placekey', 'date_range_start'])['sg_mp__visits_by_day'].sum().to_frame()
Extract the geo-coordinates of each POI and merge them into the newly created poi_month:
lon_lat = subset_gdf[['placekey', 'geometry']].drop_duplicates().set_index('placekey')
poi_month = poi_month.join(lon_lat, how = 'left').sort_values(by = "date_range_start") # merge & sort by date
# Convert the dataframe to geodataframe:
poi_month_gdf = gpd.GeoDataFrame(poi_month,
crs = "EPSG:4269",
geometry = poi_month.geometry)
For the density plot, we focus on the peak visitation season (May through September). Here we select those months to generate the density plot:
temp = poi_month_gdf.reset_index()
temp = temp[temp['date_range_start'].dt.month.isin(range(5,10))]
temp.head()
placekey | date_range_start | sg_mp__visits_by_day | geometry | |
---|---|---|---|---|
31 | zzy-225@3wy-prx-td9 | 2018-05-01 | 55 | POINT (-113.87915 48.61743) |
32 | zzy-222@3wy-q29-8vz | 2018-05-01 | 315 | POINT (-113.99403 48.52759) |
33 | zzy-226@3wy-prx-td9 | 2018-05-01 | 38 | POINT (-113.87918 48.61740) |
34 | zzw-222@3wy-q29-d9z | 2018-05-01 | 279 | POINT (-113.99294 48.52874) |
35 | zzy-224@3wy-prx-td9 | 2018-05-01 | 25 | POINT (-113.87916 48.61739) |
Density plot of monthly visits to SafeGraph POIs, overlaid with the Glacier boundary:
proj = geoplot.crs.AlbersEqualArea(central_latitude = boundary[boundary.index == 'GLAC'].centroid.y[0],
central_longitude = boundary[boundary.index == 'GLAC'].centroid.x[0])
months = temp['date_range_start'].unique()
plt.figure(figsize = (20, 20))
for idx, month in enumerate(months):
ax = plt.subplot(4, 5, idx + 1, projection = proj)
geoplot.kdeplot(temp[temp['date_range_start'] == month],
# clip = boundary[boundary.index == 'GLAC'].geometry,
shade = True,
cmap = 'Reds',
#thresh = 0.2,
levels = 10,
alpha = 0.8,
projection = proj,
#cbar = True,
vmin = 0,
#vmax = 309,
ax = ax)
geoplot.polyplot(boundary[boundary.index == 'GLAC'], ax = ax, zorder = 1)
ax.set_title(month.astype('datetime64[M]')) # convert to year-month
UserWarning: Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.
Data files: nps_boundary.shp, entrance_coord.csv
from IPython.display import Image
from IPython.display import HTML
Load entrance station locations:
entry = pd.read_csv('./data/entrance_coord.csv')
# split by "," to create latitude and longitude columns
lat = []
lon = []
for row in entry['lat_lon']:
lat.append(row.split(',')[0]) # Split the row by comma and append
lon.append(row.split(',')[1])
# Create two new columns from lat and lon
entry['lat'] = lat
entry['lon'] = lon
# convert to geodataframe
entry_gdf = gpd.GeoDataFrame(entry,
crs = "EPSG:4269",
geometry = gpd.points_from_xy(entry.lon, entry.lat))
# filter the entrance stations for Glacier
entry_gdf = entry_gdf[entry_gdf.unitCode == 'GLAC'].reset_index(drop = True)
entry_gdf
unitCode | entrance | lat_lon | lat | lon | geometry | |
---|---|---|---|---|---|---|
0 | GLAC | West Entrance | 48.50642809317048, -113.9876462668435 | 48.50642809317048 | -113.9876462668435 | POINT (-113.98765 48.50643) |
1 | GLAC | Camas | 48.58720431401561, -114.03313537578087 | 48.58720431401561 | -114.03313537578087 | POINT (-114.03314 48.58720) |
2 | GLAC | Polebridge | 48.78354575523783, -114.28056051716855 | 48.78354575523783 | -114.28056051716855 | POINT (-114.28056 48.78355) |
3 | GLAC | Many Glacier | 48.82248401654224, -113.57974127136741 | 48.82248401654224 | -113.57974127136741 | POINT (-113.57974 48.82248) |
4 | GLAC | Saint Mary's | 48.74693105028238, -113.43933340609424 | 48.74693105028238 | -113.43933340609424 | POINT (-113.43933 48.74693) |
5 | GLAC | Two Medicine | 48.50504411336611, -113.33000493600153 | 48.50504411336611 | -113.33000493600153 | POINT (-113.33000 48.50504) |
6 | GLAC | Walton/Goat Lick | 48.25888956992005, -113.57509950229125 | 48.25888956992005 | -113.57509950229125 | POINT (-113.57510 48.25889) |
(1) For parks with a convex shape (Glacier, Yellowstone, Smokies), it is best to create a buffer around the park boundary and use that buffer to query the OSM road network.
# To create buffer, need to first project to UTM (unit: meters)
buff = boundary.loc[['GLAC']].to_crs(epsg = 26912).buffer(8000) # For Glacier: NAD83 / UTM zone 12N
buff = gpd.GeoDataFrame(geometry = gpd.GeoSeries(buff))
# create network from that buffer
G = ox.graph_from_polygon(buff.to_crs(epsg = 4326).geometry[0], network_type = 'drive', simplify = True)
G = ox.project_graph(G, to_crs = 'epsg:4269') # Reproject
ox.plot_graph(G)
(<Figure size 576x576 with 1 Axes>, <AxesSubplot:>)
(2) For parks with a concave shape (Zion and Acadia), it is better to use the bounding box instead of a buffer.
# # create bounding box based on park boundary
# bound = boundary.loc[[park]].total_bounds
# # Specify the coordinates of four bounds
# north, south, east, west = bound[3], bound[1], bound[2], bound[0]
# # create network from that bounding box
# G = ox.graph_from_bbox(north, south, east, west, network_type = 'drive', simplify = True)
# ox.plot_graph(G)
Convert the OSM road network to GeoPandas GeoDataFrame:
nodes, edges = ox.graph_to_gdfs(G, nodes = True, edges = True, node_geometry = True)
edges.head()
osmid | name | highway | oneway | length | geometry | lanes | ref | maxspeed | bridge | tunnel | |||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
u | v | key | |||||||||||
40656801 | 40853414 | 0 | 5719884 | Lodgepole Loop | residential | False | 18.957 | LINESTRING (-113.43446 48.26584, -113.43420 48... | NaN | NaN | NaN | NaN | NaN |
40656810 | 0 | 5533468 | NaN | residential | False | 257.073 | LINESTRING (-113.43446 48.26584, -113.43491 48... | NaN | NaN | NaN | NaN | NaN | |
40885323 | 0 | [5719884, 5549846] | [Slippery Bill Road, Lodgepole Loop] | residential | False | 248.138 | LINESTRING (-113.43446 48.26584, -113.43454 48... | NaN | NaN | NaN | NaN | NaN | |
40853414 | 40656801 | 0 | 5719884 | Lodgepole Loop | residential | False | 18.957 | LINESTRING (-113.43420 48.26584, -113.43446 48... | NaN | NaN | NaN | NaN | NaN |
40853512 | 0 | 5546392 | NaN | residential | False | 426.384 | LINESTRING (-113.43420 48.26584, -113.43422 48... | NaN | NaN | NaN | NaN | NaN |
Choose a date to generate maps for. Ultimately, we want to compare congestion conditions over the same period in different years, such as the first Friday of August in 2018 vs. 2020.
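The "same weekday across years" comparison can also be automated. Here is a minimal sketch (the helper `first_friday_of_august` is ours, not part of the original workflow):

```python
import pandas as pd

def first_friday_of_august(year):
    """Return the first Friday of August for a given year."""
    d = pd.Timestamp(year=year, month=8, day=1)
    offset = (4 - d.weekday()) % 7  # weekday(): Monday=0 ... Friday=4
    return d + pd.Timedelta(days=offset)

day_2018 = first_friday_of_august(2018)  # -> 2018-08-03
day_2020 = first_friday_of_august(2020)  # -> 2020-08-07
```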
day = '2018-08-02'
# change variable type
subset_gdf['date'] = pd.DatetimeIndex(subset_gdf.date) # convert string to datetime
subset_gdf['sg_mp__visits_by_day'] = pd.to_numeric(subset_gdf['sg_mp__visits_by_day']) # convert to numeric
# filter data by the chosen date
poi_gdf = subset_gdf.loc[subset_gdf['date'] == day,
['placekey', 'date', 'sg_c__location_name', 'geometry', 'sg_mp__visits_by_day']].reset_index(drop = True)
poi_gdf
placekey | date | sg_c__location_name | geometry | sg_mp__visits_by_day | |
---|---|---|---|---|---|
0 | zzy-222@3wy-prx-xwk | 2018-08-02 | Jammer Joe's Grill and Pizzeria | POINT (-113.87642 48.61768) | 8 |
1 | zzw-222@3wy-q29-d9z | 2018-08-02 | Glacier Park Boat Company | POINT (-113.99294 48.52874) | 25 |
2 | zzy-222@3wy-pbb-bx5 | 2018-08-02 | Flathead Lake Biological Station | POINT (-113.98850 48.52306) | 0 |
3 | 222-222@3wy-q23-b8v | 2018-08-02 | Glacier National Park | POINT (-113.98663 48.50239) | 6 |
4 | 222-222@3wy-q29-8y9 | 2018-08-02 | Schoolhouse Gifts | POINT (-113.99352 48.52658) | 7 |
5 | zzy-224@3wy-prx-td9 | 2018-08-02 | Lake Mcdonald Lodge and Cabins | POINT (-113.87916 48.61739) | 1 |
6 | zzy-223@3wy-q29-jjv | 2018-08-02 | Eddie's Cafe and Gifts | POINT (-113.98852 48.52305) | 0 |
7 | zzy-222@3wy-q29-8vz | 2018-08-02 | Lake McDonald | POINT (-113.99403 48.52759) | 24 |
8 | zzw-222@3wy-m9q-mc5 | 2018-08-02 | Ptarmigan Room | POINT (-113.36454 48.49155) | 0 |
9 | zzy-225@3wy-prx-td9 | 2018-08-02 | Lucke's Lounge | POINT (-113.87915 48.61743) | 2 |
10 | zzy-226@3wy-prx-td9 | 2018-08-02 | Lake McDonald Lodge | POINT (-113.87918 48.61740) | 3 |
11 | zzy-222@3wy-t6w-435 | 2018-08-02 | Glacier Park | POINT (-113.67304 48.79710) | 0 |
Plot entrance stations and POIs together with the park road network:
m = edges[edges.highway == 'secondary'].explore()
poi_gdf.drop(['date'], axis = 1).explore(m = m,
column = 'sg_mp__visits_by_day',
cmap = "RdBu_r",
marker_type = 'circle',
marker_kwds = {'radius': 1000,
'fill': True
}
) # POIs
entry_gdf.explore(m = m,
marker_type = 'marker'
) # entrances
# A function that finds the nearest OSM node for each geometry in a GeoDataFrame (cr. Dr. Park)
def find_nearest_osm(network, gdf):
"""
This function helps you find the nearest OSM node from a given GeoDataFrame.
If the geometry type is Point, it is used without modification;
if it is Polygon or MultiPolygon, its centroid is used to find the nearest element.
Input:
- network (NetworkX MultiDiGraph): Network Dataset obtained from OSMnx
- gdf (GeoDataFrame): stores locations in its `geometry` column
Output:
- gdf (GeoDataFrame): will have `nearest_osm` column, which describes the nearest OSM node
that was computed based on its geometry column
"""
for idx, row in tqdm(gdf.iterrows(), total=gdf.shape[0]):
if row.geometry.geom_type == 'Point':
nearest_osm = ox.distance.nearest_nodes(network,
X=row.geometry.x,
Y=row.geometry.y
)
elif row.geometry.geom_type == 'Polygon' or row.geometry.geom_type == 'MultiPolygon':
nearest_osm = ox.distance.nearest_nodes(network,
X=row.geometry.centroid.x,
Y=row.geometry.centroid.y
)
else:
print(row.geometry.geom_type)
continue
gdf.at[idx, 'nearest_osm'] = nearest_osm
return gdf
Find the nearest OSM node for each POI within the park:
poi_gdf = find_nearest_osm(G, poi_gdf)
poi_gdf
100%|██████████████████████████████████████████████████████████████████████████████████| 12/12 [00:00<00:00, 94.72it/s]
placekey | date | sg_c__location_name | geometry | sg_mp__visits_by_day | nearest_osm | |
---|---|---|---|---|---|---|
0 | zzy-222@3wy-prx-xwk | 2018-08-02 | Jammer Joe's Grill and Pizzeria | POINT (-113.87642 48.61768) | 8 | 960036734.0 |
1 | zzw-222@3wy-q29-d9z | 2018-08-02 | Glacier Park Boat Company | POINT (-113.99294 48.52874) | 25 | 960035394.0 |
2 | zzy-222@3wy-pbb-bx5 | 2018-08-02 | Flathead Lake Biological Station | POINT (-113.98850 48.52306) | 0 | 41049461.0 |
3 | 222-222@3wy-q23-b8v | 2018-08-02 | Glacier National Park | POINT (-113.98663 48.50239) | 6 | 960035763.0 |
4 | 222-222@3wy-q29-8y9 | 2018-08-02 | Schoolhouse Gifts | POINT (-113.99352 48.52658) | 7 | 960036264.0 |
5 | zzy-224@3wy-prx-td9 | 2018-08-02 | Lake Mcdonald Lodge and Cabins | POINT (-113.87916 48.61739) | 1 | 960033938.0 |
6 | zzy-223@3wy-q29-jjv | 2018-08-02 | Eddie's Cafe and Gifts | POINT (-113.98852 48.52305) | 0 | 41049461.0 |
7 | zzy-222@3wy-q29-8vz | 2018-08-02 | Lake McDonald | POINT (-113.99403 48.52759) | 24 | 960036264.0 |
8 | zzw-222@3wy-m9q-mc5 | 2018-08-02 | Ptarmigan Room | POINT (-113.36454 48.49155) | 0 | 960034402.0 |
9 | zzy-225@3wy-prx-td9 | 2018-08-02 | Lucke's Lounge | POINT (-113.87915 48.61743) | 2 | 960033938.0 |
10 | zzy-226@3wy-prx-td9 | 2018-08-02 | Lake McDonald Lodge | POINT (-113.87918 48.61740) | 3 | 960033938.0 |
11 | zzy-222@3wy-t6w-435 | 2018-08-02 | Glacier Park | POINT (-113.67304 48.79710) | 0 | 960035071.0 |
Similarly, find the nearest OSM node for each entrance station:
entry_gdf = find_nearest_osm(G, entry_gdf)
entry_gdf
100%|███████████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 113.20it/s]
unitCode | entrance | lat_lon | lat | lon | geometry | nearest_osm | |
---|---|---|---|---|---|---|---|
0 | GLAC | West Entrance | 48.50642809317048, -113.9876462668435 | 48.50642809317048 | -113.9876462668435 | POINT (-113.98765 48.50643) | 43157459.0 |
1 | GLAC | Camas | 48.58720431401561, -114.03313537578087 | 48.58720431401561 | -114.03313537578087 | POINT (-114.03314 48.58720) | 957706395.0 |
2 | GLAC | Polebridge | 48.78354575523783, -114.28056051716855 | 48.78354575523783 | -114.28056051716855 | POINT (-114.28056 48.78355) | 960034730.0 |
3 | GLAC | Many Glacier | 48.82248401654224, -113.57974127136741 | 48.82248401654224 | -113.57974127136741 | POINT (-113.57974 48.82248) | 960037666.0 |
4 | GLAC | Saint Mary's | 48.74693105028238, -113.43933340609424 | 48.74693105028238 | -113.43933340609424 | POINT (-113.43933 48.74693) | 41052104.0 |
5 | GLAC | Two Medicine | 48.50504411336611, -113.33000493600153 | 48.50504411336611 | -113.33000493600153 | POINT (-113.33000 48.50504) | 970994448.0 |
6 | GLAC | Walton/Goat Lick | 48.25888956992005, -113.57509950229125 | 48.25888956992005 | -113.57509950229125 | POINT (-113.57510 48.25889) | 960035238.0 |
edges['travel_times'] = 0
for idx_p, row_p in poi_gdf.iterrows(): # loop for each POI
length = []
for idx_e, row_e in entry_gdf.iterrows(): # Loop for each entrance station
# Calculate the shortest driving distance from every entrance station to a specific POI
temp_length = nx.shortest_path_length(G = G,
source = row_e['nearest_osm'],
target = row_p['nearest_osm'],
weight = 'length',
method = 'dijkstra'
)
length.append(temp_length)
# Find the nearest entrance station from that POI
val, idx = min((val, idx) for (idx, val) in enumerate(length)) # get the position in the list with smallest number
# Get the shortest path from the nearest entrance to that POI. Visitors are assumed to take this route to enter the park.
start_route = nx.shortest_path(G = G,
source = entry_gdf.loc[idx, 'nearest_osm'],
target = row_p['nearest_osm'],
weight = 'length',
method = 'dijkstra'
)
for i in range(len(start_route) - 1):
cum_sum = edges.loc[(edges.index.get_level_values('u') == start_route[i]) & (edges.index.get_level_values('v') == start_route[i+1]), 'travel_times'] + row_p['sg_mp__visits_by_day']
edges.loc[(edges.index.get_level_values('u') == start_route[i]) & (edges.index.get_level_values('v') == start_route[i+1]), 'travel_times'] = cum_sum
# Get the shortest path from that POI back to the nearest entrance. Visitors are assumed to take this route to exit the park.
back_route = nx.shortest_path(G = G,
source = row_p['nearest_osm'],
target = entry_gdf.loc[idx, 'nearest_osm'],
weight = 'length',
method = 'dijkstra'
)
for i in range(len(back_route) - 1):
cum_sum = edges.loc[(edges.index.get_level_values('u') == back_route[i]) & (edges.index.get_level_values('v') == back_route[i+1]), 'travel_times'] + row_p['sg_mp__visits_by_day']
edges.loc[(edges.index.get_level_values('u') == back_route[i]) & (edges.index.get_level_values('v') == back_route[i+1]), 'travel_times'] = cum_sum
# A preview of the summary statistics
edges.travel_times.describe()
count 2519.000000 mean 0.344581 std 4.246239 min 0.000000 25% 0.000000 50% 0.000000 75% 0.000000 max 70.000000 Name: travel_times, dtype: float64
m = edges.explore(tiles = 'Stamen Terrain',
column = 'travel_times',
cmap = 'RdYlGn_r',
scheme = 'NaturalBreaks',
k = 5,
vmin = 0,
highlight = True,
tooltip = ['name', 'travel_times'], # displayed upon hovering
popup = True, # all columns displayed upon clicking
legend_kwds = {
'caption': 'Congestion',
'scale': False,
'colorbar': False
}
)
entry_gdf.explore(m = m,
marker_type = 'marker',
tooltip = ['entrance']
) # entrances