In [1]:

import os
import folium

print(folium.__version__)

0.5.0+105.g065f6f3.dirty

Introduction¶

In this notebook we'll make a choropleth with a timeslider. We'll use my branched version of folium, which contains a plugin with a class called TimeDynamicGeoJson.

The class needs at least two arguments to be instantiated.

A string-serielized geojson containing all the features (i.e., the areas)
A dictionary with the following structure:

styledict = {
    '0': {
        '2017-1-1': {'color': 'ffffff', 'opacity': 1}
        '2017-1-2': {'color': 'fffff0', 'opacity': 1}
        ...
        },
    ...,
    'n': {
        '2017-1-1': {'color': 'ffffff', 'opacity': 1}
        '2017-1-2': {'color': 'fffff0', 'opacity': 1}
        ...
        }
}

In the above dictionary, the keys are the feature-ids.

Using both color and opacity gives us the ability to simultaneously visualize two features on the choropleth. I typically use color to visualize the main feature (like, average height) and opacity to visualize how many measurements were in that group.

Loading the features¶

We use geopandas to load a dataset containing the boundaries of all the countries in the world.

In [2]:

import geopandas as gpd


assert 'naturalearth_lowres' in gpd.datasets.available
datapath = gpd.datasets.get_path('naturalearth_lowres')
gdf = gpd.read_file(datapath)

In [3]:

%matplotlib inline

ax = gdf.plot(figsize=(10, 10))

The GeoDataFrame contains the boundary coordinates, as well as some other data such as estimated population.

In [4]:

gdf.head()

Out[4]:

	pop_est	continent	name	iso_a3	gdp_md_est	geometry
0	28400000.0	Asia	Afghanistan	AFG	22270.0	POLYGON ((61.21081709172574 35.65007233330923,...
1	12799293.0	Africa	Angola	AGO	110300.0	(POLYGON ((16.32652835456705 -5.87747039146621...
2	3639453.0	Europe	Albania	ALB	21810.0	POLYGON ((20.59024743010491 41.85540416113361,...
3	4798491.0	Asia	United Arab Emirates	ARE	184300.0	POLYGON ((51.57951867046327 24.24549713795111,...
4	40913584.0	South America	Argentina	ARG	573900.0	(POLYGON ((-65.50000000000003 -55.199999999999...

Creating the style dictionary¶

Now we generate time series data for each country.

Data for different areas might be sampled at different times, and TimeDynamicGeoJson can deal with that. This means that there is no need to resample the data, as long as the number of datapoints isn't too large for the browser to deal with.

To simulate that data is sampled at different times we random sample data for n_periods rows of data and then pick without replacing n_sample of those rows.

In [5]:

import pandas as pd

n_periods, n_sample = 48, 40

assert n_sample < n_periods

datetime_index = pd.date_range('2016-1-1', periods=n_periods, freq='M')
dt_index_epochs = datetime_index.astype(int) // 10**9
dt_index = dt_index_epochs.astype('U10')

dt_index

Out[5]:

array(['1454205600', '1456714800', '1459393200', '1461985200',
       '1464663600', '1467255600', '1469934000', '1472612400',
       '1475204400', '1477879200', '1480471200', '1483149600',
       '1485828000', '1488250800', '1490929200', '1493521200',
       '1496199600', '1498791600', '1501470000', '1504148400',
       '1506740400', '1509415200', '1512007200', '1514685600',
       '1517364000', '1519786800', '1522465200', '1525057200',
       '1527735600', '1530327600', '1533006000', '1535684400',
       '1538276400', '1540954800', '1543543200', '1546221600',
       '1548900000', '1551322800', '1554001200', '1556593200',
       '1559271600', '1561863600', '1564542000', '1567220400',
       '1569812400', '1572490800', '1575079200', '1577757600'],
      dtype='<U10')

In [6]:

import numpy as np


styledata = {}

for country in gdf.index:
    df = pd.DataFrame(
        {'color': np.random.normal(size=n_periods),
         'opacity': np.random.normal(size=n_periods)},
        index=dt_index
    )
    df = df.cumsum()
    df.sample(n_sample, replace=False).sort_index()
    styledata[country] = df

Note that the geodata and random sampled data is linked through the feature_id, which is the index of the GeoDataFrame.

In [7]:

gdf.loc[0]

Out[7]:

pop_est                                                2.84e+07
continent                                                  Asia
name                                                Afghanistan
iso_a3                                                      AFG
gdp_md_est                                                22270
geometry      POLYGON ((61.21081709172574 35.65007233330923,...
Name: 0, dtype: object

In [8]:

styledata.get(0).head()

Out[8]:

	color	opacity
1454205600	0.562686	0.696947
1456714800	-0.859547	0.692121
1459393200	-0.325092	0.183145
1461985200	-1.430040	0.543378
1464663600	-1.670341	-0.148390

We see that we generated two series of data for each country; one for color and one for opacity. Let's plot them to see what they look like.

In [9]:

ax = df.plot()

Looks random alright. We want to map the column named color to a hex color. To do this we use a normal colormap. To create the colormap, we calculate the maximum and minimum values over all the timeseries. We also need the max/min of the opacity column, so that we can map that column into a range [0,1].

In [10]:

max_color, min_color, max_opacity, min_opacity = 0, 0, 0, 0

for country, data in styledata.items():
    max_color = max(max_color, data['color'].max())
    min_color = min(max_color, data['color'].min())
    max_opacity = max(max_color, data['opacity'].max())
    max_opacity = min(max_color, data['opacity'].max())

Define and apply maps:

In [11]:

from branca.colormap import linear

cmap = linear.PuRd_09.scale(min_color, max_color)

def norm(x):
    return (x - x.min()) / (x.max() - x.min())

for country, data in styledata.items():
    data['color'] = data['color'].apply(cmap)
    data['opacity'] = norm(data['opacity'])

In [12]:

styledata.get(0).head()

Out[12]:

	color	opacity
1454205600	#e3d8eb	0.341257
1456714800	#ede8f3	0.340823
1459393200	#e9e4f1	0.294992
1461985200	#f0ecf5	0.327429
1464663600	#f2eef6	0.265140

Finally we use pd.DataFrame.to_dict() to convert each dataframe into a dictionary, and place each of these in a map from country id to data.

In [13]:

styledict = {
    str(country): data.to_dict(orient='index') for
    country, data in styledata.items()
}

Finally we can create the choropleth. I like to use the Stamen Toner tileset because its monochrome colors makes it neutral and clean looking.

If the time slider above doesn't show up in the notebook, it's probably because the output is being cropped. Try loading the saved .html file in your browser for the fullscreen experience.

In [14]:

from folium.plugins import TimeSliderChoropleth

m = folium.Map([0, 0], tiles='Stamen Toner', zoom_start=2)

g = TimeSliderChoropleth(
    gdf.to_json(),
    styledict=styledict,

).add_to(m)

m.save(os.path.join('results', 'TimeSliderChoropleth.html'))

m

Out[14]: