Note: Click on "Kernel" > "Restart Kernel and Run All" in JupyterLab after finishing the exercises to ensure that your solution runs top to bottom without any errors. If you cannot run this file on your machine, you may want to open it in the cloud .
The exercises below assume that you have read the first part of Chapter 11.
The ...
's in the code cells indicate where you need to fill in code snippets. The number of ...
's within a code cell give you a rough idea of how many lines of code are needed to solve the task. You should not need to create any additional code cells for your final solution. However, you may want to use temporary code cells to try out some ideas.
This notebook is a hands-on and tutorial-like application to show how to load data from web services like Google Maps and use them to solve a logistics problem, namely a Traveling Salesman Problem .
Imagine that a tourist lands at Berlin's Tegel Airport in the morning and has his "connecting" flight from Schönefeld Airport
in the evening. By the time, the flights were scheduled, the airline thought that there would be only one airport in Berlin.
Having never been in Berlin before, the tourist wants to come up with a plan of sights that he can visit with a rental car on his way from Tegel to Schönefeld.
With a bit of research, he creates a list
of sights
like below.
arrival = "Berlin Tegel Airport (TXL), Berlin"
sights = [
"Alexanderplatz, Berlin",
"Brandenburger Tor, Pariser Platz, Berlin",
"Checkpoint Charlie, Friedrichstraße, Berlin",
"Kottbusser Tor, Berlin",
"Mauerpark, Berlin",
"Siegessäule, Berlin",
"Reichstag, Platz der Republik, Berlin",
"Soho House Berlin, Torstraße, Berlin",
"Tempelhofer Feld, Berlin",
]
departure = "Berlin Schönefeld Airport (SXF), Berlin"
With just the street addresses, however, he cannot calculate a route. He needs latitude
-longitude
coordinates instead. While he could just open a site like Google Maps in a web browser, he wonders if he can download the data with a bit of Python code using a web API offered by Google.
So, in this notebook, we solve the entire problem with code.
In order to obtain coordinates for the given street addresses above, a process called geocoding, we use the Google Maps Geocoding API.
Q1: Familiarize yourself with this documentation, register a developer account, create a project, and create an API key that is necessary for everything to work! Then, enable the Geocoding API and link a billing account!
Info: The first 200 Dollars per month are not charged (cf., pricing page), so no costs will incur for this tutorial. You must sign up because Google simply wants to know the people using its services.
Q2: Assign the API key as a str
object to the key
variable!
key = " < your API key goes here > "
To use external web services, our application needs to make HTTP requests just like web browsers do when surfing the web.
We do not have to implement this on our own. Instead, we use the official Python Client for the Google Maps Services provided by Google in one of its corporate GitHub repositories .
Q3: Familiarize yourself with the googlemaps package! Then, install it with the
pip
command line tool!
!pip install googlemaps
Q4: Finish the following code cells and instantiate a Client
object named api
! Use the key
from above. api
provides us with a lot of methods to talk to the API.
import googlemaps
api = googlemaps.Client(...)
api
type(api)
Q5: Execute the next code cell to list the methods and attributes on the api
object!
[x for x in dir(api) if not x.startswith("_")]
To obtain all kinds of information associated with a street address, we call the geocode()
method with the address as the sole argument.
For example, let's search for Brandenburg Gate. Its street address is "Brandenburger Tor, Pariser Platz, Berlin"
.
Q6: Execute the next code cell!
Hint: If you get an error message, follow the instructions in it to debug it.
If everything works, we receive a list
with a single dict
in it. That means the Google Maps Geocoding API only knows about one place at the address. Unfortunately, the dict
is pretty dense and hard to read.
api.geocode("Brandenburger Tor, Pariser Platz, Berlin")
Q7: Capture the first and only search result in the brandenburg_gate
variable and "pretty print" it with the help of the pprint() function in the pprint
module in the standard library
!
response = api.geocode("Brandenburger Tor, Pariser Platz, Berlin")
brandenburg_gate = ...
The dict
has several keys that are of use for us: "formatted_address"
is a cleanly formatted version of the address. "geometry"
is a nested dict
with several lat
-lng
coordinates representing the place where "location"
is the one we need for our calculations. Lastly, "place_id"
is a unique identifier that allows us to obtain further information about the address from other Google APIs.
from pprint import pprint
pprint(brandenburg_gate)
Place
Class¶To keep our code readable and maintainable, we create a Place
class to manage the API results in a clean way.
The .__init__()
method takes a street_address
(e.g., an element of sights
) and a client
argument (e.g., an object like api
) and stores them on self
. The place's .name
is parsed out of the street_address
as well: It is the part before the first comma. Also, the instance attributes .latitude
, .longitude
, and .place_id
are initialized to None
.
Q8: Finish the .__init__()
method according to the description!
The .sync_from_google()
method uses the internally kept client
and synchronizes the place's state with the Google Maps Geocoding API. In particular, it updates the .address
with the formatted_address
and stores the values for .latitude
, .longitude
, and .place_id
. It enables method chaining.
Q9: Implement the .sync_from_google()
method according to the description!
Q10: Add a read-only .location
property on the Place
class that returns the .latitude
and .longitude
as a tuple
!
class Place:
"""A place connected to the Google Maps Geocoding API."""
# answer to Q8
def __init__(self, street_address, *, client):
"""Create a new place.
Args:
street_address (str): street address of the place
client (googlemaps.Client): access to the Google Maps Geocoding API
"""
...
...
...
...
...
...
def __repr__(self):
cls, name = self.__class__.__name__, self.name
synced = " [SYNCED]" if self.place_id else ""
return f"<{cls}: {name}{synced}>"
# answer to Q9
def sync_from_google(self):
"""Download the place's coordinates and other info."""
response = ...
first_hit = ...
... = first_hit[...]
... = first_hit[...][...][...]
... = first_hit[...][...][...]
... = first_hit[...]
return ...
# answer to Q10
...
...
...
Q11: Verify that the instantiating a Place
object works!
brandenburg_gate = Place("Brandenburger Tor, Pariser Platz, Berlin", client=api)
brandenburg_gate
Q12: What do the angle brackets <
and >
mean in the text representation?
< your answer >
Now, we can obtain the geo-data from the Google Maps Geocoding API in a clean way. As we enabled method chaining for .sync_from_google()
, we get back the instance after calling the method.
Q13: Verify that the .sync_from_google()
method works!
brandenburg_gate.sync_from_google()
brandenburg_gate.address
brandenburg_gate.place_id
brandenburg_gate.location
Place
Class (continued): Batch Synchronization¶Q14: Add an alternative constructor method named .from_addresses()
that takes an addresses
, a client
, and a sync
argument! addresses
is a finite iterable of str
objects (e.g., like sights
). The method returns a list
of Place
s, one for each str
in addresses
. All Place
s are initialized with the same client
. sync
is a flag and defaults to False
. If it is set to True
, the alternative constructor invokes the .sync_from_google()
method on the Place
s before returning them.
class Place:
"""A place connected to the Google Maps Geocoding API."""
# answers from above
# answer to Q14
...
def from_addresses(cls, addresses, *, client, sync=False):
"""Create new places in a batch.
Args:
addresses (iterable of str's): the street addresses of the places
client (googlemaps.Client): access to the Google Maps Geocoding API
Returns:
list of Places
"""
places = ...
for address in addresses:
place = ...
if sync:
...
...
return places
Q15: Verify that the alternative constructor works with and without the sync
flag!
Place.from_addresses(sights, client=api)
Place.from_addresses(sights, client=api, sync=True)
!pip install folium
Q17: Execute the code cells below to create an empty map of Berlin!
import folium
berlin = folium.Map(location=(52.513186, 13.3944349), zoom_start=14)
type(berlin)
folium.Map
instances are shown as interactive maps in Jupyter notebooks whenever they are the last expression in a code cell.
berlin
In order to put something on the map, folium works with so-called
Marker
objects.
Q18: Review its docstring and then create a marker m
with the location data of Brandenburg Gate! Use the brandenburg_gate
object from above!
Hint: You may want to use HTML tags for the popup
argument to format the text output on the map in a nicer way. So, instead of just passing "Brandenburger Tor"
as the popup
argument, you could use, for example, "<b>Brandenburger Tor</b><br/>(Pariser Platz, 10117 Berlin, Germany)"
. Then, the name appears in bold and the street address is put on the next line. You could use an f-string to parametrize the argument.
folium.Marker?
m = folium.Marker(
location=...,
popup=...,
tooltip=...,
)
type(m)
Q19: Execute the next code cells that add m
to the berlin
map!
m.add_to(berlin)
berlin
Place
Class (continued): Marker Representation¶Q20: Finish the .as_marker()
method that returns a Marker
instance when invoked on a Place
instance! The method takes an optional color
argument that uses folium 's
Icon
type to control the color of the marker.
class Place:
"""A place connected to the Google Maps Geocoding API."""
# answers from above
# answer to Q20
def as_marker(self, *, color="blue"):
"""Create a Marker representation of the place.
Args:
color (str): color of the marker, defaults to "blue"
Returns:
marker (folium.Marker)
Raises:
RuntimeError: if the place is not synchronized with
the Google Maps Geocoding API
"""
if not self.place_id:
raise RuntimeError("must synchronize with Google first")
return folium.Marker(
location=...,
popup=...,
tooltip=...,
icon=folium.Icon(color=color),
)
Q21: Execute the next code cells that create a new Place
and obtain a Marker
for it!
Notes: Without synchronization, we get a RuntimeError
. .as_marker()
can be chained right after .sync_from_google()
brandenburg_gate = Place("Brandenburger Tor, Pariser Platz, Berlin", client=api)
brandenburg_gate.as_marker()
brandenburg_gate.sync_from_google().as_marker()
Q22: Use the alternative .from_addresses()
constructor to create a list
named places
with already synced Place
s!
places = Place.from_addresses(sights, client=api, sync=True)
places
Map
Class¶To make folium 's
Map
class work even better with our Place
instances, we write our own Map
class wrapping folium 's. Then, we add further functionality to the class throughout this tutorial.
The .__init__()
method takes mandatory name
, center
, start
, end
, and places
arguments. name
is there for convenience, center
is the map's initial center, start
and end
are Place
instances, and places
is a finite iterable of Place
instances. Also, .__init__()
accepts an optional initial_zoom
argument defaulting to 12
.
Upon initialization, a folium.Map
instance is created and stored as an implementation detail _map
. Also, .__init__()
puts markers for each place on the _map
object: "green"
and "red"
markers for the start
and end
locations and "blue"
ones for the places
to be visited. To do that, .__init__()
invokes another .add_marker()
method on the Map
class, once for every Place
object. .add_marker()
itself invokes the .add_to()
method on the folium.Marker
representation of a Place
instance and enables method chaining.
To keep the state in a Map
instance consistent, all passed in arguments except name
are treated as implementation details. Otherwise, a user of the Map
class could, for example, change the start
attribute, which would not be reflected in the internally kept folium.Map
object.
Q23: Implement the .__init__()
and .add_marker()
methods on the Map
class as described!
Q24: Add a .show()
method on the Map
class that simply returns the internal folium.Map
object!
class Map:
"""A map with plotting and routing capabilities."""
# answer to Q23
def __init__(self, name, center, start, end, places, initial_zoom=12):
"""Create a new map.
Args:
name (str): name of the map
center (float, float): coordinates of the map's center
start (Place): start of the tour
end (Place): end of the tour
places (iterable of Places): the places to be visitied
initial_zoom (integer): zoom level according to folium's
specifications; defaults to 12
"""
self.name = name
...
...
...
... = folium.Map(...)
# Add markers to the map.
...
...
for place in places:
...
def __repr__(self):
return f"<Map: {self.name}>"
# answer to Q24
def show(self):
"""Return a folium.Map representation of the map."""
return ...
# answer to Q23
def add_marker(self, marker):
"""Add a marker to the map.
Args:
marker (folium.Marker): marker to be put on the map
"""
...
return ...
Let's put all the sights, the two airports, and three more places, the Bundeskanzleramt , the Olympic Stadium
, and the East Side Gallery
, on the map.
Q25: Execute the next code cells to create a map of Berlin with all the places on it!
Note: Because we implemented method chaining everywhere, the code below is only one expression written over several lines. It almost looks like a self-explanatory and compact "language" on its own.
berlin = (
Map(
"Sights in Berlin",
center=(52.5015154, 13.4066838),
start=Place(arrival, client=api).sync_from_google(),
end=Place(departure, client=api).sync_from_google(),
places=places,
initial_zoom=10,
)
.add_marker(
Place("Bundeskanzleramt, Willy-Brandt-Straße, Berlin", client=api)
.sync_from_google()
.as_marker(color="orange")
)
.add_marker(
Place("Olympiastadion Berlin", client=api)
.sync_from_google()
.as_marker(color="orange")
)
.add_marker(
Place("East Side Gallery, Berlin", client=api)
.sync_from_google()
.as_marker(color="orange")
)
)
berlin
berlin.show()
Before we can find out the best order in which to visit all the sights, we must calculate the pairwise distances between all points. While Google also offers a Directions API and a Distance Matrix API, we choose to calculate the air distances using the third-party library geopy .
Q26: Familiarize yourself with the documentation and install geopy with the
pip
command line tool!
!pip install geopy
We use geopy primarily for converting the
latitude
-longitude
coordinates into a distance matrix .
Because the earth is not flat , geopy
provides a
great_circle()
function that calculates the so-called orthodromic distance between two places on a sphere.
from geopy.distance import great_circle
Q27: For quick reference, read the docstring of great_circle()
and execute the code cells below to calculate the distance between the arrival
and the departure
!
great_circle?
tegel = Place(arrival, client=api).sync_from_google()
schoenefeld = Place(departure, client=api).sync_from_google()
great_circle(tegel.location, schoenefeld.location)
great_circle(tegel.location, schoenefeld.location).km
great_circle(tegel.location, schoenefeld.location).meters
Place
Class (continued): Distance to another Place
¶Q28: Finish the distance_to()
method in the Place
class that takes a other
argument and returns the distance in meters! Adhere to the given docstring!
class Place:
"""A place connected to the Google Maps Geocoding API."""
# answers from above
# answer to Q28
def distance_to(self, other):
"""Calculate the distance to another place in meters.
Args:
other (Place): the other place to calculate the distance to
Returns:
distance (int)
Raises:
RuntimeError: if one of the places is not synchronized with
the Google Maps Geocoding API
"""
if not self.place_id or not other.place_id:
raise RuntimeError("must synchronize places with Google first")
return ...
Q29: Execute the code cells below to test the new feature!
Note: If done right, object-oriented code reads almost like plain English.
tegel = Place(arrival, client=api).sync_from_google()
schoenefeld = Place(departure, client=api).sync_from_google()
tegel.distance_to(schoenefeld)
Q30: Execute the next code cell to instantiate the Place
s in sights
again!
places = Place.from_addresses(sights, client=api, sync=True)
places
Map
Class (continued): Pairwise Distances¶Now, we add a read-only distances
property on our Map
class. As we are working with air distances, these are symmetric which reduces the number of distances we must calculate.
To do so, we use the combinations() generator function in the itertools
module in the standard library
. That produces all possible
r
-tuple
s from an iterable
argument. r
is 2
in our case as we are looking at origin
-destination
pairs.
Let's first look at an easy example of combinations() to understand how it works: It gives us all the
2
-tuple
s from a list
of five numbers
disregarding the order of the tuple
s' elements.
import itertools
numbers = [1, 2, 3, 4, 5]
for x, y in itertools.combinations(numbers, 2):
print(x, y)
distances
uses the internal ._start
, ._end
, and ._places
attributes and creates a dict
with the keys consisting of all pairs of Place
s and the values being their distances in meters. As this operation is rather costly, we cache the distances the first time we calculate them into a hidden instance attribute ._distances
, which must be initialized with None
in the .__init__()
method.
Q31: Finish the .distances
property as described!
class Map:
"""A map with plotting and routing capabilities."""
# answers from above with a tiny adjustment
# answer to Q31
...
def distances(self):
"""Return a dict with the pairwise distances of all places.
Implementation note: The results of the calculations are cached.
"""
if not self._distances:
distances = ...
all_pairs = itertools.combinations(
...,
r=2,
)
for origin, destination in all_pairs:
distance = ...
distances[origin, destination] = distance
distances[destination, origin] = distance
self._distances = ...
return ...
We pretty print the total distance matrix.
berlin = Map(
"Berlin",
center=(52.5015154, 13.4066838),
start=Place(arrival, client=api).sync_from_google(),
end=Place(departure, client=api).sync_from_google(),
places=places,
initial_zoom=10,
)
pprint(berlin.distances)
How can we be sure the matrix contains all possible pairs? As we have 9 sights
plus the start
and the end
of the tour, we conclude that there must be 11 * 10 = 110
distances excluding the 0
distances of a Place
to itself that are not in the distance matrix.
n_places = len(places) + 2
n_places * (n_places - 1)
len(berlin.distances)
Let us find the cost minimal order of traveling from the arrival
airport to the departure
airport while visiting all the sights
.
This problem can be expressed as finding the shortest so-called Hamiltonian path from the
start
to end
on the Map
(i.e., a path that visits each intermediate node exactly once). With the "hack" of assuming the distance of traveling from the end
to the start
to be 0
and thereby effectively merging the two airports into a single node, the problem can be viewed as a so-called traveling salesman problem (TSP).
The TSP is a hard problem to solve but also well studied in the literature. Assuming symmetric distances, a TSP with $n$ nodes has $\frac{(n-1)!}{2}$ possible routes. $(n-1)$ because any node can be the start
/ end
and divided by $2$ as the problem is symmetric.
Starting with about $n = 20$, the TSP is almost impossible to solve exactly in a reasonable amount of time. Luckily, we do not have that many sights
to visit, and so we use a brute force approach and simply loop over all possible routes to find the shortest.
In the case of our tourist, we "only" need to try out 181_440
possible routes because the two airports are effectively one node and $n$ becomes $10$.
import math
math.factorial(len(places) + 1 - 1) // 2
Analyzing the problem a bit further, all we need is a list of permutations of the sights as the two airports are always the first and last location.
The permutations() generator function in the itertools
module in the standard library
helps us with the task. Let's see an example to understand how it works.
numbers = [1, 2, 3]
for permutation in itertools.permutations(numbers):
print(permutation)
However, if we use permutations() as is, we try out redundant routes. For example, transferred to our case, the tuples
(1, 2, 3)
and (3, 2, 1)
represent the same route as the distances are symmetric and the tourist could be going in either direction. To obtain the unique routes, we use an if
-clause in a "hacky" way by only accepting routes where the first node has a smaller value than the last. Thus, we keep, for example, (1, 2, 3)
and discard (3, 2, 1)
.
for permutation in itertools.permutations(numbers):
if permutation[0] < permutation[-1]:
print(permutation)
In order to compare Place
s as numbers, we would have to implement, among others, the .__eq__()
special method. Otherwise, we get a TypeError
.
Place(arrival, client=api) < Place(departure, client=api)
As a quick and dirty solution, we use the .location
property on a Place
to do the comparison.
Place(arrival, client=api).location < Place(departure, client=api).location
As the code cell below shows, combining permutations() with an
if
-clause results in the correct number of routes to be looped over.
sum(
1
for route in itertools.permutations(places)
if route[0].location < route[-1].location
)
To implement the brute force algorithm, we split the logic into two methods.
First, we create an .evaluate()
method that takes a route
argument that is a sequence of Place
s and returns the total distance of the route. Internally, this method uses the .distances
property repeatedly, which is why we built in caching above.
Q32: Finish the .evaluate()
method as described!
Second, we create a .brute_force()
method that needs no arguments. It loops over all possible routes to find the shortest. As the start
and end
of a route are fixed, we only need to look at permutation
s of inner nodes. Each permutation
can then be traversed in a forward and a backward order. .brute_force()
enables method chaining as well.
Q33: Finish the .brute_force()
method as described!
Map
Class (continued): Brute Forcing the TSP¶class Map:
"""A map with plotting and routing capabilities."""
# answers from above
# answer to Q32
def evaluate(self, route):
"""Calculate the total distance of a route.
Args:
route (sequence of Places): the ordered nodes in a tour
Returns:
cost (int)
"""
cost = ...
# Loop over all pairs of consecutive places.
origin = ...
for destination in ...:
cost += self.distances[...]
...
return ...
# answer to Q33
def brute_force(self):
"""Calculate the shortest route by brute force.
The route is plotted on the folium.Map.
"""
# Assume a very high cost to begin with.
min_cost = ...
best_route = None
# Loop over all permutations of the intermediate nodes to visit.
for permutation in ...:
# Skip redundant permutations.
if ...:
...
# Travel through the routes in both directions.
for route in (permutation, permutation[::-1]):
# Add start and end to the route.
route = (..., *route, ...)
# Check if a route is cheaper than all routes seen before.
cost = ...
if ...:
min_cost = ...
best_route = ...
# Plot the route on the map
folium.PolyLine(
[x.location for x in best_route],
color="orange", weight=3, opacity=1
).add_to(self._map)
return ...
Q34: Find the best route for our tourist by executing the code cells below!
berlin = Map(
"Berlin",
center=(52.4915154, 13.4066838),
start=Place(arrival, client=api).sync_from_google(),
end=Place(departure, client=api).sync_from_google(),
places=places,
initial_zoom=12,
)
berlin.brute_force().show()