Archaeologists regularly work with maps and data about where sites, samples and objects are found. We ask lots of questions that have a spatial component. Which Bronze Age cairns are close to the coast in England? In this excavation, is bone found inside a building or outside in the yard? It's important to learn to work with spatial data and maps in order to succeed in a variety of careers in archaeology and heritage management.
To start working with spatial data and maps, you need to put together your toolkit. You're currently working inside something called a jupyter notebook. It's a place to keep notes, pictures, code and maps together. You can add tools and data into your jupyter notebook and then use them to ask spatial questions and make maps and visualisations that help answer those questions.
The aim of this exercise is for you to:
%matplotlib inline
# Matplotlib is your tool for drawing graphs and basic maps. You need this!
import pandas as pd
import requests
import fiona
import geopandas as gpd
import ipywidgets as widgets
import bokeh
# These are what we call prerequisites. They are basic toosl you need to get started.
# Pandas manipulate data. Geo-pandas manipulate geographic data. They're also black and white and like to eat bamboo...
# You need these to manipulate your data!
# Fiona helps with geographic data.
# Requests are for asking for things. It's good to be able to ask for things.
# ipywidgets supports interactivity.
# Remember to hit Ctrl+Enter to make things happen!
url = 'https://raw.githubusercontent.com/ropitz/spatialarchaeology/master/data/gabii_areaB_published2.geojson'
# This is where I put the data. It's in a format called geojson, used to represent geometry (shapes) and attributes (text).
request = requests.get(url)
# Please get me the data at that web address (url)
b = bytes(request.content)
# I will use the letter 'b' to refer to the data, like a nickname
with fiona.BytesCollection(b) as f:
crs = f.crs
gabii_areaB = gpd.GeoDataFrame.from_features(f, crs=crs)
print(gabii_areaB.head())
# I will use the fiona tool to wrap up all the data from 'b', check the coordinate system (crs) listed in the features
# and print out the first few lines of the file so I can check everything looks ok.
# Don't worry if you don't understand all the details of this part!
You should have geometry information about polygons (shapes) of the contexts from area B and a bunch of descriptions and interpretation of the archaeology. Importatly you should be able to spot the column names: Descriptio, objectid, shape_area, shape_leng, SU, definition, finds_note, formation, geometry, interpret. Each column contains a different type of information. Note that SU (stratigraphic unit) = context.
# Let's visualise the data to double check that all is well
gabii_map1 = gabii_areaB.plot(column='SHAPE_Area', cmap='Pastel2', edgecolor='grey', figsize=(10, 10));
# 'plot' means draw me an image showing the geometry of each feature in my data.
# We want to control things like the color of the individual features on our map.
# I used the pastel colorscale command (cmap stands for 'colour map')
# and asked it to draw the polygons differently based on their size.
This is good, but what if you only want to look at one kind of context? We can select specific types of contexts from within our dataset by searching (aka querying) for them.
How do we know what kind of contexts we have? Looking at what's inside the data describing all those polygon shapes on the map should help.
Start by printing out our data in a tidy way. Just type its name...
gabii_areaB
In archaeology we often talk about how contexts get created. This is referred to as their 'formation process' - how they get formed. We will describe different formation processes as deposits, cuts, fills, constructions, etc. Look at the 'formation' column in the table, and you'll see these terms in many of the entries.
# Say you only want to look at the contexts formed by cutting a.k.a. 'Cuts'. Pandas use square brackets [] to make selections.
# Here we select all the rows (.loc) where the column 'formation' has the value 'cutting'. == means 'this thing is true' in code
gabii_areaB.loc[gabii_areaB['formation'] == 'Cutting']
# If we want to see this result as a map, we just add the .plot command to the end.
gabii_areaB.loc[gabii_areaB['formation'] == 'Cutting'].plot(column='SHAPE_Area', cmap='OrRd', figsize=(10, 10))
# Try and do the same thing for contexts that are construction type features
gabii_areaB.loc[gabii_areaB['formation'] == 'Construction']
# Remember to draw it as a map!
gabii_areaB.loc[gabii_areaB['formation'] == 'Construction'].plot(column='SHAPE_Area', cmap='OrRd', figsize=(10, 10))
# Let's save these selections of 'constructions' and 'cuts' so we can use them again.
# I've given them names here. These are now 'named variables'
gabii_areaB_constructions = gabii_areaB.loc[gabii_areaB['formation'] == 'Construction']
gabii_areaB_cuts = gabii_areaB.loc[gabii_areaB['formation'] == 'Cutting']
#Test your named variable by printing it out again, calling it by its name.
gabii_areaB_cuts
So far these searches have been about the attributes of our data. Attributes are information that desribes the pologyons We can also ask questions about spatial relationships or real-world location. First, let's find the bounding box, or real world location and extents of our data. These are the smallest and largest coordinates on the x- and y- axes in the real world. We use the command 'total_bounds' to ask this question. Things 'in bounds' are inside the box.
gabii_areaB_cuts.total_bounds
# Now do the same thing for the constructions. The results should be similar, but not identical.
gabii_areaB_constructions.total_bounds
#Now we will select a single context, SU 5018, a nice wall.
gabii_areaB_5018 = gabii_areaB_constructions.loc[gabii_areaB_constructions['SU'] == 5018]
gabii_areaB_5018
gabii_areaB_5018.plot()
#Now we will select another single context, SU 1178, a nice floor surface. Ok, an oddly shaped floor surface.
gabii_areaB_1178 = gabii_areaB_constructions.loc[gabii_areaB_constructions['SU'] == 1178]
gabii_areaB_1178
gabii_areaB_1178.plot()
Maybe someone was digging a pit or a grave. We can check.
# We use the '.overlay' command to see which contexts from the 'cuts' dataset intersect the floor context# 1178
gpd.overlay(gabii_areaB_1178, gabii_areaB_cuts, how='intersection')
Your results should show one context - 1178 that is a cut that has a spatial relationship with the floor.
We can make a map showing their relationship spatially.
# To do this we have to provide a list of the values we are interested in seeing on the map, in square brackets []
gabii_areaB.loc[gabii_areaB['SU'].isin([1178,1235])].plot(column='SU', cmap='Paired', figsize=(10, 10))
What does this map suggest? I'd say there is another reason for the funny shape of the floor. It is just barely cut by the context 1178.
# Start by printing out the list of all the polygons that represent places where cuts and constructions intersect.
gpd.overlay(gabii_areaB_constructions, gabii_areaB_cuts, how='intersection', use_sindex=True)
#Now draw all the places our construction features intersect with cuts. Our site is like swiss cheese.
gpd.overlay(gabii_areaB_constructions, gabii_areaB_cuts, how='intersection').plot(column='SHAPE_Area', cmap='Accent', edgecolor='grey', figsize=(15, 15))
#You'll notice I picked a differnt colour map this time, and made the figure a bit larger.
'Soils' isn't one of our formations, so we must look elsewhere in our data. Looking at the definitions above, soil is mentioned under the 'definition' column sometimes but it's not the only word in that entry. For example, an entry might read 'Soil below tufo floor 1417'. So we need to look inside the 'definition' entry and search for the word 'soil'. We do this by calling every entry (value) within that column a string 'str' and asking what that string 'contains'. If the word 'soil' appears anywhere, it will match.
gabii_areaB_soil = gabii_areaB[gabii_areaB['definition'].str.contains('soil')]
gabii_areaB_soil
# Now create a map of all the contexts defined as being made of soil.
gabii_areaB_soil.plot(column='SHAPE_Area', cmap='Accent', edgecolor='grey', figsize=(15, 15))
This time let's look for the contexts that have notable finds and see if they are near our construction contextx
Step 1: define your dataset of contexts that have notable finds. This is everwhere that has text in that field, or is 'not null'.
gabii_areaB_cool_finds = gabii_areaB[gabii_areaB['finds_note'].notnull()]
gabii_areaB_cool_finds
Define a 2m area around all the contexts with cool finds. 2m is pretty close by.
gabii_areaB_cool_finds_2m = gabii_areaB_cool_finds.buffer(2)
gabii_areaB_cool_finds_2m.plot(cmap='Accent', edgecolor='grey', figsize=(15, 15))
# Now we plot the intersection between the buffered finds polygons and our construction polygons
gabii_areaB_cool_finds_2m.intersection(gabii_areaB_constructions).plot(cmap='Accent', edgecolor='grey', figsize=(15, 15))
Hopefully you learned to: