Learning about spatial data and maps for archaeology (and other things)

Spatial Thinking and Skills Exercise for Theory and Practice

Made by Rachel Opitz, Archaeology, University of Glasgow

Understanding the meanings behind patterns of finds recovered through excavation is a tricky problem. We hope to distinguish activity areas, places devoted to domestic and industrial use, or inhabited places that are distinct from liminal ones. To successfully unravel these patterns, we must look not only at the distributions of different types of finds, but also at how they correlate with one another, at the character of the contexts in which they were recovered, and at their physical and social characteristics. Are they likely to be curated? Are they light and likely to be moved from one area to another by post-depositional processes? It's all a bit of a mess.

The aim of this exercise is for you to:

  • learn to work with real special finds data from an excavation, in all its messiness
  • start thinking about quantitative and spatial approaches to finds data from excavations and how they can help us better understand the patterns we see

You'll do this using data collected by the Gabii Project, a 10+ year excavation in central Italy.

As you may recall from Archaeology of Scotland, to start working with spatial data and imagery, you need to put together your toolkit. You're currently working inside something called a Jupyter notebook. It's a place to keep notes, pictures, code and maps together. You can add tools and data into your Jupyter notebook and then use them to ask spatial questions and make maps and visualisations that help answer those questions.

Let's get started... Hit 'Ctrl'+'Enter' to run the code in any cell in the page.

In [1]:
%matplotlib inline
# Matplotlib is your tool for drawing graphs and basic maps. You need this!

import pandas as pd
import requests
import fiona
import geopandas as gpd
import ipywidgets as widgets

# These are what we call prerequisites. They are basic tools you need to get started.
# Pandas manipulates data. Geopandas manipulates geographic data. They're also black and white and like to eat bamboo... 
# You need these to manipulate your data!
# Fiona helps with geographic data.
# Requests are for asking for things. It's good to be able to ask for things.
# ipywidgets supports interactivity.


# Remember to hit Ctrl+Enter to make things happen!
In [ ]:
url = 'http://ropitz.github.io/digitalantiquity/data/gabii_SU.geojson'
# This is where I put the data. It's in a format called geojson, used to represent geometry (shapes) and attributes (text).
request = requests.get(url)
# Please get me the data at that web address (url)
b = bytes(request.content)
# I will use the letter 'b' to refer to the data, like a nickname
with fiona.BytesCollection(b) as f:
    crs = f.crs
    gabii_su_poly = gpd.GeoDataFrame.from_features(f, crs=crs)
    print(gabii_su_poly.head())
# I will use the fiona tool to wrap up all the data from 'b', check the coordinate system (crs) listed in the features
# and print out the first few lines of the file so I can check everything looks ok. 
# Don't worry if you don't understand all the details of this part!
In [ ]:
# Now we have polygons, the shapes of our contexts. Let's visualise the data to double check that all is well

gabii_map1 = gabii_su_poly.plot(column='DESCRIPTIO', cmap='Blues', edgecolor='grey', figsize=(15, 15));
# 'plot' means draw me an image showing the geometry of each feature in my data. 
# We want to control things like the color of different types of features on our map. 
# I used the 'Blues' colorscale command (cmap stands for 'colour map') 
# and asked it to draw the polygons differently based on the type of feature.

The colorscale options are: Accent, Accent_r, Blues, Blues_r, BrBG, BrBG_r, BuGn, BuGn_r, BuPu, BuPu_r, CMRmap, CMRmap_r, Dark2, Dark2_r, GnBu, GnBu_r, Greens, Greens_r, Greys, Greys_r, OrRd, OrRd_r, Oranges, Oranges_r, PRGn, PRGn_r, Paired, Paired_r, Pastel1, Pastel1_r, Pastel2, Pastel2_r, PiYG, PiYG_r, PuBu, PuBuGn, PuBuGn_r, PuBu_r, PuOr, PuOr_r, PuRd, PuRd_r, Purples, Purples_r, RdBu, RdBu_r, RdGy, RdGy_r, RdPu, RdPu_r, RdYlBu, RdYlBu_r, RdYlGn, RdYlGn_r, Reds, Reds_r, Set1, Set1_r, Set2, Set2_r, Set3, Set3_r, Spectral, Spectral_r, Wistia, Wistia_r, YlGn, YlGnBu, YlGnBu_r, YlGn_r, YlOrBr, YlOrBr_r, YlOrRd, YlOrRd_r, afmhot, afmhot_r, autumn, autumn_r, binary, binary_r, bone, bone_r, brg, brg_r, bwr, bwr_r, cividis, cividis_r, cool, cool_r, coolwarm, coolwarm_r, copper, copper_r, cubehelix, cubehelix_r, flag, flag_r, gist_earth, gist_earth_r, gist_gray, gist_gray_r, gist_heat, gist_heat_r, gist_ncar, gist_ncar_r, gist_rainbow, gist_rainbow_r, gist_stern, gist_stern_r, gist_yarg, gist_yarg_r, gnuplot, gnuplot2, gnuplot2_r, gnuplot_r, gray, gray_r, hot, hot_r, hsv, hsv_r, inferno, inferno_r, jet, jet_r, magma, magma_r, nipy_spectral, nipy_spectral_r, ocean, ocean_r, pink, pink_r, plasma, plasma_r, prism, prism_r, rainbow, rainbow_r, seismic, seismic_r, spring, spring_r, summer, summer_r, tab10, tab10_r, tab20, tab20_r, tab20b, tab20b_r, tab20c, tab20c_r, terrain, terrain_r, viridis, viridis_r, winter, winter_r

Swap out 'Blues' in the cell above for any of these options...

In [ ]:
# Now I'm going to bring in all the basic Gabii special finds data - descriptions, object types, IDs and the contexts from which they come.
# We've had a few special finds over the years.
sf_su = pd.read_csv("https://raw.githubusercontent.com/ropitz/gabii_experiments/master/spf_SU.csv")
sf_su

One of our area supervisors, Troy, is super excited about tools related to textile production. They're a great example of how we think about special finds at Gabii. Multiple types of finds are related to textile production. Do we find all types everywhere? Are certain types of tools more concentrated in one type of context or one area than others? Troy has lots of questions about the patterns of places where we find these tools. Do they provide evidence for early textile production? Are they a major factor in the city's early wealth? Do we find the same things in later periods? After all, people under the Republic and Empire wore clothes... Loom weights, spools and spindle whorls are the most common weaving tools at Gabii.

In [ ]:
#Let's pull all those find types out of the big list.
types = ['Loom Weight','Spool','Spindle Whorl']
textile_tools = sf_su.loc[sf_su['SF_OBJECT_TYPE'].isin(types)]
textile_tools
In [ ]:
# Now let's count up how many of these tools appear in each context (SU).
# This command will print out a list of the number of textile tools in each SU next to that SU number.
textile_tools['SU'].value_counts()
In [ ]:
#Then let's combine our polygons representing context shape and location
#with the special finds data
# We do this with a command called 'merge'

gabii_textools = gabii_su_poly.merge(textile_tools, on='SU')

# adding .head() to the end of a dataframe name will print out just the first few rows.
gabii_textools.head()
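If 'merge' is new to you, here's a minimal sketch with invented data (the SU numbers and values below are made up for illustration): rows from the two tables are matched wherever their 'SU' values agree, and the columns from both tables are combined.

```python
import pandas as pd

# Two toy tables sharing an 'SU' column (invented numbers, not Gabii records)
shapes = pd.DataFrame({'SU': [1001, 1002, 1003], 'Shape_Area': [12.5, 40.0, 7.2]})
finds = pd.DataFrame({'SU': [1002, 1003], 'SF_OBJECT_TYPE': ['Spool', 'Loom Weight']})

# Only SUs present in both tables survive the default (inner) merge
merged = shapes.merge(finds, on='SU')
print(merged)
```

Notice that SU 1001 drops out: it has a shape but no finds, so an inner merge discards it.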
In [ ]:
# If we want to see this result as a map, we just add the .plot command to the end of the dataframe's name

gabii_textools.plot(column='SF_OBJECT_TYPE', cmap='Accent', figsize=(15, 15), legend=True, alpha=0.5)

OK, what do you see here? Compare the distribution of each type of textile tool. Do some types seem to be concentrated in certain areas? How might you check? What factors might contribute to this pattern? Do big layers simply aggregate lots of stuff? Do late dumps contain early materials? Why would one type of tool appear where the others don't?

In [ ]:
# We can try and see the relationship between layer size and count by sorting
#our list of finds by the surface area of each layer.
# We use the command 'sort_values'
gabii_textools.sort_values(by=['Shape_Area'],ascending=False)
In [ ]:
# We have a couple of enormous colluvial layers that should probably be excluded.
# Outliers will mess with your analysis. Cut out these layers by excluding SUs with a surface area greater than 800.
gabii_textools2 = gabii_textools.loc[gabii_textools['Shape_Area']<800]
# We'll plot the result in the next cell to check.
In [ ]:
# That's better. Plot the results to see that you've removed the big colluvial layers.
gabii_textools2.plot(column='SF_OBJECT_TYPE', cmap='Accent', figsize=(15, 15), legend=True, alpha=0.5)
In [ ]:
# OK, count up how many of each tool type appears in each SU using the 'groupby' command
textools_counts = gabii_textools2.groupby('SU')['SF_OBJECT_TYPE'].value_counts().unstack().fillna(0)
# Sort the list so that the SUs with the most stuff end up at the top.
textools_counts.sort_values(by=['Loom Weight','Spindle Whorl','Spool'], ascending=False)
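Here's how that groupby-count-unstack chain works on a tiny invented finds list (the SU numbers below are made up): each find starts as a row, and we end up with one row per SU and one column per tool type.

```python
import pandas as pd

# Invented finds list: one row per find, with its context (SU) and type
finds = pd.DataFrame({
    'SU': [1001, 1001, 1001, 1002, 1002, 1003],
    'SF_OBJECT_TYPE': ['Spool', 'Spool', 'Loom Weight',
                       'Loom Weight', 'Spool', 'Spindle Whorl'],
})

# Count each type within each SU, pivot the types out into columns,
# and fill in 0 where a type never turns up in that SU
counts = finds.groupby('SU')['SF_OBJECT_TYPE'].value_counts().unstack().fillna(0)
print(counts)
```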
In [ ]:
# Merge your textile tool counts with your spatial data for the contexts
# Because both dataframes have a 'SU' column, you can use this to match up the rows. 
gabii_textools_counts = gabii_su_poly.merge(textools_counts, on='SU')
gabii_textools_counts.head()
In [ ]:
# Let's start by looking at each class of textile tool individually. 
# Plot the counts of each type of find spatially
gabii_textools_counts.plot(column='Loom Weight', cmap='Accent', figsize=(15, 15), legend=True, alpha=0.5)
gabii_textools_counts.plot(column='Spindle Whorl', cmap='Accent', figsize=(15, 15), legend=True, alpha=0.5)
gabii_textools_counts.plot(column='Spool', cmap='Accent', figsize=(15, 15), legend=True, alpha=0.5)
In [ ]:
# Overlay all three tool types on a single map, one colour ramp each.
base = gabii_textools_counts.plot(column='Loom Weight', cmap='Blues', figsize=(15, 15), legend=True, alpha=0.7)
gabii_textools_counts.plot(ax=base, column='Spindle Whorl', cmap='Reds', alpha=0.7)
gabii_textools_counts.plot(ax=base, column='Spool', cmap='Greens', alpha=0.7);
In [ ]:
# It's hard to see what's happening when we have to scroll. 
# Let's put the maps side by side.
import matplotlib.pyplot as plt
fig, axes = plt.subplots(ncols=3,figsize=(15, 5))
gabii_textools_counts.plot(column='Loom Weight', cmap='autumn',  ax=axes[0], legend=True).axis('equal')
gabii_textools_counts.plot(column='Spindle Whorl', cmap='autumn', ax=axes[1]).axis('equal')
gabii_textools_counts.plot(column='Spool', cmap='autumn',ax=axes[2]).axis('equal')

Can you see any patterns here? Do the different types of tools concentrate in the same parts of the site? Why might different types of tools have different distributions?

In [ ]:
# I think the distributions of different weaving tools vary.
# To investigate further, we are going to need more tools.
import pysal  # spatial analysis tools (newer releases split these into libpysal and friends)
from sklearn import cluster  # scikit-learn's clustering algorithms, including k-means
import seaborn as sns  # statistical plotting
import numpy as np  # fast numerical arrays

We're going to use cluster analysis to try and better understand our patterns. Clustering is a broad set of techniques for finding groups within a data set. When we cluster observations, we want items in the same group to be similar and items in different groups to be dissimilar. Clustering allows us to identify which things are alike on the basis of multiple characteristics. K-means clustering is a simple and frequently applied clustering method for splitting a dataset into a set of k (k being an arbitrary number you get to choose) groups.
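Before applying it to the Gabii data, here's k-means in miniature on some invented 2D points: two obvious groups, and the algorithm recovers them. This is just a sketch of the technique, not part of the Gabii analysis.

```python
import numpy as np
from sklearn import cluster

# Two obvious groups of invented points: one near (0, 0), one near (10, 10)
pts = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])

# Ask k-means for k=2 groups; each point gets a cluster label (0 or 1)
km = cluster.KMeans(n_clusters=2, n_init=10, random_state=0)
labels = km.fit(pts).labels_
print(labels)
```

Points in the same group share a label; which group gets which number is arbitrary.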

In [ ]:
# Next step: cluster together contexts where the pattern of the three types of textile tools are similar, 
# with and without respect to the size of the context.
# Make 5 clusters and account for the size of the context and counts of different types of tools. Drop all the other fields.
km5 = cluster.KMeans(n_clusters=5)
km5cls = km5.fit(gabii_textools_counts.drop(['geometry', 'OBJECTID','DESCRIPTIO','Shape_Length','SU'], axis=1).values)
km5cls

Each cluster produced should contain the SUs that are similar to one another on the basis of the number of each type of textile tool and the size of the surface area of the SU.

In [ ]:
# Plot the clusters, groups of contexts that have similar textile tool assemblages.
# Give a different colour to the SUs that belong to each cluster.

f1, ax = plt.subplots(1, figsize=(15,15))

gabii_textools_counts.assign(cl=km5cls.labels_)\
   .plot(column='cl', categorical=True, legend=True, \
         linewidth=0.1, cmap='Accent', edgecolor='white', ax=ax)

ax.set_axis_off()

plt.show()
In [ ]:
#Do the same, ignoring the size of the context.
km5 = cluster.KMeans(n_clusters=5)
km5cls2 = km5.fit(gabii_textools_counts.drop(['geometry', 'OBJECTID','DESCRIPTIO','Shape_Length','SU','Shape_Area'], axis=1).values)
f2, ax = plt.subplots(1, figsize=(15,15))

gabii_textools_counts.assign(cl2=km5cls2.labels_)\
   .plot(column='cl2', categorical=True, legend=True, \
         linewidth=0.1, cmap='Accent', edgecolor='white', ax=ax)

ax.set_axis_off()

plt.show()

The patterns are definitely different. How can we interpret the fact that context size affects the pattern of the distribution of textile tools? Do big units, which perhaps represent dumps or colluvial mashups, have a fundamentally different character than the varied small contexts?
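Part of the answer is mechanical: k-means works on raw Euclidean distances, so a column measured in large numbers (like surface area) can swamp columns measured in small ones (like tool counts). A toy sketch with invented contexts makes the point:

```python
import numpy as np
from sklearn import cluster

# Invented contexts: columns are [loom weights, spools, surface area].
# A and C are huge layers; B and D are small. A/B are loom-weight rich; C/D are spool rich.
with_area = np.array([
    [5, 0, 1000.0],   # A
    [5, 0, 10.0],     # B
    [0, 5, 1000.0],   # C
    [0, 5, 10.0],     # D
])
counts_only = with_area[:, :2]  # drop the area column

lab_area = cluster.KMeans(n_clusters=2, n_init=10, random_state=0).fit(with_area).labels_
lab_counts = cluster.KMeans(n_clusters=2, n_init=10, random_state=0).fit(counts_only).labels_

# With area included, the big layers A and C group together (area dominates the distance);
# on counts alone, A pairs with B and C with D (assemblage composition dominates).
print(lab_area, lab_counts)
```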

In [ ]:
# Look at the difference with and without context size taken into account.
fig, axes = plt.subplots(ncols=2,figsize=(15, 5))
gabii_textools_counts.assign(cl2=km5cls2.labels_)\
   .plot(column='cl2', categorical=True, legend=True, \
         linewidth=0.1, cmap='Accent', edgecolor='white', ax=axes[0]).axis('equal')
gabii_textools_counts.assign(cl=km5cls.labels_)\
   .plot(column='cl', categorical=True, legend=True, \
         linewidth=0.1, cmap='Accent', edgecolor='white', ax=axes[1]).axis('equal')
In [30]:
# assign the cluster IDs to each context permanently
gabiitextools_clas = gabii_textools_counts.assign(cl=km5cls.labels_)
gabiitextools_class = gabiitextools_clas.assign(cl2=km5cls2.labels_)
gabiitextools_class.head()
Out[30]:
DESCRIPTIO OBJECTID SU Shape_Area Shape_Length geometry Loom Weight Spindle Whorl Spool cl cl2
0 DEP 28 2258 77.018159 45.450094 POLYGON Z ((1416013.191 5144212.5308 61.347999... 0.0 0.0 1.0 2 0
1 DEP 44 516 43.687257 39.119255 POLYGON Z ((1415914.1831 5144202.0853 64.69299... 1.0 0.0 1.0 3 0
2 DEP 53 587 63.212851 42.805154 POLYGON Z ((1415907.6408 5144196.9608 64.63250... 1.0 0.0 0.0 2 4
3 DEP 77 1306 21.609902 35.183961 POLYGON Z ((1415935.544 5144182.803 63.4133000... 0.0 0.0 1.0 3 0
4 DEP 115 1327 122.494594 52.841679 POLYGON Z ((1415936.349 5144162.362 63.1217999... 1.0 0.0 0.0 4 4
In [31]:
# Now let's look at some individual classes, with and without context size accounted for in the analyses.
gabiitextools_class0=gabiitextools_class.loc[gabiitextools_class['cl']==0]
gabiitextools_class0noarea=gabiitextools_class.loc[gabiitextools_class['cl2']==0]
fig, axes = plt.subplots(ncols=2,figsize=(15, 5))
gabiitextools_class0.plot(ax=axes[0], legend=True).axis('equal')
gabiitextools_class0noarea.plot(ax=axes[1]).axis('equal')
Out[31]:
(1415891.3709749999, 1416036.1953249997, 5144113.125025001, 5144254.3864750005)
In [32]:
# What happens when we change the number of clusters (groups)?
km7 = cluster.KMeans(n_clusters=7)
km7cls3 = km7.fit(gabii_textools_counts.drop(['geometry', 'OBJECTID','DESCRIPTIO','Shape_Length','SU'], axis=1).values)
f3, ax = plt.subplots(1, figsize=(15,15))

gabii_textools_counts.assign(cl3=km7cls3.labels_)\
   .plot(column='cl3', categorical=True, legend=True, \
         linewidth=0.1, cmap='Accent', edgecolor='white', ax=ax)

ax.set_axis_off()

plt.show()

That also changes things. Without going into too much detail, finding the ideal number of clusters is something of a dark art. Try playing around with the number of clusters in the notebook, or with the size cut-off for inclusion, and see how the groupings shift.
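One common sanity check is the 'elbow' heuristic: fit k-means for a range of k values and watch the inertia (the total within-cluster sum of squared distances), which always falls as k grows; look for the point where it stops falling steeply. A sketch on invented data:

```python
import numpy as np
from sklearn import cluster

rng = np.random.RandomState(0)
# Invented data: three loose blobs of points centred on (0,0), (5,5) and (10,10)
blobs = np.vstack([rng.normal(loc, 0.5, size=(20, 2)) for loc in (0, 5, 10)])

# inertia_ always drops as k grows; the 'elbow' is where adding more
# clusters stops helping much (here, around k=3, matching the three blobs)
inertias = [cluster.KMeans(n_clusters=k, n_init=10, random_state=0).fit(blobs).inertia_
            for k in range(1, 7)]
print([round(i, 1) for i in inertias])
```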

In [33]:
# Use 7 clusters and plot them
km7 = cluster.KMeans(n_clusters=7)
km7cls4 = km7.fit(gabii_textools_counts.drop(['geometry', 'OBJECTID','DESCRIPTIO','Shape_Length','SU','Shape_Area'], axis=1).values)
f4, ax = plt.subplots(1, figsize=(15,15))

gabii_textools_counts.assign(cl4=km7cls4.labels_)\
   .plot(column='cl4', categorical=True, legend=True, \
         linewidth=0.1, cmap='Accent', edgecolor='white', ax=ax)

ax.set_axis_off()

plt.show()
In [34]:
# Let's set up to investigate some of the individual clusters
gabiitextools_class3=gabiitextools_class.assign(cl3=km7cls3.labels_)
gabiitextools_class4=gabiitextools_class3.assign(cl4=km7cls4.labels_)
gabiitextools_class4.head()
Out[34]:
DESCRIPTIO OBJECTID SU Shape_Area Shape_Length geometry Loom Weight Spindle Whorl Spool cl cl2 cl3 cl4
0 DEP 28 2258 77.018159 45.450094 POLYGON Z ((1416013.191 5144212.5308 61.347999... 0.0 0.0 1.0 2 0 6 1
1 DEP 44 516 43.687257 39.119255 POLYGON Z ((1415914.1831 5144202.0853 64.69299... 1.0 0.0 1.0 3 0 2 6
2 DEP 53 587 63.212851 42.805154 POLYGON Z ((1415907.6408 5144196.9608 64.63250... 1.0 0.0 0.0 2 4 6 6
3 DEP 77 1306 21.609902 35.183961 POLYGON Z ((1415935.544 5144182.803 63.4133000... 0.0 0.0 1.0 3 0 5 1
4 DEP 115 1327 122.494594 52.841679 POLYGON Z ((1415936.349 5144162.362 63.1217999... 1.0 0.0 0.0 4 4 1 6
In [35]:
# set up variables to store several classes, with and without context size taken into account.
gabiitextools_class0=gabiitextools_class4.loc[gabiitextools_class4['cl']==0]
gabiitextools_class0noarea=gabiitextools_class4.loc[gabiitextools_class4['cl2']==0]
gabiitextools_k7_class0=gabiitextools_class4.loc[gabiitextools_class4['cl3']==0]
gabiitextools_k7_class0noarea=gabiitextools_class4.loc[gabiitextools_class4['cl4']==0]
fig, axes = plt.subplots(ncols=2,nrows=2,figsize=(15, 10))
gabiitextools_class0.plot(ax=axes[0,0]).axis('equal')
axes[0,0].set_title('cl - 5 clusters - area')
gabiitextools_class0noarea.plot(ax=axes[0,1]).axis('equal')
axes[0,1].set_title('cl2 - 5 clusters - no area')
gabiitextools_k7_class0.plot(ax=axes[1,0]).axis('equal')
axes[1,0].set_title('cl3 - 7 clusters - area')
gabiitextools_k7_class0noarea.plot(ax=axes[1,1]).axis('equal')
axes[1,1].set_title('cl4 - 7 clusters - no area')
Out[35]:
Text(0.5,1,'cl4 - 7 clusters - no area')
In [36]:
# Do the same for cluster 3.
gabiitextools_class3=gabiitextools_class4.loc[gabiitextools_class4['cl']==3]
gabiitextools_class3noarea=gabiitextools_class4.loc[gabiitextools_class4['cl2']==3]
gabiitextools_k7_class3=gabiitextools_class4.loc[gabiitextools_class4['cl3']==3]
gabiitextools_k7_class3noarea=gabiitextools_class4.loc[gabiitextools_class4['cl4']==3]
fig, axes = plt.subplots(ncols=2,nrows=2,figsize=(15, 10))
gabiitextools_class3.plot(ax=axes[0,0]).axis('equal')
axes[0,0].set_title('cl - 5 clusters - area')
gabiitextools_class3noarea.plot(ax=axes[0,1]).axis('equal')
axes[0,1].set_title('cl2 - 5 clusters - no area')
gabiitextools_k7_class3.plot(ax=axes[1,0]).axis('equal')
axes[1,0].set_title('cl3 - 7 clusters - area')
gabiitextools_k7_class3noarea.plot(ax=axes[1,1]).axis('equal')
axes[1,1].set_title('cl4 - 7 clusters - no area')
Out[36]:
Text(0.5,1,'cl4 - 7 clusters - no area')
In [37]:
# Maybe some of our contexts (especially the small ones) are similar to, or influenced by, their immediate neighbours.
# We can weight the values in one context to account for its neighbours.
w5 = pysal.weights.KNN.from_dataframe(gabiitextools_class4, k=5)
w5.transform = 'r'
# neighbours & weights of the 5th observation (remember, Python counts from 0)
w5[4]
Out[37]:
{70: 0.2, 29: 0.2, 89: 0.2, 49: 0.2, 53: 0.2}
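Under the hood the idea is simple: for each observation, find its k nearest observations and, after row-standardisation, give each a weight of 1/k (hence the 0.2s above for k=5). A rough numpy-only sketch on invented points; pysal's real implementation handles geometries and much more:

```python
import numpy as np

rng = np.random.RandomState(0)
pts = rng.rand(10, 2)  # invented context centroids, not Gabii data

# For observation 4: distances to every point, then take the 5 closest others
d = np.linalg.norm(pts - pts[4], axis=1)
neighbours = np.argsort(d)[1:6]   # position 0 of the sort is the point itself (distance 0)
weights = np.full(5, 1 / 5)       # row-standardised: equal weights that sum to 1

print(dict(zip(neighbours.tolist(), weights.tolist())))
```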
In [38]:
# print out a context and its immediate neighbours as a table
self_and_neighbors = [4]
self_and_neighbors.extend(w5.neighbors[4])
print(self_and_neighbors)
gabiitextools_class4.loc[self_and_neighbors]
[4, 70, 29, 89, 49, 53]
Out[38]:
DESCRIPTIO OBJECTID SU Shape_Area Shape_Length geometry Loom Weight Spindle Whorl Spool cl cl2 cl3 cl4
4 DEP 115 1327 122.494594 52.841679 POLYGON Z ((1415936.349 5144162.362 63.1217999... 1.0 0.0 0.0 4 4 1 6
70 DEP 2007 1412 14.598572 18.034358 POLYGON Z ((1415950.9555 5144163.0872 62.55220... 1.0 0.0 0.0 0 4 5 6
29 DEP 1204 1182 6.973412 20.548798 POLYGON Z ((1415939.8974 5144165.7659 63.16879... 1.0 0.0 0.0 0 4 0 6
89 FL 2647 1173 62.610385 86.251233 POLYGON Z ((1415934.0065 5144169.8251 63.22079... 2.0 0.0 0.0 2 4 6 6
49 DEP 1766 1279 14.907877 31.751468 POLYGON Z ((1415936.5361 5144180.7074 63.37669... 4.0 1.0 0.0 0 4 5 0
53 N-S TOMB 1815 1380 0.384362 2.404675 POLYGON Z ((1415956.7658 5144170.944 62.338099... 1.0 0.0 0.0 0 4 0 6
In [39]:
# Do the same thing with another set
# You can substitute other values for the '11' here and see what happens.
w5[11]
self_and_neighbors = [11]
self_and_neighbors.extend(w5.neighbors[11])
print(self_and_neighbors)
gabiitextools_class4.loc[self_and_neighbors]
[11, 79, 128, 126, 5, 107]
Out[39]:
DESCRIPTIO OBJECTID SU Shape_Area Shape_Length geometry Loom Weight Spindle Whorl Spool cl cl2 cl3 cl4
11 DEP 267 3012 35.964558 24.642614 POLYGON Z ((1415981.4906 5144220.1063 61.93349... 1.0 0.0 0.0 3 4 2 6
79 DEP 2202 3089 41.966089 30.456902 POLYGON Z ((1415984.002 5144225.9554 61.885899... 1.0 0.0 0.0 3 4 2 6
128 FILL 3375 3311 5.350597 8.938988 POLYGON Z ((1415984.1495 5144221.7165 61.46430... 0.0 0.0 2.0 0 0 0 1
126 DEP 3361 3306 8.341407 12.880923 POLYGON Z ((1415984.0478 5144221.7844 61.49159... 0.0 1.0 3.0 0 3 0 4
5 FILL 134 3021 10.873247 13.556097 POLYGON Z ((1415976.6777 5144220.3635 61.86689... 0.0 0.0 3.0 0 0 0 4
107 DEP 2808 3167 9.853576 13.615772 POLYGON Z ((1415982.0646 5144219.3923 61.6106,... 0.0 0.0 1.0 0 0 0 1
In [40]:
# Sanity check by plotting a set of self and neighbours as a map.
# Do the counts of different textile tools have similar patterns?
# Are they inconsistent? How might we interpret this local pattern?
n11 = gabiitextools_class4.loc[self_and_neighbors]
n11.plot(column='Loom Weight', cmap='autumn', legend=True)
n11.plot(column='Spool', cmap='autumn')
n11.plot(column='Spindle Whorl', cmap='autumn')
Out[40]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f719f2ec748>
In [41]:
# We can visualise how the counts of different types of finds appear in other ways.
# Do loom weights appear more often when spools do? What does this mean?
sns.pairplot(n11.drop(['SU','geometry','OBJECTID','DESCRIPTIO','Shape_Length','Shape_Area','cl','cl2','cl3','cl4'], axis=1))
Out[41]:
<seaborn.axisgrid.PairGrid at 0x7f719f159358>
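The pairplot question ('do loom weights appear more often when spools do?') is really a correlation question, which you can also answer numerically. A sketch with invented per-SU counts (not the Gabii figures):

```python
import pandas as pd

# Invented per-SU counts: loom weights and spools rise together here;
# spindle whorls do their own thing
counts = pd.DataFrame({
    'Loom Weight':   [0, 1, 2, 4, 6],
    'Spool':         [1, 2, 3, 5, 7],
    'Spindle Whorl': [3, 0, 2, 0, 1],
})

# Pearson correlation matrix: values near +1 mean two counts rise and fall together
corr = counts.corr()
print(corr.round(2))
```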
In [46]:
# Are some clusters more correlated than others?
sns.pairplot(gabiitextools_class0.drop(['OBJECTID','DESCRIPTIO','Shape_Length','Shape_Area','SU','geometry','cl','cl2','cl3','cl4'], axis=1), kind="reg")
plt.show()
In [47]:
# Do 7 clusters as opposed to 5 result in more correlation?
sns.pairplot(gabiitextools_k7_class0.drop(['OBJECTID','DESCRIPTIO','Shape_Length','Shape_Area','SU','geometry','cl','cl2','cl3','cl4'], axis=1), kind="reg")
plt.show()

That concludes this tutorial.

Hopefully you have:

  • started thinking (and perhaps are a bit confused) about how spatial patterns of different types of finds are created, and how we can interpret them when studying data from an excavation.
  • learned to combine spatial data and descriptive tables.
  • learned to use some basic clustering tools, and reinforced your knowledge about how to make charts and maps.

We'll be talking more about spatial analysis methods in archaeology throughout the course.