This notebook assumes that you've already created the file roman.db in part one of the intro to sql notebook. It shows how to load a sqlite database into R.

In [1]:
# first we get the packages we need:
library(DBI)
library(RSQLite)
In [2]:
# now we open the connection:

con = dbConnect(SQLite(), dbname="roman.db")

# we can see what's inside, eg, what tables are in the database?
alltables = dbListTables(con)
alltables
  1. '_source_info_'
  2. 'amphi'
  3. 'aqua'
  4. 'sqlite_sequence'
In [ ]:
# write the query to get the information you want

myQuery <- dbSendQuery(con, "SELECT * FROM amphi WHERE chronogroup = 'flavian'")

# pass that information to an R object. The n = -1 bit means grab everything until there's nothing left to grab. Otherwise, you can specify how many rows etc.

my_data <- dbFetch(myQuery, n = -1)

# You will get some warning messages as R makes sure that the data in each column is of the same type.
In [ ]:
# Now to view your dataframe, we just ask:

my_data
In [8]:
# now that we're done, clear cache 
# so that we don't eat up all the memory
dbClearResult(myQuery)

# now carry on and begin manipulating my_data
# for more information see
# http://tiffanytimbers.com/querying-sqlite-databases-from-r/
# also perhaps this https://www.r-bloggers.com/using-sqlite-in-r/
In [15]:
# let's make a new dataframe with just the elevation and capacity columns
# you could have done this with an SQL query from our original database.
# here we're just going to extract from the dataframe we created above

amphi_stats <- subset(my_data, select=c("id","elevation","capacity"))

#"create new datafram called amph_stats and pass into it a subset of my_data, returning the array id, elevation, capacity"
In [16]:
amphi_stats
idelevationcapacity
arlesAmphitheater 21 20000
romeFlavianAmphitheater 22 50000
newsteadAmphitheater 83 1500
pozzuoliFlavianAmphitheater 41 35700
segobrigaAmphitheater 817 7500
parisAmphitheater 46 15000
nimesAmphitheater 50 24000
chesterAmphitheater 23 0
arezzoAmphitheater 263 8000
londonAmphitheater 29 6500
urbisagliaAmphitheater 219 5150
vindonissaAmphitheater 361 15142
frejusAmphitheater 12 12000
tebessaAmphitheater 877 9400
aquileiaAmphitheater 2 0
terminiImereseAmphitheater 76 0
bolsenaAmphitheater 417 0
narbonneAmphitheater 14 0
capuaImperialAmphitheater 34 37000
dorchesterAmphitheater 78 0
autunAmphitheater 317 0
bobadelaAmphitheater 383 0
chichesterAmphitheater 14 0

At this point, now that you know how to get your data out of the database, you could do a number of other kinds of data manipulations; it really depends on the kind of investigation you wish to do. Consult Marwick et al How to Do Archaeological Science Using R.

We'll end this notebook with a simple visual plot of the data.

In [ ]:
# a simple x,y plot
# we can call for the columns in the dataframe by refering to the name of the frame and using the $ to specify the column therein.
plot(amphi_stats$elevation, amphi_stats$capacity)
In [25]:
# but that's ugly. let's make it pretty 
# and write the result to file.

# Get the input values. 
input <- amphi_stats[,c('elevation','capacity')]

# Give the chart file a name.
png(file = "R-scatterplot.png")

# Plot the chart for cars with weight between 2.5 to 5 and mileage between 15 and 30.
plot(x = input$elevation,y = input$capacity,
   xlab = "Elevation",
   ylab = "Capacity", 
   xlim = c(0,1000),
   ylim = c(500,60000),
   main = "Elevation vs Capacity in Flavian Amphitheatres"
)
	 
# Save the file.
dev.off()
png: 2

The scatterplot is now available if you right-click on 'jupyter' at the top of the window and open in a new browser window; you'll see it in the list of files.