This Jupyter notebook demonstrates an example age-depth model using Chron.jl. For more information, see github.com/brenhinkeller/Chron.jl and doi.org/10.17605/osf.io/TQX3F
If running this notebook as an online Binder notebook and the webpage times out, click the badge at left to relaunch (refreshing will not work). Note that any changes will be lost!
Hint: shift
-enter
to run a single cell, or from the Cell
menu select Run All
to run the whole file. Any code from this notebook can also be copied and pasted into the Julia REPL or a .jl
script.
# Load (and install if necessary) the Chron.jl package
using Chron
using Plots, DelimitedFiles
# Input the number of samples we wish to model (must match below)
nSamples = 4
# Make an instance of a ChronSection object for nSamples
smpl = ChronAgeData(nSamples)
smpl.Name = ("Sample 1", "Sample 2", "Sample 3", "Sample 4") # Et cetera
smpl.Age_14C = [ 6991, 7088, 7230, 7540,] # Measured ages
smpl.Age_14C_sigma = [ 30, 70, 50, 50,] # Measured 1-σ uncertainties
smpl.Height[:] = [ -355, -380,-397.0,-411.5,] # Depths below surface should be negative
smpl.Height_sigma[:] = fill(0.01, nSamples) # Usually assume little or no sample height uncertainty
smpl.Age_Sidedness[:] = zeros(nSamples) # Sidedness (zeros by default: geochron constraints are two-sided). Use -1 for a maximum age and +1 for a minimum age, 0 for two-sided
smpl.Path = "MyData/" # Where do you want output files to be stored
smpl.Age_Unit = "Years BP"; # Unit of measurement for ages
smpl.Height_Unit = "m"; # Unit of measurement for Height and Height_sigma
Note that smpl.Height must increase with increasing stratigraphic height -- i.e., stratigraphically younger samples must be more positive. For this reason, it is convenient to represent depths below surface as negative numbers.
Using the IntCal20 radiocarbon calibration curve
smpl.Params = fill(NaN, length(intcal20.Age_Calendar), nSamples)
hdl = plot(layout=nSamples)
for i = 1:nSamples
# The likelihood that a measured 14C age could result from a sample of
# a given calendar age is proportional to the intergral of the product
# of the two respective distributions
likelihood = normproduct.(smpl.Age_14C[i], smpl.Age_14C_sigma[i], intcal20.Age_14C, intcal20.Age_sigma)
likelihood ./= sum(likelihood) * intcal20.dt # Normalize
# Estimate mean and standard deviation by drawing samples from distribution
samples = draw_from_distribution(likelihood, 10^6) .* maximum(intcal20.Age_Calendar)
smpl.Age[i] = nanmean(samples)
smpl.Age_sigma[i] = nanstd(samples)
smpl.Age_025CI[i] = nanpctile(samples,2.5)
smpl.Age_975CI[i] = nanpctile(samples,97.5)
# Populate smpl.Params with log likelihood for each sample
smpl.Params[:,i] = normproduct_ll.(smpl.Age_14C[i], smpl.Age_14C_sigma[i], intcal20.Age_14C, intcal20.Age_sigma)
# Plot likelihood vector for each sample
t = (intcal20.Age_Calendar .> smpl.Age[i] - 5*smpl.Age_sigma[i]) .& (intcal20.Age_Calendar .< smpl.Age[i] + 5*smpl.Age_sigma[i])
plot!(hdl[i], intcal20.Age_Calendar[t], likelihood[t], label="", xlabel="Calendar Age", ylabel="Likelihood", title="$(smpl.Name[i])", framestyle=:box)
end
display(hdl)
note: to spare Binder's servers, this demo uses
config.nsteps = 3000
config.burnin = 2000*npoints_approx
However, you probably want higher numbers for a publication-quality result, for instance
config.nsteps = 30000 # Number of steps to run in distribution MCMC
config.burnin = 10000*npoints_approx # Number to discard
and examine the log likelihood plot to make sure you've converged.
To run the stratigraphic MCMC model, we call the StratMetropolis14C
function. If you're not using radiocarbon ages and want to run stratigraphic model with Gaussian mean age and standard deviation for some number of stratigraphic horizons, then you can set smpl.Age
and smpl.Age_sigma
directly, but then you'll need to call StratMetropolis
instead of StratMetropolis14C
# Configure the stratigraphic Monte Carlo model
config = StratAgeModelConfiguration()
# If in doubt, you can probably leave these parameters as-is
config.resolution = 0.2 # Same units as sample height. Smaller is slower!
config.bounding = 0.5 # how far away do we place runaway bounds, as a fraction of total section height
(bottom, top) = extrema(smpl.Height)
npoints_approx = round(Int,length(bottom:config.resolution:top) * (1 + 2*config.bounding))
config.nsteps = 3000 # Number of steps to run in distribution MCMC
config.burnin = 2000*npoints_approx # Number to discard
config.sieve = round(Int,npoints_approx) # Record one out of every nsieve steps
# Run the stratigraphic MCMC model
(mdl, agedist, lldist) = StratMetropolis14C(smpl, config)
# Write the results to file
run(`mkdir -p $(smpl.Path)`) # Make sure that the path exists
writedlm(smpl.Path*"agedist.csv", agedist, ',') # Stationary distribution of the age-depth model
writedlm(smpl.Path*"height.csv", mdl.Height, ',') # Stratigraphic heights corresponding to each row of agedist
writedlm(smpl.Path*"lldist.csv", lldist, ',') # Log likelihood distribution (to check for stationarity)
# Plot the log likelihood to make sure we're converged (n.b burnin isn't recorded)
hdl = plot(lldist,xlabel="Step number",ylabel="Log likelihood",label="",line=(0.85,:darkblue),framestyle=:box)
savefig(hdl,smpl.Path*"lldist.pdf")
display(hdl)
[ Info: Generating stratigraphic age-depth model... [ Info: Burn-in: 1132000 steps Burn-in... 100%|█████████████████████████████████████████| Time: 0:00:00 [ Info: Collecting sieved stationary distribution: 1698000 steps Collecting... 100%|██████████████████████████████████████| Time: 0:00:00
# Plot results (mean and 95% confidence interval for both model and data)
hdl = plot([mdl.Age_025CI; reverse(mdl.Age_975CI)],[mdl.Height; reverse(mdl.Height)], fill=(minimum(mdl.Height),0.5,:blue), label="model")
plot!(hdl, mdl.Age, mdl.Height, linecolor=:blue, label="")
plot!(hdl, smpl.Age, smpl.Height, xerror=(smpl.Age-smpl.Age_025CI,smpl.Age_975CI-smpl.Age),label="data",seriestype=:scatter,color=:black)
plot!(hdl, xlabel="Age ($(smpl.Age_Unit))", ylabel="Height ($(smpl.Height_Unit))", framestyle=:box)
savefig(hdl,smpl.Path*"AgeDepthModel.svg")
savefig(hdl,smpl.Path*"AgeDepthModel.pdf")
display(hdl)
# Stratigraphic height at which to interpolate
height = -404
age_at_height = linterp1s(mdl.Height,mdl.Age,height)
age_at_height_min = linterp1s(mdl.Height,mdl.Age_025CI,height)
age_at_height_max = linterp1s(mdl.Height,mdl.Age_975CI,height)
print("Interpolated age at height=$height: $age_at_height +$(age_at_height_max-age_at_height)/-$(age_at_height-age_at_height_min) $(smpl.Age_Unit)")
# Optional: interpolate full age distribution
interpolated_distribution = Array{Float64}(undef,size(agedist,2))
for i=1:size(agedist,2)
interpolated_distribution[i] = linterp1s(mdl.Height,agedist[:,i],height)
end
hdl = histogram(interpolated_distribution, nbins=50, fill=(0.75,:darkblue), label="")
plot!(hdl, xlabel="Age ($(smpl.Age_Unit)) at height=$height", ylabel="Likelihood (unnormalized)", framestyle=:box)
Interpolated age at height=-404: 8192.122299020031 +155.43082431068433/-159.3675004509496 Years BP
There are other things we can plot as well, such as deposition rate:
# Set bin width and spacing
binwidth = round(nanrange(mdl.Age)/10,sigdigits=1) # Can also set manually, commented out below
# binwidth = 100 # Same units as smpl.Age
binoverlap = 10
ages = collect(minimum(mdl.Age):binwidth/binoverlap:maximum(mdl.Age))
bincenters = ages[1+Int(binoverlap/2):end-Int(binoverlap/2)]
spacing = binoverlap
# Calculate rates for the stratigraphy of each markov chain step
dhdt_dist = Array{Float64}(undef, length(ages)-binoverlap, config.nsteps)
@time for i=1:config.nsteps
heights = linterp1(reverse(agedist[:,i]), reverse(mdl.Height), ages)
dhdt_dist[:,i] .= abs.(heights[1:end-spacing] - heights[spacing+1:end]) ./ binwidth
end
# Find mean and 1-sigma (68%) CI
dhdt = nanmean(dhdt_dist,dim=2)
dhdt_50p = nanmedian(dhdt_dist,dim=2)
dhdt_16p = nanpctile(dhdt_dist,15.865,dim=2) # Minus 1-sigma (15.865th percentile)
dhdt_84p = nanpctile(dhdt_dist,84.135,dim=2) # Plus 1-sigma (84.135th percentile)
# Other confidence intervals (10:10:50)
dhdt_20p = nanpctile(dhdt_dist,20,dim=2)
dhdt_80p = nanpctile(dhdt_dist,80,dim=2)
dhdt_25p = nanpctile(dhdt_dist,25,dim=2)
dhdt_75p = nanpctile(dhdt_dist,75,dim=2)
dhdt_30p = nanpctile(dhdt_dist,30,dim=2)
dhdt_70p = nanpctile(dhdt_dist,70,dim=2)
dhdt_35p = nanpctile(dhdt_dist,35,dim=2)
dhdt_65p = nanpctile(dhdt_dist,65,dim=2)
dhdt_40p = nanpctile(dhdt_dist,40,dim=2)
dhdt_60p = nanpctile(dhdt_dist,60,dim=2)
dhdt_45p = nanpctile(dhdt_dist,45,dim=2)
dhdt_55p = nanpctile(dhdt_dist,55,dim=2)
# Plot results
hdl = plot(bincenters,dhdt, label="Mean", color=:black, linewidth=2)
plot!(hdl,[bincenters; reverse(bincenters)],[dhdt_16p; reverse(dhdt_84p)], fill=(0,0.2,:darkblue), linealpha=0, label="68% CI")
plot!(hdl,[bincenters; reverse(bincenters)],[dhdt_20p; reverse(dhdt_80p)], fill=(0,0.2,:darkblue), linealpha=0, label="")
plot!(hdl,[bincenters; reverse(bincenters)],[dhdt_25p; reverse(dhdt_75p)], fill=(0,0.2,:darkblue), linealpha=0, label="")
plot!(hdl,[bincenters; reverse(bincenters)],[dhdt_30p; reverse(dhdt_70p)], fill=(0,0.2,:darkblue), linealpha=0, label="")
plot!(hdl,[bincenters; reverse(bincenters)],[dhdt_35p; reverse(dhdt_65p)], fill=(0,0.2,:darkblue), linealpha=0, label="")
plot!(hdl,[bincenters; reverse(bincenters)],[dhdt_40p; reverse(dhdt_60p)], fill=(0,0.2,:darkblue), linealpha=0, label="")
plot!(hdl,[bincenters; reverse(bincenters)],[dhdt_45p; reverse(dhdt_55p)], fill=(0,0.2,:darkblue), linealpha=0, label="")
plot!(hdl,bincenters,dhdt_50p, label="Median", color=:grey, linewidth=1)
plot!(hdl, xlabel="Age ($(smpl.Age_Unit))", ylabel="Depositional Rate ($(smpl.Height_Unit) / $(smpl.Age_Unit) over $binwidth $(smpl.Age_Unit))", fg_color_legend=:white, framestyle=:box)
savefig(hdl,smpl.Path*"DepositionRateModelCI.pdf")
display(hdl)
0.380095 seconds (966.03 k allocations: 93.440 MiB, 5.03% gc time, 93.83% compilation time)
We can also deal with discrete hiatuses in the stratigraphic section if we know where they are and about how long they lasted. We need some different models and methods though. In particular, in addition to the ChronAgeData
struct, we also need a HiatusData
struct now, and will have a new hiatusdist
output.
config.nsteps = 3000 # Number of steps to run in distribution MCMC
config.burnin = 2000*npoints_approx # Number to discard
# Data about hiatuses
nHiatuses = 2 # The number of hiatuses you have data for
hiatus = HiatusData(nHiatuses) # Struct to hold data
hiatus.Height = [-371.5, -405.0 ]
hiatus.Height_sigma = [ 0.0, 0.0 ]
hiatus.Duration = [ 100.0, 123.0]
hiatus.Duration_sigma = [ 30.5, 20.0]
# Run the model. Note the new `hiatus` argument and `hiatusdist` result
(mdl, agedist, hiatusdist, lldist) = StratMetropolis14C(smpl, hiatus, config); sleep(0.5)
# Plot results (mean and 95% confidence interval for both model and data)
hdl = plot([mdl.Age_025CI; reverse(mdl.Age_975CI)],[mdl.Height; reverse(mdl.Height)], fill=(minimum(mdl.Height),0.5,:blue), label="model")
plot!(hdl, mdl.Age, mdl.Height, linecolor=:blue, label="", fg_color_legend=:white)
plot!(hdl, smpl.Age, smpl.Height, xerror=(smpl.Age-smpl.Age_025CI,smpl.Age_975CI-smpl.Age),label="data",seriestype=:scatter,color=:black)
plot!(hdl, xlabel="Age ($(smpl.Age_Unit))", ylabel="Height ($(smpl.Height_Unit))", framestyle=:box)
# savefig(hdl,smpl.Path*"AgeDepthModelHiatus.pdf");
display(hdl)
[ Info: Generating stratigraphic age-depth model... [ Info: Burn-in: 1132000 steps Burn-in... 100%|█████████████████████████████████████████| Time: 0:00:00 [ Info: Collecting sieved stationary distribution: 1698000 steps Collecting... 100%|██████████████████████████████████████| Time: 0:00:00
As shown below, we have access to the system command line, so we can use the unix command ls
to see all the files we have written. We'll try this first using ;
to access Julia's command-line shell mode:
; ls MyData
AgeDepthModel.pdf AgeDepthModel.svg BootstrappedDistribution.pdf DepositionRateModelCI.pdf DepositionRateModelCI.svg KJ04-70.csv KJ04-70_distribution.pdf KJ04-70_distribution.svg KJ04-70_rankorder.pdf KJ04-70_rankorder.svg KJ04-72.csv KJ04-72_distribution.pdf KJ04-72_distribution.svg KJ04-72_rankorder.pdf KJ04-72_rankorder.svg KJ04-75.csv KJ04-75_distribution.pdf KJ04-75_distribution.svg KJ04-75_rankorder.pdf KJ04-75_rankorder.svg KJ08-157.csv KJ08-157_distribution.pdf KJ08-157_distribution.svg KJ08-157_rankorder.pdf KJ08-157_rankorder.svg KJ09-66.csv KJ09-66_distribution.pdf KJ09-66_distribution.svg KJ09-66_rankorder.pdf KJ09-66_rankorder.svg KR18-01.csv KR18-01_Concordia.pdf KR18-01_Concordia.svg KR18-01_Pbloss.pdf KR18-01_Pbloss.svg KR18-01_distribution.pdf KR18-01_distribution.svg KR18-04.csv KR18-04_Concordia.pdf KR18-04_Concordia.svg KR18-04_Pbloss.pdf KR18-04_Pbloss.svg KR18-04_distribution.pdf KR18-04_distribution.svg KR18-05.csv KR18-05_Concordia.pdf KR18-05_Concordia.svg KR18-05_Pbloss.pdf KR18-05_Pbloss.svg KR18-05_distribution.pdf KR18-05_distribution.svg agedist.csv distresults.csv height.csv lldist.csv lldist.pdf results.csv
Alternatively, we could do the same thing using the run
function:
run(`ls MyData`)
AgeDepthModel.pdf AgeDepthModel.svg BootstrappedDistribution.pdf DepositionRateModelCI.pdf DepositionRateModelCI.svg KJ04-70.csv KJ04-70_distribution.pdf KJ04-70_distribution.svg KJ04-70_rankorder.pdf KJ04-70_rankorder.svg KJ04-72.csv KJ04-72_distribution.pdf KJ04-72_distribution.svg KJ04-72_rankorder.pdf KJ04-72_rankorder.svg KJ04-75.csv KJ04-75_distribution.pdf KJ04-75_distribution.svg KJ04-75_rankorder.pdf KJ04-75_rankorder.svg KJ08-157.csv KJ08-157_distribution.pdf KJ08-157_distribution.svg KJ08-157_rankorder.pdf KJ08-157_rankorder.svg KJ09-66.csv KJ09-66_distribution.pdf KJ09-66_distribution.svg KJ09-66_rankorder.pdf KJ09-66_rankorder.svg KR18-01.csv KR18-01_Concordia.pdf KR18-01_Concordia.svg KR18-01_Pbloss.pdf KR18-01_Pbloss.svg KR18-01_distribution.pdf KR18-01_distribution.svg KR18-04.csv KR18-04_Concordia.pdf KR18-04_Concordia.svg KR18-04_Pbloss.pdf KR18-04_Pbloss.svg KR18-04_distribution.pdf KR18-04_distribution.svg KR18-05.csv KR18-05_Concordia.pdf KR18-05_Concordia.svg KR18-05_Pbloss.pdf KR18-05_Pbloss.svg KR18-05_distribution.pdf KR18-05_distribution.svg agedist.csv distresults.csv height.csv lldist.csv lldist.pdf results.csv
Process(`ls MyData`, ProcessExited(0))
If you're running this as an online notebook, getting these files out of here may be harder though. For SVG files, one option is to view them in markdown cells, which you should then be able to right click and download as real vector graphics. e.g. pasting something like
<img src="AgeDepthModel.svg" align="left" width="600"/>
in a markdown cell such as this one
Meanwhile, for the csv files we could try something like ; cat agedist.csv
, but agedist is probably too big to print. Let's try using ffsend instead, which should give you a download link. In fact, while we're at it, we might as well archive and zip the entire directory!
# Make gzipped tar archive of the the whole MyData directory
run(`tar -zcf archive.tar.gz ./MyData`);
# Download prebuilt ffsend linux binary
isfile("ffsend") || download("https://github.com/timvisee/ffsend/releases/download/v0.2.65/ffsend-v0.2.65-linux-x64-static", "ffsend")
# Make ffsend executable
run(`chmod +x ffsend`);
; ./ffsend upload archive.tar.gz
/bin/bash: ./ffsend: cannot execute binary file
You could alternatively use the ffsend command in this way to transfer individual files, for instance ; ./ffsend upload MyData/agedist.csv