Chron.jl standalone age-depth model using radiocarbon data

This Jupyter notebook demonstrates an example age-depth model using Chron.jl. For more information, see github.com/brenhinkeller/Chron.jl and doi.org/10.17605/osf.io/TQX3F

Launch Binder notebook

If running this notebook as an online Binder notebook and the webpage times out, click the badge at left to relaunch (refreshing will not work). Note that any changes will be lost!

Hint: shift-enter to run a single cell, or from the Cell menu select Run All to run the whole file. Any code from this notebook can also be copied and pasted into the Julia REPL or a .jl script.


Load required Julia packages

In [1]:
# Load (and install if necessary) the Chron.jl package
try
    using Chron
catch
    using Pkg
    Pkg.add("Chron")
    using Chron
end

using Statistics, StatsBase, DelimitedFiles, SpecialFunctions
using Plots; gr(); default(fmt = :png)

Enter sample information

Paste your data in here!

In [2]:
# Input the number of samples we wish to model (must match below)
nSamples = 4

# Make an instance of a ChronSection object for nSamples
smpl = NewChronAgeData(nSamples)
smpl.Name          = ("Sample 1", "Sample 2", "Sample 3", "Sample 4") # Et cetera
smpl.Age_14C       = [ 6991,  7088,  7230,  7540,] # Measured ages
smpl.Age_14C_sigma = [   30,    70,    50,    50,] # Measured 1-σ uncertainties
smpl.Height[:]     = [ -355,  -380,-397.0,-411.5,] # Depths below surface should be negative
smpl.Height_sigma[:]  = fill(0.01, nSamples) # Usually assume little or no sample height uncertainty
smpl.Age_Sidedness[:] = zeros(nSamples) # Sidedness (zeros by default: geochron constraints are two-sided). Use -1 for a maximum age and +1 for a minimum age, 0 for two-sided
smpl.Path = "MyData/" # Where do you want output files to be stored

smpl.Age_Unit = "Years BP"; # Unit of measurement for ages
smpl.Height_Unit = "m"; # Unit of measurement for Height and Height_sigma

Note that smpl.Height must increase with increasing stratigraphic height -- i.e., stratigraphically younger samples must be more positive. For this reason, it is convenient to represent depths below surface as negative numbers.


Calculate calendar age PDFs for each sample

Using the IntCal 2013 radiocarbon calibration curve

In [3]:
smpl.Params = fill(NaN, length(intcal13["Age_Calendar"]), nSamples)
hdl = plot(layout=nSamples)
for i = 1:nSamples
    # The likelihood that a measured 14C age could result from a sample of
    # a given calendar age is proportional to the intergral of the product
    # of the two respective distributions
    likelihood = normproduct.(smpl.Age_14C[i], smpl.Age_14C_sigma[i], intcal13["Age_14C"], intcal13["Age_sigma"])
    likelihood ./= sum(likelihood) * intcal13["dt"] # Normalize

    # Estimate mean and standard deviation by drawing samples from distribution
    samples = draw_from_distribution(likelihood, 10^6) .* maximum(intcal13["Age_Calendar"])
    smpl.Age[i] = mean(samples)
    smpl.Age_sigma[i] = std(samples)
    smpl.Age_025CI[i] = percentile(samples,2.5)
    smpl.Age_975CI[i] = percentile(samples,97.5)

    # Populate smpl.Params with log likelihood for each sample
    smpl.Params[:,i] = normproduct_ll.(smpl.Age_14C[i], smpl.Age_14C_sigma[i], intcal13["Age_14C"], intcal13["Age_sigma"])

    # Plot likelihood vector for each sample
    t = (intcal13["Age_Calendar"] .> smpl.Age[i] - 5*smpl.Age_sigma[i]) .& (intcal13["Age_Calendar"] .< smpl.Age[i] + 5*smpl.Age_sigma[i])
    plot!(hdl[i], intcal13["Age_Calendar"][t], likelihood[t], label="", xlabel="Calendar Age", ylabel="Likelihood", title="$(smpl.Name[i])")
end
display(hdl)
7600 7700 7800 7900 8000 0.0000 0.0025 0.0050 0.0075 0.0100 Sample 1 Calendar Age Likelihood 7600 7700 7800 7900 8000 8100 8200 0.000 0.001 0.002 0.003 0.004 0.005 0.006 Sample 2 Calendar Age Likelihood 7800 7900 8000 8100 8200 8300 0.000 0.002 0.004 0.006 Sample 3 Calendar Age Likelihood 8100 8200 8300 8400 8500 8600 0.0000 0.0025 0.0050 0.0075 0.0100 Sample 4 Calendar Age Likelihood

Configure and run stratigraphic model

note: to spare Binder's servers, this demo uses

config.nsteps = 3000 
config.burnin = 2000*npoints_approx

However, you probably want higher numbers for a publication-quality result, for instance

config.nsteps = 30000 # Number of steps to run in distribution MCMC
config.burnin = 10000*npoints_approx # Number to discard

and examine the log likelihood plot to make sure you've converged.

To run the stratigraphic MCMC model, we call the StratMetropolis14C function. If you're not using radiocarbon ages and want to run stratigraphic model with Gaussian mean age and standard deviation for some number of stratigraphic horizons, then you can set smpl.Age and smpl.Age_sigma directly, but then you'll need to call StratMetropolis instead of StratMetropolis14C

In [4]:
# Configure the stratigraphic Monte Carlo model
config = NewStratAgeModelConfiguration()
# If in doubt, you can probably leave these parameters as-is
config.resolution = 0.2 # Same units as sample height. Smaller is slower!
config.bounding = 0.5 # how far away do we place runaway bounds, as a fraction of total section height
(bottom, top) = extrema(smpl.Height)
npoints_approx = round(Int,length(bottom:config.resolution:top) * (1 + 2*config.bounding))
config.nsteps = 3000 # Number of steps to run in distribution MCMC
config.burnin = 2000*npoints_approx # Number to discard
config.sieve = round(Int,npoints_approx) # Record one out of every nsieve steps

# Run the stratigraphic MCMC model
(mdl, agedist, lldist) = StratMetropolis14C(smpl, config)

# Write the results to file
run(`mkdir -p $(smpl.Path)`) # Make sure that the path exists
writedlm(smpl.Path*"agedist.csv", agedist, ',') # Stationary distribution of the age-depth model
writedlm(smpl.Path*"height.csv", mdl.Height, ',') # Stratigraphic heights corresponding to each row of agedist
writedlm(smpl.Path*"lldist.csv", lldist, ',') # Log likelihood distribution (to check for stationarity)

# Plot the log likelihood to make sure we're converged (n.b burnin isn't recorded)
hdl = plot(lldist,xlabel="Step number",ylabel="Log likelihood",label="",line=(0.85,:darkblue))
savefig(hdl,smpl.Path*"lldist.pdf")
display(hdl)
Generating stratigraphic age-depth model...
Burn-in: 680000 steps
Burn-in...100%|█████████████████████████████████████████| Time: 0:12:54
Collecting sieved stationary distribution: 1020000 steps
Collecting...100%|██████████████████████████████████████| Time: 0:18:54
0 1000 2000 3000 0 5 10 15 20 Step number Log likelihood

Plot results

In [5]:
# Plot results (mean and 95% confidence interval for both model and data)
hdl = plot([mdl.Age_025CI; reverse(mdl.Age_975CI)],[mdl.Height; reverse(mdl.Height)], fill=(minimum(mdl.Height),0.5,:blue), label="model")
plot!(hdl, mdl.Age, mdl.Height, linecolor=:blue, label="")
plot!(hdl, smpl.Age, smpl.Height, xerror=(smpl.Age-smpl.Age_025CI,smpl.Age_975CI-smpl.Age),label="data",seriestype=:scatter,color=:black)
plot!(hdl, xlabel="Age ($(smpl.Age_Unit))", ylabel="Height ($(smpl.Height_Unit))")
savefig(hdl,smpl.Path*"AgeDepthModel.svg")
savefig(hdl,smpl.Path*"AgeDepthModel.pdf")
display(hdl)
7800 7900 8000 8100 8200 8300 8400 -410 -400 -390 -380 -370 -360 Age (Years BP) Height (m) model data
In [6]:
# Stratigraphic height at which to interpolate
height = -404

age_at_height = linterp1s(mdl.Height,mdl.Age,height)
age_at_height_min = linterp1s(mdl.Height,mdl.Age_025CI,height)
age_at_height_max = linterp1s(mdl.Height,mdl.Age_975CI,height)
print("Interpolated age at height=$height: $age_at_height +$(age_at_height_max-age_at_height)/-$(age_at_height-age_at_height_min) $(smpl.Age_Unit)")

# Optional: interpolate full age distribution
interpolated_distribution = Array{Float64}(undef,size(agedist,2))
for i=1:size(agedist,2)
    interpolated_distribution[i] = linterp1s(mdl.Height,agedist[:,i],height)
end
hdl = histogram(interpolated_distribution, nbins=50, fill=(0.75,:darkblue), label="")
plot!(hdl, xlabel="Age ($(smpl.Age_Unit)) at height=$height", ylabel="Likelihood (unnormalized)")
Interpolated age at height=-404: 8213.878377486633 +145.27607341744988/-152.82619915759005 Years BP
Out[6]:
8000 8100 8200 8300 8400 0 50 100 150 Age (Years BP) at height=-404 Likelihood (unnormalized)

There are other things we can plot as well, such as deposition rate:

In [7]:
# Set bin width and spacing
binwidth = round(nanrange(mdl.Age)/10,sigdigits=1) # Can also set manually, commented out below
# binwidth = 100 # Same units as smpl.Age
binoverlap = 10
ages = collect(minimum(mdl.Age):binwidth/binoverlap:maximum(mdl.Age))
bincenters = ages[1+Int(binoverlap/2):end-Int(binoverlap/2)]
spacing = binoverlap

# Calculate rates for the stratigraphy of each markov chain step
dhdt_dist = Array{Float64}(undef, length(ages)-binoverlap, config.nsteps)
@time for i=1:config.nsteps
    heights = linterp1(reverse(agedist[:,i]), reverse(mdl.Height), ages)
    dhdt_dist[:,i] .= abs.(heights[1:end-spacing] - heights[spacing+1:end]) ./ binwidth
end

# Find mean and 1-sigma (68%) CI
dhdt = nanmean(dhdt_dist,dim=2)
dhdt_50p = nanmedian(dhdt_dist,dim=2)
dhdt_16p = nanpctile(dhdt_dist,15.865,dim=2) # Minus 1-sigma (15.865th percentile)
dhdt_84p = nanpctile(dhdt_dist,84.135,dim=2) # Plus 1-sigma (84.135th percentile)
# Other confidence intervals (10:10:50)
dhdt_20p = nanpctile(dhdt_dist,20,dim=2)
dhdt_80p = nanpctile(dhdt_dist,80,dim=2)
dhdt_25p = nanpctile(dhdt_dist,25,dim=2)
dhdt_75p = nanpctile(dhdt_dist,75,dim=2)
dhdt_30p = nanpctile(dhdt_dist,30,dim=2)
dhdt_70p = nanpctile(dhdt_dist,70,dim=2)
dhdt_35p = nanpctile(dhdt_dist,35,dim=2)
dhdt_65p = nanpctile(dhdt_dist,65,dim=2)
dhdt_40p = nanpctile(dhdt_dist,40,dim=2)
dhdt_60p = nanpctile(dhdt_dist,60,dim=2)
dhdt_45p = nanpctile(dhdt_dist,45,dim=2)
dhdt_55p = nanpctile(dhdt_dist,55,dim=2)

# Plot results
hdl = plot(bincenters,dhdt, label="Mean", color=:black, linewidth=2)
plot!(hdl,[bincenters; reverse(bincenters)],[dhdt_16p; reverse(dhdt_84p)], fill=(0,0.2,:darkblue), linealpha=0, label="68% CI")
plot!(hdl,[bincenters; reverse(bincenters)],[dhdt_20p; reverse(dhdt_80p)], fill=(0,0.2,:darkblue), linealpha=0, label="")
plot!(hdl,[bincenters; reverse(bincenters)],[dhdt_25p; reverse(dhdt_75p)], fill=(0,0.2,:darkblue), linealpha=0, label="")
plot!(hdl,[bincenters; reverse(bincenters)],[dhdt_30p; reverse(dhdt_70p)], fill=(0,0.2,:darkblue), linealpha=0, label="")
plot!(hdl,[bincenters; reverse(bincenters)],[dhdt_35p; reverse(dhdt_65p)], fill=(0,0.2,:darkblue), linealpha=0, label="")
plot!(hdl,[bincenters; reverse(bincenters)],[dhdt_40p; reverse(dhdt_60p)], fill=(0,0.2,:darkblue), linealpha=0, label="")
plot!(hdl,[bincenters; reverse(bincenters)],[dhdt_45p; reverse(dhdt_55p)], fill=(0,0.2,:darkblue), linealpha=0, label="")
plot!(hdl,bincenters,dhdt_50p, label="Median", color=:grey, linewidth=1)
plot!(hdl, xlabel="Age ($(smpl.Age_Unit))", ylabel="Depositional Rate ($(smpl.Height_Unit) / $(smpl.Age_Unit) over $binwidth $(smpl.Age_Unit))", fg_color_legend=:white)
savefig(hdl,smpl.Path*"DepositionRateModelCI.pdf")
display(hdl)
7900 8000 8100 8200 0.00 0.05 0.10 0.15 0.20 0.25 Age (Years BP) Depositional Rate (m / Years BP over 100 Years BP) Mean 68% CI Median
  0.197671 seconds (600.74 k allocations: 54.665 MiB, 11.02% gc time)

Stratigraphic model including hiatuses

We can also deal with discrete hiatuses in the stratigraphic section if we know where they are and about how long they lasted. We need some different models and methods though. In particular, in addition to the ChronAgeData struct, we also need a HiatusData struct now, and will have a new hiatusdist output.

In [8]:
config.nsteps = 3000 # Number of steps to run in distribution MCMC
config.burnin = 2000*npoints_approx # Number to discard
# Data about hiatuses
nHiatuses = 2 # The number of hiatuses you have data for
hiatus = NewHiatusData(nHiatuses) # Struct to hold data
hiatus.Height         = [-371.5, -405.0 ]
hiatus.Height_sigma   = [   0.0,    0.0 ]
hiatus.Duration       = [ 100.0,   123.0]
hiatus.Duration_sigma = [  30.5,    20.0]

# Run the model. Note the new `hiatus` argument and `hiatusdist` result
(mdl, agedist, hiatusdist, lldist) = StratMetropolis14C(smpl, hiatus, config); sleep(0.5)

# Plot results (mean and 95% confidence interval for both model and data)
hdl = plot([mdl.Age_025CI; reverse(mdl.Age_975CI)],[mdl.Height; reverse(mdl.Height)], fill=(minimum(mdl.Height),0.5,:blue), label="model")
plot!(hdl, mdl.Age, mdl.Height, linecolor=:blue, label="", fg_color_legend=:white)
plot!(hdl, smpl.Age, smpl.Height, xerror=(smpl.Age-smpl.Age_025CI,smpl.Age_975CI-smpl.Age),label="data",seriestype=:scatter,color=:black)
plot!(hdl, xlabel="Age ($(smpl.Age_Unit))", ylabel="Height ($(smpl.Height_Unit))")
# savefig(hdl,smpl.Path*"AgeDepthModelHiatus.pdf");
display(hdl)
Generating stratigraphic age-depth model...
Burn-in: 680000 steps
Burn-in...100%|█████████████████████████████████████████| Time: 0:13:39
Collecting sieved stationary distribution: 1020000 steps
Collecting...100%|██████████████████████████████████████| Time: 0:21:35
7800 7900 8000 8100 8200 8300 8400 -410 -400 -390 -380 -370 -360 Age (Years BP) Height (m) model data

Getting your data out

As shown below, we have access to the system command line, so we can use the unix command ls to see all the files we have written. We'll try this first using ; to access Julia's command-line shell mode:

In [9]:
; ls MyData
AgeDepthModel.pdf
AgeDepthModel.svg
AgeDepthModelHiatus.pdf
DepositionRateModelCI.pdf
agedist.csv
height.csv
lldist.csv
lldist.pdf

Alternatively, we could do the same thing using the run function:

In [10]:
run(`ls MyData`)
AgeDepthModel.pdf
AgeDepthModel.svg
AgeDepthModelHiatus.pdf
DepositionRateModelCI.pdf
agedist.csv
height.csv
lldist.csv
lldist.pdf
Out[10]:
Process(`ls MyData`, ProcessExited(0))

If you're running this as an online notebook, getting these files out of here may be harder though. For SVG files, one option is to view them in markdown cells, which you should then be able to right click and download as real vector graphics. e.g. pasting something like

<img src="AgeDepthModel.svg" align="left" width="600"/>

in a markdown cell such as this one

Meanwhile, for the csv files we could try something like ; cat agedist.csv, but agedist is probably too big to print. Let's try using ffsend instead, which should give you a download link. In fact, while we're at it, we might as well archive and zip the entire directory!

In [11]:
# Make gzipped tar archive of the the whole MyData directory
run(`tar -zcf archive.tar.gz ./MyData`);
In [14]:
# Download prebuilt ffsend linux binary
download("https://github.com/timvisee/ffsend/releases/download/v0.2.65/ffsend-v0.2.65-linux-x64-static", "ffsend")

# Make ffsend executable
run(`chmod +x ffsend`);
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   622    0   622    0     0   1718      0 --:--:-- --:--:-- --:--:--  1713
100 10.8M  100 10.8M    0     0  3750k      0  0:00:02  0:00:02 --:--:-- 6418k
In [15]:
; ./ffsend upload archive.tar.gz

You could alternatively use the ffsend command in this way to transfer individual files, for instance ; ./ffsend upload MyData/agedist.csv

In [16]:

Keep in mind that any changes you make to this online notebook won't persist after you close the tab (or after it times out) even if you save your changes! You have to either copy-paste or file>Download as a copy.


DOI