import CSV
import TypedTables: Table, columnnames
import Printf: @sprintf, @printf # For converting numbers to strings
import LinearAlgebra: BLAS
BLAS.set_num_threads(1) # BFGS uses BLAS operations, but they are relatively small so it's actually more efficient to run BLAS single-threaded.
Load the isochrone table using CSV.read
into a TypedTables.Table
structure. This contains an isochrone table from PARSEC that has 26 steps in [M/H] and 71 steps in logAge=log10(age [yr])
for a total of 1846 unique isochrones. For this example, this isochrone table actually has more steps in logAge
than we want to use, so we will only use a subset of the available isochrones. In particular, the spacing of ΔlogAge=0.05
dex results in the most recent time bins (logAge < 8) having too little stellar mass to measure robustly.
# Read the whitespace-delimited PARSEC isochrone table into a TypedTables.Table.
# `comment="#"` skips the PARSEC header/comment lines, `delim=' '` with
# `ignorerepeated=true` collapses runs of spaces into a single delimiter, and the
# explicit `header` supplies column names (initial metallicity, [M/H], log age,
# initial/current mass, luminosity, temperature, gravity, evolutionary label,
# TP-AGB quantities, surface abundances, and JWST/NIRCam absolute magnitudes).
table = CSV.read("../../../../Work/Resources/isochrones/parsec/jwst/nircam_nov22/table.dat", Table; comment="#", delim=' ', ignorerepeated=true,
header=["Zini", "MH", "logAge", "Mini", "int_IMF", "Mass", "logL", "logTe", "logg", "label", "McoreTP", "C_O", "period0", "period1", "period2",
"period3", "period4", "pmode", "Mloss", "tau1m", "X", "Y", "Xc", "Xn", "Xo", "Cexcess", "Z", "mbolmag", "F070W", "F090W", "F115W",
"F150W", "F200W", "F277W", "F356W", "F444W", "F150W2", "F322W2", "F140M", "F162M", "F182M", "F210M", "F250M", "F300M", "F335M",
"F360M", "F410M", "F430M", "F460M", "F480M"])
Table with 50 columns and 709192 rows: Zini MH logAge Mini int_IMF Mass logL logTe ⋯ ┌───────────────────────────────────────────────────────────────────────── 1 │ 2.4652e-5 -2.19174 6.6 0.1 1.17111 0.1 -1.436 3.5769 ⋯ 2 │ 2.4652e-5 -2.19174 6.6 0.105145 1.2127 0.105 -1.411 3.5791 ⋯ 3 │ 2.4652e-5 -2.19174 6.6 0.109821 1.24828 0.109 -1.387 3.5809 ⋯ 4 │ 2.4652e-5 -2.19174 6.6 0.124 1.34512 0.124 -1.321 3.5865 ⋯ 5 │ 2.4652e-5 -2.19174 6.6 0.129464 1.37867 0.129 -1.298 3.5885 ⋯ 6 │ 2.4652e-5 -2.19174 6.6 0.140418 1.4407 0.14 -1.252 3.5923 ⋯ 7 │ 2.4652e-5 -2.19174 6.6 0.149107 1.4856 0.149 -1.219 3.595 ⋯ 8 │ 2.4652e-5 -2.19174 6.6 0.183261 1.63379 0.182 -1.1 3.6048 ⋯ 9 │ 2.4652e-5 -2.19174 6.6 0.197245 1.68445 0.196 -1.057 3.6085 ⋯ 10 │ 2.4652e-5 -2.19174 6.6 0.207292 1.71805 0.206 -1.029 3.6111 ⋯ 11 │ 2.4652e-5 -2.19174 6.6 0.210274 1.72762 0.209 -1.021 3.6119 ⋯ 12 │ 2.4652e-5 -2.19174 6.6 0.225868 1.77492 0.225 -0.977 3.6159 ⋯ 13 │ 2.4652e-5 -2.19174 6.6 0.246817 1.83218 0.246 -0.925 3.621 ⋯ 14 │ 2.4652e-5 -2.19174 6.6 0.261505 1.86868 0.261 -0.893 3.6246 ⋯ 15 │ 2.4652e-5 -2.19174 6.6 0.285683 1.92332 0.285 -0.84 3.6305 ⋯ 16 │ 2.4652e-5 -2.19174 6.6 0.307343 1.96739 0.307 -0.796 3.6357 ⋯ 17 │ 2.4652e-5 -2.19174 6.6 0.33773 2.02284 0.337 -0.737 3.6427 ⋯ 18 │ 2.4652e-5 -2.19174 6.6 0.355357 2.05211 0.355 -0.705 3.6465 ⋯ 19 │ 2.4652e-5 -2.19174 6.6 0.366732 2.07002 0.366 -0.685 3.6492 ⋯ 20 │ 2.4652e-5 -2.19174 6.6 0.404464 2.12461 0.404 -0.62 3.6576 ⋯ 21 │ 2.4652e-5 -2.19174 6.6 0.436853 2.16644 0.436 -0.569 3.6643 ⋯ 22 │ 2.4652e-5 -2.19174 6.6 0.453571 2.18648 0.453 -0.544 3.6677 ⋯ 23 │ 2.4652e-5 -2.19174 6.6 0.457007 2.19048 0.457 -0.539 3.6683 ⋯ ⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋱
Check the available columns in the table
columnnames(table)
(:Zini, :MH, :logAge, :Mini, :int_IMF, :Mass, :logL, :logTe, :logg, :label, :McoreTP, :C_O, :period0, :period1, :period2, :period3, :period4, :pmode, :Mloss, :tau1m, :X, :Y, :Xc, :Xn, :Xo, :Cexcess, :Z, :mbolmag, :F070W, :F090W, :F115W, :F150W, :F200W, :F277W, :F356W, :F444W, :F150W2, :F322W2, :F140M, :F162M, :F182M, :F210M, :F250M, :F300M, :F335M, :F360M, :F410M, :F430M, :F460M, :F480M)
unique(table.logAge) # These are `log10(age)`, where the age is in years.
71-element Vector{Float64}: 6.6 6.65 6.7 6.75 6.8 6.85 6.9 6.95 7.0 7.05 7.1 7.15 7.2 ⋮ 9.55001 9.60001 9.65001 9.70001 9.75001 9.80001 9.85001 9.90001 9.95001 10.00001 10.05001 10.10001
For the most metal-poor PARSEC isochrones (e.g., -2.2 [M/H]), there are multiple individual isochrones with the same [M/H] but different Zini.
This can be seen as length(unique(table.Zini)) > length(unique(table.MH)) below.
unique(table.MH) # These are metallicities, [M/H]
26-element Vector{Float64}: -2.19174 -2.1 -2.0 -1.9 -1.8 -1.7 -1.6 -1.5 -1.4 -1.3 -1.2 -1.1 -1.0 -0.9 -0.8 -0.7 -0.6 -0.5 -0.4 -0.3 -0.2 -0.1 0.0 0.1 0.2 0.3
unique(table.Zini)
29-element Vector{Float64}: 2.4652e-5 3.1035e-5 3.9069e-5 4.9184e-5 0.00012351 0.00015547 0.0001957 0.00024632 0.00031003 0.00039019 0.00049103 0.00061788 0.00077741 ⋮ 0.0024432 0.0030686 0.0038518 0.0048313 0.0060543 0.0075779 0.0094713 0.011816 0.01471 0.018261 0.022594 0.027842
For bookkeeping purposes, it is simplest for each isochrone to be identifiable by a unique [M/H], so we will use the Zini
values to calculate new [M/H] values. These are now initial metallicities.
import StarFormationHistories: MH_from_Z
# Rewrite the [M/H] column in place, recomputing each entry from the initial
# metallicity Zini (with solar Z = 0.01471) so that every isochrone carries a
# unique [M/H]. Broadcasting over the columns replaces the explicit row loop.
table.MH .= MH_from_Z.(table.Zini, 0.01471)
unique(table.MH)
Precompiling StarFormationHistories ✓ StarFormationHistories ✓ StarFormationHistories → DataFramesExt ✓ StarFormationHistories → TypedTablesExt 3 dependencies successfully precompiled in 6 seconds. 223 already precompiled.
29-element Vector{Float64}: -2.800020899214084 -2.7000110527475605 -2.6000175983197713 -2.5000097946519406 -2.100012070548602 -2.0000162211614816 -1.9000073196768041 -1.8000170782347085 -1.7000105737090876 -1.6000092093397091 -1.5000150464034467 -1.4000147226611754 -1.3000117782255178 ⋮ -0.8000106239270766 -0.7000142637672999 -0.60001623283198 -0.5000166309399301 -0.40001148522876795 -0.3000154566956319 -0.20001378845202145 -0.10003154653740931 0.0 0.09998742559427609 0.1999864995269065 0.299992911269662
We'll construct a model Hess diagram from the populations in this isochrone table. First we need to set up our observational models for photometric error and incompleteness. You would typically measure these from artificial star tests but we'll make up some models here using functions built into StarFormationHistories.jl.
import StarFormationHistories: Martin2016_complete, exp_photerr
distmod = 25.0 # Distance modulus
# Completeness models: recovered fraction as a function of apparent magnitude in
# each band. The arguments after `m` are presumably (normalization, 50%-complete
# magnitude, sharpness) for the Martin et al. 2016 logistic-like form --
# NOTE(review): confirm against the `Martin2016_complete` docstring.
F090W_complete(m) = Martin2016_complete(m,1.0,28.5,0.7)
F150W_complete(m) = Martin2016_complete(m,1.0,27.5,0.7)
# Photometric error models: exponentially growing error with magnitude, capped at
# 0.4 mag (mimicking a typical catalog error cut; see discussion below -- letting
# the error grow unboundedly hurts template-construction efficiency).
F090W_error(m) = min( exp_photerr(m, 1.03, 15.0, 36.0, 0.02), 0.4 )
F150W_error(m) = min( exp_photerr(m, 1.03, 15.0, 35.0, 0.02), 0.4 );
We'll make a few plots showing the adopted models. Below we set up PyPlot.jl (a wrapper for Python's matplotlib package).
import PyPlot as plt
import PyPlot: @L_str # For LatexStrings
import PyCall: @pyimport
plt.rc("text", usetex=true)
plt.rc("text.latex", preamble="\\usepackage{amsmath}") # for \text
plt.rc("font", family="serif", serif=["Computer Modern"], size=14)
plt.rc("figure", figsize=(5,5))
plt.rc("patch", linewidth=1, edgecolor="k", force_edgecolor=true)
# https://matplotlib.org/stable/gallery/images_contours_and_fields/interpolation_methods.html
plt.rc("image", interpolation="none")
plotmags=20.0:0.05:33.0
fig,ax1=plt.subplots()
ax1.plot(plotmags,F090W_complete.(plotmags),label="F090W")
ax1.plot(plotmags,F150W_complete.(plotmags),label="F150W")
ax1.set_xlabel("Apparent Magnitude")
ax1.set_ylabel("Completeness")
ax1.legend(loc="lower left")
plotmags=20.0:0.05:33.0
fig,ax1=plt.subplots()
ax1.plot(plotmags,F090W_error.(plotmags),label="F090W")
ax1.plot(plotmags,F150W_error.(plotmags),label="F150W")
ax1.set_xlabel("Apparent Magnitude")
ax1.set_ylabel("Photometric Error")
ax1.legend(loc="upper left")
ax1.set_ylim([0.0,ax1.get_ylim()[2]])
(0.0, 0.41895850114873723)
We put a maximum on the magnitude error in the model because most of the time one would place a cut on the photometric catalog in either signal to noise (e.g., SNR>5) or magnitude error (e.g., error < 0.4). From a practical perspective, if we allow the photometric error model to grow too large, it will hurt the efficiency of partial_cmd_smooth
for template creation later.
We may now define our mock star formation history. We will use partial_cmd_smooth
to construct the templates for a Hess diagram model with F150W on the y-axis and F090W-F150W on the x-axis, with an initial mass function model from InitialMassFunctions.jl. We'll construct all 1846 templates, one for every isochrone in our table, but this may not always be necessary.
import StarFormationHistories: partial_cmd_smooth
import InitialMassFunctions: Kroupa2001
# Some additional setup
edges = (range(-0.65, 1.75, length=100), range(distmod-7.0, distmod+3.0, length=100)) # The bin edges for the Hess diagrams (17,28)
imf = Kroupa2001(0.08, 100.0) # Initial mass function model
unique_logAge = unique(table.logAge)
unique_MH = unique(table.MH)
template_norm = 1e3 # The stellar mass of the populations in each template
# Now define the subset of unique logAges for which isochrones are available
# that we actually want to use for our example.
# As stated above, the isochrone file we read in has a uniform grid spacing of
# `ΔlogAge=0.05`. We will only use a spacing of 0.1 dex for logAge < 9 if isochrone_cut is true.
isochrone_cut::Bool = false
if isochrone_cut
logAge_cutval = 9
# Below (and at) the cut, take every other grid point -- `[begin:2:end]` halves
# the sampling to 0.1 dex -- while above the cut the full 0.05 dex grid is kept.
# `vcat` preserves the original ascending order since both index sets are sorted.
selection = vcat( findall( <=(logAge_cutval), unique_logAge)[begin:2:end],
findall( >(logAge_cutval), unique_logAge) )
unique_logAge = unique_logAge[selection]
end
@printf("Number of final isochrones is: %i",length(unique_logAge) * length(unique_MH))
Number of final isochrones is: 2059
# # Constructing the templates; single-threaded with push!
# templates = Vector{Matrix{Float64}}(undef,0)
# template_logAge = Vector{Float64}(undef,0)
# template_MH = Vector{Float64}(undef,0)
# # We don't strictly need to save the initial masses (m_ini) or the isochrone magnitudes (iso_mags),
# # but we'll use them later in this example so we'll save them now.
# template_mini = Vector{Vector{Float64}}(undef,0)
# template_isomags = Vector{Vector{Vector{Float64}}}(undef,0)
# for logage in unique_logAge
# for mh in unique_MH
# # Combination of a logage and an MH defines a unique isochrone.
# # Pick out all entries in `table` that match this combination; these will be row indices
# # into the table of the stars in this specific isochrone.
# local good = findall( (table.MH .== mh) .& (table.logAge .== logage) )
# # Chop off the last entry in the isochrone because its the 30 mag weird thing that parsec does.
# local m_ini = table.Mini[good][begin:end-1] # These are the initial masses of the stars in the isochrone, in solar masses.
# push!(template_mini, m_ini)
# local iso_mags = [table.F090W[good][begin:end-1], table.F150W[good][begin:end-1]] # These are the absolute magnitudes we want from the isochrone.
# push!(template_isomags, iso_mags)
# # Create template and push
# push!(templates, partial_cmd_smooth( m_ini, iso_mags, [F090W_error, F150W_error], 2, [1,2], imf, [F090W_complete, F150W_complete];
# dmod=distmod, normalize_value=template_norm, edges=edges).weights )
# push!(template_logAge, logage)
# push!(template_MH, mh)
# end
# end
# Constructing the templates; multi-threaded with pre-allocated arrays
templates = Vector{Matrix{Float64}}(undef, length(unique_logAge) * length(unique_MH))
template_logAge = Vector{Float64}(undef, length(templates))
template_MH = Vector{Float64}(undef, length(templates))
# We don't strictly need to save the initial masses (m_ini) or the isochrone magnitudes (iso_mags),
# but we'll use them later in this example so we'll save them now.
template_mini = Vector{Vector{Float64}}(undef, length(templates))
template_isomags = Vector{Vector{Vector{Float64}}}(undef, length(templates))
# Multi-threaded, nested loop iterating over unique_MH (outer loop) and unique_logAge (inner loop).
# Using `enumerate` also gets us index `i` of iteration.
# `Iterators.product(unique_MH, unique_logAge)` varies its first argument (MH)
# fastest, so consecutive `i` step through all metallicities at fixed logAge;
# the linear-index arithmetic used later (e.g. `length(unique_MH)*(age_idx-1) + mh_idx`)
# relies on this layout. `collect` is needed because `@threads` requires an
# indexable collection. Each iteration writes only to its own index `i` of the
# preallocated arrays, so there is no data race despite the shared storage.
Base.Threads.@threads for (i, (mh, logage)) in collect(enumerate(Iterators.product(unique_MH, unique_logAge)))
# Combination of a logage and an MH defines a unique isochrone.
# Pick out all entries in `table` that match this combination; these will be row indices
# into the table of the stars in this specific isochrone.
# (Note: this scans the full ~709k-row table once per isochrone.)
local good = findall( (table.MH .== mh) .& (table.logAge .== logage) )
# Chop off the last star because its the 30 mag weird thing that parsec does.
local m_ini = table.Mini[good][begin:end-1] # These are the initial masses of the stars in the isochrone, in solar masses.
template_mini[i] = m_ini
local iso_mags = [table.F090W[good][begin:end-1], table.F150W[good][begin:end-1]] # These are the absolute magnitudes we want from the isochrone.
template_isomags[i] = iso_mags
# Create templates
# Build the smoothed Hess-diagram template: y-mag is band 2 (F150W), color is
# band 1 - band 2 (F090W-F150W), normalized to `template_norm` solar masses.
templates[i] = partial_cmd_smooth(m_ini, iso_mags, [F090W_error, F150W_error], 2, [1,2], imf, [F090W_complete, F150W_complete];
dmod=distmod, normalize_value=template_norm, edges=edges).weights
template_logAge[i] = logage
template_MH[i] = mh
end
# Write single isochrone to file
# Some extra info here https://discourse.julialang.org/t/formatting-float64-output-with-csv-write/23479
# let idx = length(unique_MH)*70 + findfirst(==(first(unique_MH)), unique_MH) # 1821
# `idx` is the linear template index for the logAge grid point whose age is
# closest to 12.59 Gyr (the `findmin(...)[2]` term gives its position in
# `unique_logAge`) combined with the first (most metal-poor) [M/H], using the
# MH-fastest layout established by `Iterators.product` above.
let idx = length(unique_MH)*(findmin( x -> abs(exp10(x-9) - 12.59), unique_logAge)[2] - 1) + findfirst(==(first(unique_MH)), unique_MH) # 1821
logage = template_logAge[idx]
mh = template_MH[idx]
# Print the (logAge, [M/H]) of the selected template as a sanity check.
println(logage, " ", mh)
# local good = findall((table.MH .== mh) .& (table.logAge .== logage))
# open("isochrone.txt","w") do file
# print(file, "m_ini F090W F150W F277W\n")
# for i in good[begin:end-1]
# @printf(file, "%.10f %.3f %.3f %.3f\n", table.Mini[i], table.F090W[i], table.F150W[i], table.F277W[i])
# end
# end
end
10.10001 -2.800020899214084
# Reorder the templates and their metadata into ascending logAge order so that
# downstream code has a guaranteed sorted-by-age layout.
permidx = sortperm(template_logAge)
template_logAge, template_MH, templates = template_logAge[permidx], template_MH[permidx], templates[permidx];
We'll plot a few templates for reference.
# When displaying the templates we will divide them by template_norm so that that units of the templates are
# [N / M_sun], or expected number of stars per bin per solar mass of stars formed.
# Plotting in absolute magnitude here rather than apparent on y-axis
# display_templates = (100,1495,1821) # Before option of using isochrone_cut
display_templates = ( length(unique_MH)*(findmin( x -> abs(x - 6.8), unique_logAge)[2] - 1) +
findmin( x -> abs(x - (-0.1)), unique_MH)[2],
length(unique_MH)*(findmin( x -> abs(exp10(x-9) - 2.82), unique_logAge)[2] - 1) +
findmin( x -> abs(x - (-1.0)), unique_MH)[2],
length(unique_MH)*(findmin( x -> abs(exp10(x-9) - 12.59), unique_logAge)[2] - 1) +
findmin( x -> abs(x - (-2.8)), unique_MH)[2] )
# findfirst(==(first(unique_MH)), unique_MH) )
fig,axs=plt.subplots(1,3,sharey=true,sharex=true,figsize=(15,5))
fig.subplots_adjust(hspace=0.0,wspace=0.075)
im1 = axs[1].imshow(permutedims(templates[display_templates[1]]) ./ template_norm, origin="lower",
extent=(extrema(edges[1])..., extrema(edges[2] .- distmod)...),
aspect="auto", cmap="Greys", norm=plt.matplotlib.colors.LogNorm(vmin=1e-8), rasterized=true)
im2 = axs[2].imshow(permutedims(templates[display_templates[2]]) ./ template_norm, origin="lower",
extent=(extrema(edges[1])..., extrema(edges[2] .- distmod)...),
aspect="auto", cmap="Greys", norm=plt.matplotlib.colors.LogNorm(vmin=1e-8), rasterized=true)
im3 = axs[3].imshow(permutedims(templates[display_templates[3]]) ./ template_norm, origin="lower",
extent=(extrema(edges[1])..., extrema(edges[2] .- distmod)...),
aspect="auto", cmap="Greys", norm=plt.matplotlib.colors.LogNorm(vmin=1e-8), rasterized=true)
# im1.set_clim( [0.5, im1.get_clim()[2]] )
for ax in axs
ax.set_xlabel(L"F090W$-$F150W")
end
axs[1].set_ylabel("F150W")
axs[1].set_ylim(reverse(extrema(edges[2])).-distmod)
axs[1].set_xlim(extrema(edges[1]))
# Label the templates with their ages and metallicities
axs[1].text(0.95,0.95,(@sprintf "Age: %.2f Gyr \n [M/H]: %.1f" exp10(template_logAge[display_templates[1]])/1e9 template_MH[display_templates[1]]), transform=axs[1].transAxes,ha="right",va="top")
axs[2].text(0.1,0.95,(@sprintf "Age: %.2f Gyr \n [M/H]: %.1f" exp10(template_logAge[display_templates[2]])/1e9 template_MH[display_templates[2]]), transform=axs[2].transAxes,ha="left",va="top")
axs[3].text(0.1,0.95,(@sprintf "Age: %.2f Gyr \n [M/H]: %.1f" exp10(template_logAge[display_templates[3]])/1e9 template_MH[display_templates[3]]), transform=axs[3].transAxes,ha="left",va="top")
fig.colorbar(im1, ax=axs[1], pad=0.01)
fig.colorbar(im2, ax=axs[2], pad=0.01)
fig.colorbar(im3, ax=axs[3], pad=0.01)
# Optionally, plot photometric error bars along the left. Errors are small in this model so they don't show very well.
plot_errors::Bool = false
if plot_errors
# Set up points to show the photometric errors
error_points = minimum(edges[2])+1.0:0.5:maximum(edges[2])-0.5
error_points_x = repeat([minimum(edges[1])+0.15], length(error_points))
# Plot the photometric error points
for ax in axs
ax.errorbar(error_points_x, error_points, yerr=F150W_error.(error_points), xerr=sqrt.(F090W_error.(error_points_x .+ error_points).^2 .+ F150W_error.(error_points).^2), ls="") # , fmt='', ecolor=None, elinewidth=None, capsize=None
end
end
# Optionally, overplot the isochrones themselves for comparison
plot_isochrones::Bool = true
if plot_isochrones
alphas = (0.5,0.5,0.5)
for (i,t_i) in enumerate(display_templates)
local good = findall( (table.MH .== template_MH[t_i]) .& (table.logAge .== template_logAge[t_i]) )
local iso_mags = [table.F090W[good][begin:end-1], table.F150W[good][begin:end-1]] # These are the absolute magnitudes we want from the isochrone.
axs[i].scatter(iso_mags[1] .- iso_mags[2], iso_mags[2],marker=".", c="orange", s=1, alpha=alphas[i])
end
end
# plt.savefig("template_example.pdf",bbox_inches="tight")
We may now use construct_x0_mdf
and calculate_coeffs_mdf
to create a mock star formation history with constant star formation rate and a reasonable metallicity evolution for a dwarf galaxy.
import StarFormationHistories: construct_x0_mdf, calculate_coeffs_mdf
# Set overall stellar mass of complex population and metallicity evolution variables
stellar_mass = 1e7
α, β, σ = 0.1, -1.87, 0.1 # -0.1, -0.5, 0.1
# max_logage sets the highest bin edge for the template_logAge
# if template_logage was [6.6,6.7,6.8], I would set max_logAge=6.9,
# representing the rightmost bin edge. In our case, our oldest template
# is quite old, so we'll make it either the Hubble time or the maximum of
# template_logAge plus the isochrone logage grid spacing, which is 0.05.
# max_logAge = min(log10(13.7e9), maximum(template_logAge) + 0.05)
T_max = min(13.7, exp10(maximum(unique_logAge)+ 0.05)/1e9)
max_logAge = log10(T_max * 1e9)
# Get initial guess for stellar mass coefficients.
# We have to divide `stellar_mass` by `template_norm` here because `template_norm` is the total amount of stellar mass in each template,
# since we passed `normalize_value=template_norm` when constructing the templates with `partial_cmd_smooth`. As such, when computing
# coefficients, we need to normalize out the adopted `normalize_value` for the templates.
x0_mdf = construct_x0_mdf(template_logAge, T_max; normalize_value=stellar_mass / template_norm)
71-element Vector{Float64}: 0.3546754852763317 0.39795243976854205 0.44650998135479464 0.5009924391101048 0.5621227621472895 0.6307121127124647 0.7076706298151519 0.7940195062200948 0.8909045390546783 0.9996113338407332 1.1215823636987226 1.2584361100895634 1.4119885390804652 ⋮ 316.1121376527206 354.68365207520725 397.9616030676007 446.5202627454242 501.00397502014556 562.1357056512223 630.726635562751 707.6869247211959 794.0377894053578 890.9250531259797 999.6343510072705 810.7874311842759
# Check sum of x0_mdf * template_norm; should equal stellar_mass
sum(x0_mdf) * template_norm
1.0000000000000002e7
The above coefficients are the amount of stellar mass formed in each age bin of unique(template_logAge)
. To construct the complex Hess diagram, we need individual per-template weights, which we can get with calculate_coeffs_mdf
.
x0 = calculate_coeffs_mdf(x0_mdf, template_logAge, template_MH, T_max, α, β, σ)
2059-element Vector{Float64}: 2.0774487661255844e-116 1.2259099971069627e-106 2.651625167725607e-97 2.1162850803717393e-88 3.8714448827814896e-57 2.0772592204779136e-50 4.1081211720955453e-44 2.981060528961156e-38 7.975095776546787e-33 7.84340370564942e-28 2.8356607021243346e-23 3.773991167821277e-19 1.848166988977186e-15 ⋮ 3.479788626121185e-18 1.445217171479717e-22 2.2078339187838887e-27 1.2406132585250885e-32 2.562643689989885e-38 1.9497482102204798e-44 5.452893528736057e-51 5.628156079458223e-58 2.119330307269925e-65 2.9581805702825195e-73 1.5159136200760106e-81 2.853524813787372e-90
Let's plot the cumulative SFH and mean [M/H] of our model population.
import StarFormationHistories: calculate_cum_sfr
unique_template_logAge, cum_sfr_arr, sfr_arr, mean_mh_arr = calculate_cum_sfr(x0, template_logAge, template_MH, T_max; normalize_value=template_norm)
([6.6, 6.65, 6.7, 6.75, 6.8, 6.85, 6.9, 6.95, 7.0, 7.05 … 9.65001, 9.70001, 9.75001, 9.80001, 9.85001, 9.90001, 9.95001, 10.00001, 10.05001, 10.10001], [1.0, 0.9999645324514724, 0.9999247372074956, 0.9998800862093601, 0.9998299869654491, 0.9997737746892345, 0.9997107034779632, 0.9996399364149818, 0.9995605344643599, 0.9994714440104544 … 0.6741419731491225, 0.6343458128423624, 0.58969378656782, 0.5395933890658053, 0.48337981850068307, 0.42030715494440796, 0.3495384624722884, 0.27013468353175263, 0.1810421782191546, 0.08107874311842758], [0.0007301391778410239, 0.0007301391778410237, 0.0007301391778410242, 0.000730139177841024, 0.0007301391778410239, 0.0007301391778410237, 0.0007301391778410242, 0.0007301391778410241, 0.0007301391778410239, 0.0007301391778410242 … 0.0007301391778410242, 0.000730139177841024, 0.0007301391778410241, 0.000730139177841024, 0.000730139177841024, 0.0007301391778410241, 0.000730139177841024, 0.0007301391778410241, 0.0007301391778410239, 0.0007301391778410239], [-0.5003989802960739, -0.5004475580599371, -0.5005020632066869, -0.5005632189862201, -0.5006318368982151, -0.5007088274602199, -0.5007952122896429, -0.5008921376599632, -0.5010008897110452, -0.5011229115153801 … -0.9466925595632898, -1.0011998230888994, -1.0623590537122782, -1.1309764552665305, -1.2079609175110564, -1.2943433956348684, -1.391272022482317, -1.5000242679783078, -1.6220417785243162, -1.7589454984711883])
fig,axs=plt.subplots(nrows=1,ncols=2,sharex=true,sharey=false,figsize=(10,5))
fig.subplots_adjust(hspace=0.0,wspace=0.35)
axs[1].plot( vcat(exp10.(unique_template_logAge)./1e9, T_max), vcat(cum_sfr_arr, 0.0), label="Input SFH" )
axs[1].set_xlim([T_max,-0.1])
axs[1].set_ylim([0.0,1.1])
axs[1].set_xlabel("Lookback Time [Gyr]")
axs[1].set_ylabel("Cumulative SF")
axs[1].legend()
axs[2].plot( vcat(exp10.(unique_template_logAge)./1e9, T_max), vcat(mean_mh_arr, β), label="Input SFH" )
axs[2].set_xlabel("Lookback Time [Gyr]")
axs[2].set_ylabel(L"$\langle$[M/H]$\rangle$")
PyObject Text(457.3650849792141, 0.5, '$\\langle$[M/H]$\\rangle$')
# Look at the unique values of the SFRs
unique(sfr_arr)
7-element Vector{Float64}: 0.0007301391778410239 0.0007301391778410237 0.0007301391778410242 0.000730139177841024 0.0007301391778410241 0.0007301391778410238 0.0007301391778410244
fig,ax1 = plt.subplots()
ax1.bar(unique_template_logAge, sfr_arr .* 1e3; width=diff(vcat(unique_template_logAge,max_logAge)), align="edge", label="Input SFH")
ax1.set_xlabel(L"log$_{10}$(Age [yr])")
ax1.set_ylabel(L"SFR [$10^{-3}$ M$_\odot$ / yr]")
ax1.set_ylim([0.0, ax1.get_ylim()[2]])
ax1.legend()
PyObject <matplotlib.legend.Legend object at 0x7fc93f0d2620>
Now let's make a model Hess diagram with this SFH. We can simply write composite = sum( coeffs .* models)
if we want, but we also provide the composite!
function that is more efficient.
model1 = sum( x0 .* templates)
99×99 Matrix{Float64}: 0.0 0.0 0.0 … 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 … 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 4.7461e-105 1.2101e-96 2.13242e-96 0.0 0.0 1.62227e-8 4.39528e-8 9.23015e-8 0.0 0.0 6.47261e-7 1.63819e-6 2.89898e-6 … 2.35262e-60 0.0 1.30993e-5 2.78702e-5 4.50146e-5 6.37266e-28 1.51965e-30 0.000131899 0.000234432 0.000354662 1.34935e-16 1.78572e-19 ⋮ ⋱ 0.0946179 0.126277 0.166761 7.13111e-5 5.41444e-5 0.120871 0.161966 0.205216 5.60572e-5 4.21829e-5 0.155129 0.208495 0.24635 4.12576e-5 3.09016e-5 0.192861 0.255337 0.264476 … 2.84541e-5 2.12879e-5 0.213438 0.259621 0.24085 1.84096e-5 1.37977e-5 0.190311 0.216917 0.201787 1.11875e-5 8.41799e-6 0.143899 0.167347 0.21799 6.39379e-6 4.83577e-6 0.112542 0.145776 0.315974 3.44096e-6 2.61572e-6 0.101706 0.162612 0.431167 … 1.74606e-6 1.33199e-6 0.104666 0.218471 0.522809 8.36611e-7 6.38481e-7 0.116075 0.283444 0.547689 3.7942e-7 2.88684e-7 0.141261 0.342103 0.502006 1.64316e-7 1.25094e-7
model2 = similar(model1)
import StarFormationHistories: composite!
composite!(model2, x0, templates)
model2
99×99 Matrix{Float64}: 0.0 0.0 0.0 … 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 … 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 4.7461e-105 1.2101e-96 2.13242e-96 0.0 0.0 1.62227e-8 4.39528e-8 9.23015e-8 0.0 0.0 6.47261e-7 1.63819e-6 2.89898e-6 … 2.35262e-60 0.0 1.30993e-5 2.78702e-5 4.50146e-5 6.37266e-28 1.51965e-30 0.000131899 0.000234432 0.000354662 1.34935e-16 1.78572e-19 ⋮ ⋱ 0.0946179 0.126277 0.166761 7.13111e-5 5.41444e-5 0.120871 0.161966 0.205216 5.60572e-5 4.21829e-5 0.155129 0.208495 0.24635 4.12576e-5 3.09016e-5 0.192861 0.255337 0.264476 … 2.84541e-5 2.12879e-5 0.213438 0.259621 0.24085 1.84096e-5 1.37977e-5 0.190311 0.216917 0.201787 1.11875e-5 8.41799e-6 0.143899 0.167347 0.21799 6.39379e-6 4.83577e-6 0.112542 0.145776 0.315974 3.44096e-6 2.61572e-6 0.101706 0.162612 0.431167 … 1.74606e-6 1.33199e-6 0.104666 0.218471 0.522809 8.36611e-7 6.38481e-7 0.116075 0.283444 0.547689 3.7942e-7 2.88684e-7 0.141261 0.342103 0.502006 1.64316e-7 1.25094e-7
model1 ≈ model2
true
Let's plot the smooth model.
fig,ax1=plt.subplots()
im1=ax1.imshow(permutedims(model2), origin="lower",
extent=(extrema(edges[1])..., extrema(edges[2])...),
aspect="auto", cmap="Greys", norm=plt.matplotlib.colors.LogNorm(vmin=2.5), rasterized=true)
ax1.set_xlabel(L"F090W$-$F150W")
ax1.set_ylabel("F150W")
ax1.set_ylim(reverse(extrema(edges[2])))
ax1.set_xlim(extrema(edges[1]))
fig.colorbar(im1)
PyObject <matplotlib.colorbar.Colorbar object at 0x7fc93f39a440>
The templates are constructed such that the 2D histogram pixel values give the expected number of stars in that bin for a population with total stellar mass normalize_value
(in our case we defined the variable template_norm=1e3
and used that as our normalize_value
for the templates). As such the pixel values of this complex model also represent expected numbers of stars per bin for our defined SFH. We may, therefore, simply Poisson sample it to obtain a reasonable observational realization of the Hess diagram:
import Distributions: Poisson
import Random: AbstractRNG, default_rng
"""
    poisson_sample!(sample, model, rng=default_rng())

Fill `sample` in place with independent Poisson draws whose means are the
corresponding entries of `model`, and return `sample`. Here the Hess-diagram
model pixels give expected star counts per bin, so this produces a mock
"observed" realization of the model.
"""
function poisson_sample!(sample::T, model::S, rng::AbstractRNG=default_rng()) where {T <: AbstractArray{<:Number}, S <: AbstractArray{<:Number}}
    # Explicit input validation rather than `@assert`, which may be disabled at
    # higher optimization levels and is meant for internal invariants only.
    axes(sample) == axes(model) || throw(DimensionMismatch("`sample` and `model` must have identical axes"))
    for i in eachindex(sample, model)
        sample[i] = rand(rng, Poisson(model[i]))
    end
    # Julia convention: mutating functions return the mutated argument.
    return sample
end

"""
    poisson_sample(model, rng=default_rng())

Allocating variant of [`poisson_sample!`](@ref); returns a new array with the
same shape and element type as `model` containing the Poisson-sampled values.
"""
poisson_sample(model::AbstractArray{<:Number}, rng::AbstractRNG=default_rng()) = poisson_sample!(similar(model), model, rng)
model3 = poisson_sample(model2)
99×99 Matrix{Float64}: 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 … 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 … 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 … 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ⋮ ⋮ ⋱ ⋮ 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 2.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 1.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 3.0 0.0 … 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 1.0 1.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 3.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 2.0 1.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 3.0 1.0 0.0 0.0 … 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 1.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 1.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
fig,axs=plt.subplots(nrows=1,ncols=2,sharex=true,sharey=true,figsize=(10,5))
fig.subplots_adjust(hspace=0.0,wspace=0.0)
im1=axs[1].imshow(permutedims(model2), origin="lower",
extent=(extrema(edges[1])..., extrema(edges[2])...),
aspect="auto", cmap="Greys", norm=plt.matplotlib.colors.LogNorm(vmin=2.5), rasterized=true)
axs[1].text(0.1,0.9,"Smooth",transform=axs[1].transAxes)
axs[2].imshow(permutedims(model3), origin="lower",
extent=(extrema(edges[1])..., extrema(edges[2])...),
aspect="auto", cmap="Greys", norm=plt.matplotlib.colors.LogNorm(vmin=2.5,vmax=im1.get_clim()[2]), rasterized=true)
axs[2].text(0.1,0.9,"Poisson-Sampled",transform=axs[2].transAxes)
axs[1].set_xlabel(L"F090W$-$F150W")
axs[2].set_xlabel(L"F090W$-$F150W")
axs[1].set_ylabel("F150W")
axs[1].set_ylim(reverse(extrema(edges[2])))
axs[1].set_xlim(extrema(edges[1]))
fig.colorbar(im1, ax=axs, pad=0.005)
PyObject <matplotlib.colorbar.Colorbar object at 0x7fc93e7fbc40>
To sample the color-magnitude diagram (CMD) directly we could use generate_stars_mass_composite
,
import StarFormationHistories: generate_stars_mass_composite, NoBinaries
@time starcat = generate_stars_mass_composite(template_mini, template_isomags, ["F090W", "F150W"], stellar_mass, x0, imf;
dist_mod=distmod, binary_model=NoBinaries())
2.242130 seconds (5.07 M allocations: 1.097 GiB, 196.20% compilation time)
(Vector{StaticArraysCore.SVector{1, Float64}}[[[0.9083666536495625]], [[0.25784097909029025]], [[0.4508529121717616]], [[0.28918946166001386]], [], [[0.09860755299372617]], [[0.1157098623405801]], [[0.2865831715946679]], [[0.2951919349799215]], [[0.2103895528927366]] … [[0.24900826706006152]], [[0.15494321020253296]], [[0.3214294053167704]], [[0.10638255140558214]], [[0.7311418854198203]], [], [], [[0.3545041482558462]], [], []], Vector{StaticArraysCore.SVector{2, Float64}}[[[28.83548412583513, 28.16251701748439]], [[31.06845592701241, 30.084712142433734]], [[30.223431238843848, 29.35896777074739]], [[30.906001402920687, 29.930591665332777]], [], [[32.43334833524872, 31.28259315970932]], [[32.240249479821024, 31.09100820713627]], [[30.984995382267556, 29.903241962365932]], [[30.962172414592047, 29.86352478521305]], [[31.465621253685427, 30.319936011512176]] … [[33.89684858274402, 32.71972532675991]], [[35.14580340366173, 33.8737125440467]], [[33.463873172957875, 32.26021030807795]], [[36.666138988989545, 35.08847735462781]], [[30.082443423719994, 29.26735648185978]], [], [], [[33.59812899222108, 32.22392154871349]], [], []])
We'll concatenate the per-population samples returned from generate_stars_mass_composite
into a single Vector{SVector}
.
# starcat_mags = reduce(hcat,reduce(vcat,starcat[2])) # To make a 2D matrix
starcat_mags = reduce(vcat,starcat[2])
14745458-element Vector{StaticArraysCore.SVector{2, Float64}}: [28.83548412583513, 28.16251701748439] [31.06845592701241, 30.084712142433734] [30.223431238843848, 29.35896777074739] [30.906001402920687, 29.930591665332777] [32.43334833524872, 31.28259315970932] [32.240249479821024, 31.09100820713627] [30.984995382267556, 29.903241962365932] [30.962172414592047, 29.86352478521305] [31.465621253685427, 30.319936011512176] [32.54314514411157, 31.33845329442149] [30.60402777374194, 29.481417207688807] [28.757377295246364, 28.041360258870604] [30.327305390926554, 29.200307596906036] ⋮ [32.56969918456107, 31.522742477931146] [28.811678399368624, 28.287770165551606] [33.32266329800454, 32.21964529419607] [35.28428863798872, 34.08671293898492] [35.536530349227604, 34.30528600320833] [34.2038303134102, 33.03459815445435] [33.89684858274402, 32.71972532675991] [35.14580340366173, 33.8737125440467] [33.463873172957875, 32.26021030807795] [36.666138988989545, 35.08847735462781] [30.082443423719994, 29.26735648185978] [33.59812899222108, 32.22392154871349]
# Report summary statistics for the sampled stellar catalog.
msg_count = @sprintf("Number of sampled stars is %.2e", length(starcat_mags))
println(msg_count)
msg_birth = @sprintf("Stellar mass of sampled system at birth was %.2e", stellar_mass)
println(msg_birth)
# starcat[1] holds the per-population stellar-mass samples as 1-element SVectors;
# flatten them, sum, and unwrap the SVector to get the present-day total.
msg_now = @sprintf("Present-day stellar mass of sampled system is %.2e", sum(reduce(vcat, starcat[1]))[1])
println(msg_now)
import StarFormationHistories: mag2flux, flux2mag
# Sum the flux of every star in the first band (F090W), convert the total back
# to a magnitude, and subtract the distance modulus for the absolute magnitude.
flux_F090W = sum(x -> mag2flux(first(x)), starcat_mags)
println(@sprintf("Absolute magnitude of sampled system is M_{F090W} = %.2f", flux2mag(flux_F090W) - distmod))
Number of sampled stars is 1.47e+07 Stellar mass of sampled system at birth was 1.00e+07 Present-day stellar mass of sampled system is 4.61e+06 Absolute magnitude of sampled system is M_{F090W} = -13.29
The number of sampled stars is very large as we have not imposed a faint-end magnitude cut, which is supported through the mag_lim
and mag_lim_name
keyword arguments. However, the sampling is still very fast.
Now we'll mock observe the "pure" catalog with the same observational models we used to construct the templates using model_cmd
.
# Construct new / read cached catalog (for reproducibility)
# When `read_obs_mags` is true, load a previously-written mock-observed catalog
# from disk so downstream results are reproducible run-to-run; otherwise create
# a fresh realization by applying the observational error and completeness
# models to the "pure" sampled magnitudes.
import StarFormationHistories: model_cmd
read_obs_mags::Bool = true
if read_obs_mags
# Cached file has one row per star with columns F090W, F150W; transpose into a
# 2 x Nstars matrix (row 1 = F090W, row 2 = F150W) to match the generated case.
obs_mags = CSV.read(joinpath(@__DIR__,".ipynb_checkpoints/obs_mags.txt"), Table; delim=' ', header=1)
obs_mags = permutedims([obs_mags.F090W obs_mags.F150W])
else
# Apply observational effects to "pure" catalog
obs_mags = model_cmd( starcat_mags, [F090W_error, F150W_error], [F090W_complete, F150W_complete])
# Concatenate into 2D matrix
obs_mags = reduce(hcat,obs_mags)
# Optionally write out to file; use .ipynb_checkpoints to prevent commit to github
write_obs_mags::Bool = false
if write_obs_mags
CSV.write(joinpath(@__DIR__,".ipynb_checkpoints/obs_mags.txt"), Table(F090W=obs_mags[1,:], F150W=obs_mags[2,:]); delim=' ')
end
end
display(obs_mags)
2×212790 Matrix{Float64}: 26.1109 29.4789 26.5488 26.5561 … 26.8878 27.9307 25.2945 27.7433 26.3463 28.5098 26.6388 26.6696 26.2649 27.178 24.4521 27.1318
Now we'll compute the Hess diagram with bin_cmd
and compare with our smooth and Poisson-sampled versions.
# Total number of stars in the observed CMD space
# Count stars whose F150W magnitude is numerically smaller (brighter) than the
# faint edge of the Hess-diagram magnitude grid. The predicate form of `count`
# with a row `view` avoids allocating both the row copy made by `obs_mags[2,:]`
# and the intermediate BitVector produced by the broadcast `.<` comparison.
count(<(maximum(edges[2])), view(obs_mags, 2, :))
177026
import StarFormationHistories: bin_cmd
# Bin the mock-observed catalog into a Hess diagram on the same
# (color, magnitude) grid as the smooth model; `.weights` extracts the bin
# counts from the returned histogram as a plain Matrix.
model4 = let color = view(obs_mags, 1, :) .- view(obs_mags, 2, :),
             ymag = view(obs_mags, 2, :)
    bin_cmd(color, ymag; edges=edges).weights
end
99×99 Matrix{Float64}: 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 … 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 … 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 … 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ⋮ ⋮ ⋱ ⋮ 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 0.0 0.0 2.0 1.0 2.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 0.0 1.0 1.0 2.0 0.0 … 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0 2.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 2.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 … 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
# Three-panel comparison: (a) scatter plot of the mock-observed CMD,
# (b) its binned Hess diagram (model4), (c) the smooth model (model2).
fig,axs=plt.subplots(nrows=1,ncols=3,sharex=true,sharey=true,figsize=(15,5))
fig.subplots_adjust(hspace=0.0,wspace=0.0)
# Smooth model panel; im1's log color limits are reused for panel (b) so the
# two Hess diagrams share a common color scale.
im1=axs[3].imshow(permutedims(model2), origin="lower",
extent=(extrema(edges[1])..., extrema(edges[2])...),
aspect="auto", cmap="Greys", norm=plt.matplotlib.colors.LogNorm(vmin=2.5), rasterized=true)
axs[3].text(0.1,0.9,"c) Smooth Model",transform=axs[3].transAxes)
# Individual stars: x = F090W - F150W color, y = F150W magnitude.
axs[1].scatter(view(obs_mags,1,:) .- view(obs_mags,2,:), view(obs_mags,2,:), s=1, marker=".", c="k", alpha=0.05, rasterized=true, label="CMD-Sampled")
axs[1].text(0.1,0.9,"a) Sampled CMD",transform=axs[1].transAxes)
axs[2].imshow(permutedims(model4), origin="lower",
extent=(extrema(edges[1])..., extrema(edges[2])...),
aspect="auto", cmap="Greys", norm=plt.matplotlib.colors.LogNorm(vmin=2.5,vmax=im1.get_clim()[2]), rasterized=true, label="CMD-Sampled")
axs[2].text(0.1,0.9,"b) Sampled Hess Diagram",transform=axs[2].transAxes)
axs[1].set_xlabel(L"F090W$-$F150W")
axs[2].set_xlabel(L"F090W$-$F150W")
axs[3].set_xlabel(L"F090W$-$F150W")
axs[1].set_ylabel("F150W")
# Reverse the y-limits so brighter (smaller) magnitudes plot at the top.
axs[1].set_ylim(reverse(extrema(edges[2])))
axs[1].set_xlim(extrema(edges[1]))
fig.colorbar(im1, ax=axs, pad=0.005)
# plt.savefig("mcsample_example.pdf",bbox_inches="tight")
# fig.colorbar(im1)
PyObject <matplotlib.colorbar.Colorbar object at 0x7fc93e11ce80>
# Side-by-side Hess diagrams: smooth model (model2), Poisson-sampled
# realization (model3), and CMD-sampled realization (model4); panels 2 and 3
# reuse im1's color limits so all three share one log color scale.
fig,axs=plt.subplots(nrows=1,ncols=3,sharex=true,sharey=true,figsize=(15,5))
fig.subplots_adjust(hspace=0.0,wspace=0.0)
im1=axs[1].imshow(permutedims(model2), origin="lower",
extent=(extrema(edges[1])..., extrema(edges[2])...),
aspect="auto", cmap="Greys", norm=plt.matplotlib.colors.LogNorm(vmin=2.5), rasterized=true)
axs[1].text(0.1,0.9,"Smooth Model",transform=axs[1].transAxes)
axs[2].imshow(permutedims(model3), origin="lower",
extent=(extrema(edges[1])..., extrema(edges[2])...),
aspect="auto", cmap="Greys", norm=plt.matplotlib.colors.LogNorm(vmin=2.5,vmax=im1.get_clim()[2]), rasterized=true)
axs[2].text(0.1,0.9,"Poisson-Sampled",transform=axs[2].transAxes)
axs[3].imshow(permutedims(model4), origin="lower",
extent=(extrema(edges[1])..., extrema(edges[2])...),
aspect="auto", cmap="Greys", norm=plt.matplotlib.colors.LogNorm(vmin=2.5,vmax=im1.get_clim()[2]), rasterized=true, label="CMD-Sampled")
axs[3].text(0.1,0.9,"CMD-Sampled",transform=axs[3].transAxes)
axs[1].set_xlabel(L"F090W$-$F150W")
axs[2].set_xlabel(L"F090W$-$F150W")
axs[3].set_xlabel(L"F090W$-$F150W")
axs[1].set_ylabel("F150W")
# Reverse the y-limits so brighter magnitudes plot at the top.
axs[1].set_ylim(reverse(extrema(edges[2])))
axs[1].set_xlim(extrema(edges[1]))
fig.colorbar(im1, ax=axs, pad=0.005)
PyObject <matplotlib.colorbar.Colorbar object at 0x7fc8e4c29570>
The mock Hess diagram created by Poisson-sampling the smooth model (labelled "Poisson-Sampled") and the Hess diagram created by sampling stars via generate_stars_mass_composite
and mock observing them with model_cmd
(labelled "CMD-Sampled") are highly consistent. The Poisson-sampled method is more efficient if a Hess diagram is all that is required, but if you need individual stars (e.g., for forming a star catalog for injection into images) then you should go the CMD-sampled route.
# Choose which sampled Hess diagram to use for model fitting; choosing CMD-sampled instance here.
# (model3, the Poisson-sampled realization, could be substituted to fit that
# realization instead.)
data = model4
99×99 Matrix{Float64}: 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 … 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 … 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 … 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ⋮ ⋮ ⋱ ⋮ 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 0.0 0.0 2.0 1.0 2.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 0.0 1.0 1.0 2.0 0.0 … 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0 2.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 2.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 … 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Given that we constructed our model population with a linear [M/H] AMR, it will be optimally recovered when fitting under the same general AMR model. This can be achieved by using fit_templates_mdf
to fit the SFH simultaneously with a linear [M/H] AMR using the full template grid.
Note that in real data, the parameter σ (which sets the width of the metallicity distribution function at fixed age) can be difficult to constrain. As such, it is often preferable to fix σ rather than fitting it. This approach is supported by fit_templates_mdf
. For the purposes of this example, we will allow σ to be fit to show it can be recovered robustly in the case of synthetic data.
import StarFormationHistories: fit_templates_mdf
# Starting guesses for the age-metallicity relation parameters: slope α,
# intercept β, and MDF width σ. The trailing commented tuple is an alternative
# starting point.
mdf_α, mdf_β, mdf_σ = 0.15, -2.25, 0.2 # -0.15, -0.2, 0.2 # initial guess for the metallicity evolution parameters
# Fit the per-age-bin stellar-mass coefficients and (α, β, σ) simultaneously.
# x0 concatenates a default SFH starting guess from construct_x0_mdf (total
# normalization 1e4) with the AMR parameter guess.
mdf_result = fit_templates_mdf(templates,
data,
template_logAge, template_MH, T_max;
x0=vcat(construct_x0_mdf(template_logAge, T_max; normalize_value=1e4),
[mdf_α,mdf_β,mdf_σ]))
(map = StarFormationHistories.LogTransformMDFResult{Vector{Float64}, Vector{Float64}, Matrix{Float64}, Optim.MultivariateOptimizationResults{Optim.BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Nothing, Nothing, Optim.Flat}, Vector{Float64}, Float64, Float64, Vector{Optim.OptimizationState{Float64, Optim.BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Nothing, Nothing, Optim.Flat}}}, Bool, @NamedTuple{f_limit_reached::Bool, g_limit_reached::Bool, h_limit_reached::Bool, time_limit::Bool, callback::Bool, f_increased::Bool}}}([3.508418671731614, 1.3456027739328649, 1.0069690221762135, 0.8179572713196054, 0.8114159857527945, 0.973616395070588, 0.9800666895329524, 1.1068831105088601, 1.627718479821283, 1.4029916662040582 … 817.2169374177233, 388.2087514806638, 753.3490647864803, 842.4088046843362, 935.6366391060493, 1028.8649686430667, 791.3828033459278, 0.10138399732854143, -1.8854585348214077, 0.11351451057635713], [1.3754429895196163, 1.2228210488351765, 0.9616598534288513, 0.7906072043590903, 0.7862199738985534, 0.9370379361996616, 0.9462492094071742, 1.0602850296711588, 1.4571030890531307, 1.3102595018934462 … 295.55596786021795, 207.2441954647195, 124.5760310436129, 125.73288926876639, 98.05987003977926, 70.41750852759125, 45.00515335523419, 0.0005095772160533307, 0.005645224478769559, 0.0024158616855296244], [0.1536959277902389 -0.2012479026374323 … -9.066285364257077e-6 -0.0004719808093076918; -0.2012479026374323 0.8258326784938467 … -4.780771299072974e-5 0.0007565413117440076; … ; -9.066285364257078e-6 -4.780771299072917e-5 … 3.186855941569903e-5 -2.2238703663132454e-5; -0.00047198080930769675 0.0007565413117440026 … -2.223870366296521e-5 0.000452940724361891], * Status: success (objective increased between iterations) * Candidate solution Final objective value: 1.256451e+03 * Found with Algorithm: BFGS * Convergence measures |x - x'| = 3.96e-11 ≰ 0.0e+00 |x - 
x'|/|x'| = 5.70e-12 ≰ 0.0e+00 |f(x) - f(x')| = 9.09e-13 ≰ 0.0e+00 |f(x) - f(x')|/|f(x')| = 7.24e-16 ≰ 0.0e+00 |g(x)| = 1.58e-09 ≤ 1.0e-08 * Work counters Seconds run: 3 (vs limit Inf) Iterations: 118 f(x) calls: 305 ∇f(x) calls: 305 ), mle = StarFormationHistories.LogTransformMDFResult{Vector{Float64}, Vector{Float64}, Matrix{Float64}, Optim.MultivariateOptimizationResults{Optim.BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Nothing, Nothing, Optim.Flat}, Vector{Float64}, Float64, Float64, Vector{Optim.OptimizationState{Float64, Optim.BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Nothing, Nothing, Optim.Flat}}}, Bool, @NamedTuple{f_limit_reached::Bool, g_limit_reached::Bool, h_limit_reached::Bool, time_limit::Bool, callback::Bool, f_increased::Bool}}}([6.49197913392421, 1.8831726965613177e-53, 5.833991756479867e-65, 4.614386099244847e-130, 2.3648727945422765e-132, 8.116237772297423e-46, 3.1435111543119115e-62, 4.3629488093216e-52, 8.107472575500926, 3.070088997553774e-16 … 1184.4005081825965, 157.8013550908007, 805.5696616625887, 831.6850796475286, 945.4230008642562, 1030.642043911546, 791.5889939853312, 0.10133019711012366, -1.8851099765568422, 0.11354675283382741], [1.122498593522727, 5.988005839522135e-44, 2.2433129535701402e-55, 3.580065792978467e-120, 1.8672429196194105e-122, 2.1783633503530182e-36, 1.155892176672584e-52, 1.3466859798606693e-42, 2.787732290771115, 2.925182907506664e-7 … 260.670443157142, 220.00561257537694, 127.51721367664335, 124.55539364072035, 99.22395505960108, 72.09801891726873, 46.64585469972445, 0.0005287507371339599, 0.005844070233860732, 0.002494864950187422], [0.029896295704035975 -1.707847037432542e7 … -1.5143354898452554e-5 -9.580741522022524e-5; -1.7078470374325424e7 1.0110767706793347e19 … 32194.9316844353 -1.0999125634158703e6; … ; -1.5143354898321254e-5 32194.931684435323 … 3.4153156898297024e-5 -2.669565308056047e-5; 
-9.580741522007874e-5 -1.0999125634158712e6 … -2.6695653080819717e-5 0.0004827748589226024], * Status: success (objective increased between iterations) * Candidate solution Final objective value: 1.461988e+03 * Found with Algorithm: BFGS * Convergence measures |x - x'| = 6.93e+00 ≰ 0.0e+00 |x - x'|/|x'| = 2.29e-02 ≰ 0.0e+00 |f(x) - f(x')| = 1.14e-12 ≰ 0.0e+00 |f(x) - f(x')|/|f(x')| = 7.78e-16 ≰ 0.0e+00 |g(x)| = 6.03e-09 ≤ 1.0e-08 * Work counters Seconds run: 4 (vs limit Inf) Iterations: 138 f(x) calls: 402 ∇f(x) calls: 402 ))
We will now calculate the per-template coefficients using calculate_coeffs_mdf
, which we can use as input for other methods like calculate_cum_sfr
, a method that computes cumulative SFHs, SFRs, and <[M/H]> from the fitted coefficients.
# Expand the fitted per-age-bin parameters (MAP point estimate) into
# per-template coefficients under the fitted AMR.
mdf_coeffs = calculate_coeffs_mdf(mdf_result.map.μ, template_logAge, template_MH, T_max)
# Undo the uniform-mass template normalization so the coefficients are in
# physical stellar-mass units.
mdf_coeffs *= template_norm
# Cumulative SFH, SFR, and mean [M/H] on the logAge grid; the first return
# value (the age grid itself) is discarded.
_, mdf_cum_sfr_arr, mdf_sfr_arr, mdf_mean_mh_arr =
calculate_cum_sfr(mdf_coeffs, template_logAge, template_MH, T_max)
([6.6, 6.65, 6.7, 6.75, 6.8, 6.85, 6.9, 6.95, 7.0, 7.05 … 9.65001, 9.70001, 9.75001, 9.80001, 9.85001, 9.90001, 9.95001, 10.00001, 10.05001, 10.10001], [0.9999999999999994, 0.9996524145352975, 0.9995191032035429, 0.9994193409379006, 0.9993383044126568, 0.9992579159445568, 0.9991614579847788, 0.9990643609825126, 0.9989547000451117, 0.9987934389935812 … 0.6695482940208246, 0.6387911496512169, 0.5822108017625277, 0.5505489034446404, 0.4695857244365759, 0.4311251722875444, 0.3564895004173606, 0.2730305169714941, 0.18033528119014633, 0.07840374402133701], [0.007222472459590545, 0.0024688309578686693, 0.001646609403293299, 0.0011920791672050172, 0.0010539452244585931, 0.0011270997653305918, 0.001011183814583176, 0.0010178323302413113, 0.0013339936890145445, 0.0010247774780065195 … 0.0005695873018942864, 0.0009338559732956879, 0.00046574825610259284, 0.0010614556179324925, 0.00044939662074661515, 0.0007772500064307474, 0.0007746176318873708, 0.0007667816322348169, 0.0007514894036579741, 0.0007126647098469802], [-0.49690203568706076, -0.4969512857232596, -0.497006545172198, -0.4970685472929845, -0.49713811481583403, -0.49721617085918307, -0.4973037511788922, -0.4974020179120746, -0.49751227499792156, -0.497635985480146 … -0.9493731319162152, -1.0046350541048052, -1.0666397133606857, -1.1362062058616447, -1.2142572134555971, -1.3018353690777895, -1.4001048520639439, -1.5103616781449871, -1.6340683076470275, -1.7727467887843222])
Of additional note are the uncertainty estimates, available as mdf_result.map.σ
and mdf_result.mle.σ
. These are essentially standard errors derived from the diagonal of the covariance matrix of the fitting parameters (discussed more below). The mdf_result.map
result contains the maximum a posteriori result and is often very comparable to the mode of posterior samples obtained via Hamiltonian Monte Carlo (provided by hmc_sample_mdf
). We don't generally recommend the use of mdf_result.mle.σ
, but provide it for completeness.
import Measurements: uncertainty, ±
# Attach the standard-error estimates to the MAP values for compact display;
# the last three entries are (α, β, σ).
mdf_result.map.μ .± mdf_result.map.σ
74-element Vector{Measurements.Measurement{Float64}}: 3.5 ± 1.4 1.3 ± 1.2 1.01 ± 0.96 0.82 ± 0.79 0.81 ± 0.79 0.97 ± 0.94 0.98 ± 0.95 1.1 ± 1.1 1.6 ± 1.5 1.4 ± 1.3 1.1 ± 1.0 1.3 ± 1.2 1.4 ± 1.3 ⋮ 570.0 ± 230.0 320.0 ± 240.0 820.0 ± 300.0 390.0 ± 210.0 750.0 ± 120.0 840.0 ± 130.0 936.0 ± 98.0 1029.0 ± 70.0 791.0 ± 45.0 0.10138 ± 0.00051 -1.8855 ± 0.0056 0.1135 ± 0.0024
Another thing we can do is draw samples from an approximation of the posterior, modeling it as a multivariate Gaussian with the MAP estimate as the mean and the inverse Hessian as the variance-covariance matrix. This is preferred to using the standard error uncertainties in mdf_result.map.σ
as samples will include covariances. There are additional notes in the documentation on this approximation, but this sampling method often reproduces results from more expensive Monte Carlo sampling methods fairly well. In short, we are approximating the posterior (in the transformed fitting variables θ = log(coeffs)
) as a multivariate Gaussian in the region of the MAP estimate. For a multivariate Gaussian, the inverse of the Hessian matrix (which the BFGS algorithm from Optim.jl computes for us) is an estimator for the variance-covariance matrix. In concert with the MAP estimate for the parameters, this gives us both the means μ and the variance-covariance matrix Σ needed to define our multivariate Gaussian approximation of the posterior, which we can then sample from using MvNormal
from Distributions.jl.
import Statistics: mean, median, quantile
# Draw 10000 posterior samples from the multivariate-Gaussian approximation
# (MAP estimate as mean, BFGS inverse Hessian as covariance). As the commented
# code below illustrates, `rand(mdf_result.map, N)` also inverts the
# logarithmic fitting transformations on the coefficients and σ.
# # This is effectively what rand(mdf_result.map, 10000) is doing.
# mdf_dist = MvNormal(Optim.minimizer(mdf_result.map.result),
# Hermitian(Optim.trace(mdf_result.map.result)[end].metadata["~inv(H)"]))
# mdf_sample = rand(mdf_dist, 10000)
# # Transform the variables, noting that the stellar mass coefficients `mdf_sample[begin:end-3,:]`
# # and σ `mdf_sample[end,:]` are fit with logarithmic transformations so we have to transform them back.
# # α and β are optimized directly, without a transformation.
# @views mdf_sample[begin:end-3,:] .= exp.(mdf_sample[begin:end-3,:]) .* template_norm
# @views mdf_sample[end,:] .= exp.(mdf_sample[end,:])
mdf_sample = rand(mdf_result.map, 10000)
@views mdf_sample[begin:end-3,:] *= template_norm # Correct the stellar mass coefficients for the template normalization
# Propagate every posterior draw through the AMR model to obtain cumulative
# SFH, SFR, and mean-[M/H] curves; one column per draw in the final matrices,
# from which the 1-σ ranges are computed below.
cum_sfh_cols = Vector{Float64}[]
sfr_cols = Vector{Float64}[]
mh_cols = Vector{Float64}[]
for draw in eachcol(mdf_sample)
    draw_coeffs = calculate_coeffs_mdf(draw, template_logAge, template_MH, T_max)
    _, cum_i, sfr_i, mh_i = calculate_cum_sfr(draw_coeffs, template_logAge, template_MH, T_max)
    push!(cum_sfh_cols, cum_i)
    push!(sfr_cols, sfr_i)
    push!(mh_cols, mh_i)
end
# Stack the per-draw curves column-wise.
mdf_cum_sfr = reduce(hcat, cum_sfh_cols)
mdf_sfr = reduce(hcat, sfr_cols)
mdf_mh = reduce(hcat, mh_cols)
# Per-row (i.e. per-time-bin) 16th percentile, median, and 84th percentile
# across the posterior draws, giving a 1-σ credible band for each curve.
rowquantile(m, q) = [quantile(r, q) for r in eachrow(m)]
rowmedian(m) = [median(r) for r in eachrow(m)]
mdf_cum_lower = rowquantile(mdf_cum_sfr, 0.16)
mdf_cum_med = rowmedian(mdf_cum_sfr)
mdf_cum_upper = rowquantile(mdf_cum_sfr, 0.84)
mdf_sfr_lower = rowquantile(mdf_sfr, 0.16)
mdf_sfr_med = rowmedian(mdf_sfr)
mdf_sfr_upper = rowquantile(mdf_sfr, 0.84)
mdf_mh_lower = rowquantile(mdf_mh, 0.16)
mdf_mh_med = rowmedian(mdf_mh)
mdf_mh_upper = rowquantile(mdf_mh, 0.84);
We can look at the sampled values for the AMR slope α, intercept β, and MDF width σ:
# (α, β, σ) are the true input AMR parameters -- presumably defined earlier in
# this example where the mock population was constructed; confirm upstream.
println((α, β, σ))
# The last three rows of the posterior sample hold the fitted (α, β, σ) draws.
mdf_sample[end-2:end,:]
(0.1, -1.87, 0.1)
3×10000 Matrix{Float64}: 0.101585 0.101617 0.102398 … 0.100629 0.102003 0.102155 -1.88572 -1.88779 -1.89482 -1.87759 -1.89185 -1.89197 0.111513 0.11937 0.112699 0.114819 0.109386 0.111424
We can look at the marginal distributions of the metallicity variables using the "corner" python package:
@pyimport corner # Easy marginal distributions
fig=plt.figure(figsize=(7,7))
# fig.suptitle("Linear M/H")
# Corner plot of the AMR parameter draws (α, β, σ) with 16/50/84th-percentile
# markers on the 1D marginals.
corner.corner( permutedims(@view mdf_sample[end-2:end,:]),
fig=fig,
labels=[L"\alpha", L"\beta", L"\sigma"],
quantiles=[0.16,0.5,0.84],
max_n_ticks=4,
show_titles=false,
title_kwargs=Dict("fontsize"=>17),
label_kwargs=Dict("fontsize"=>25))
fig.subplots_adjust(hspace=0.0, wspace=0.0)
We see that the inverse Hessian has captured some covariance between the parameters. Now we'll plot the cumulative SFH and metallicity evolution.
# Now plot cumulative SFH and MH evolution for the metallicity evolution model
# Left panel: recovered (MAP) vs. input cumulative SFH with the 1-σ sample
# band. Right panel: recovered vs. input mean-[M/H] evolution. Curves are
# extended to lookback time T_max with a terminal value (0 for cumulative SF,
# the fitted/true intercept β for [M/H]).
fig,axs=plt.subplots(nrows=1,ncols=2,sharex=true,sharey=false,figsize=(10,5))
fig.subplots_adjust(hspace=0.0,wspace=0.35)
axs[1].plot( vcat(exp10.(unique_template_logAge)./1e9, T_max), vcat(mdf_cum_sfr_arr,0.0), label="Result" ) # This is MAP result
# axs[1].plot( exp10.(unique_template_logAge)./1e9, mdf_cum_med, label="Result" ) # This is median of samples
axs[1].plot( vcat(exp10.(unique_template_logAge)./1e9, T_max), vcat(cum_sfr_arr,0.0), label="Input SFH", ls="--" )
# 1-σ band from the posterior-sample quantiles.
axs[1].fill_between( vcat(exp10.(unique_template_logAge)./1e9, T_max), vcat(mdf_cum_lower,0.0), vcat(mdf_cum_upper,0.0), alpha=0.3, fc="k")
# x-axis runs from oldest (T_max) at left to present day (0) at right.
axs[1].set_xlim([T_max,0.0])
axs[1].set_ylim([0.0,1.05])
axs[1].set_xlabel("Lookback Time [Gyr]")
axs[1].set_ylabel("Cumulative SF")
axs[1].legend()
# μ[end-1] is the fitted AMR intercept β, used as the [M/H] value at T_max.
axs[2].plot( vcat( exp10.(unique_template_logAge)./1e9, T_max), vcat(mdf_mh_med, mdf_result.map.μ[end-1]), label="Result" )
axs[2].fill_between( vcat(exp10.(unique_template_logAge)./1e9, T_max), vcat(mdf_mh_lower, mdf_result.map.μ[end-1]), vcat(mdf_mh_upper, mdf_result.map.μ[end-1]), alpha=0.3, fc="k")
axs[2].plot( vcat(exp10.(unique_template_logAge)./1e9, T_max), vcat(mean_mh_arr, β), label="Input AMR", ls="--" )
axs[2].set_xlabel("Lookback Time [Gyr]")
axs[2].set_ylabel(L"$\langle$[M/H]$\rangle$")
axs[2].legend()
# plt.savefig("example_cumsfh.pdf", bbox_inches="tight")
PyObject <matplotlib.legend.Legend object at 0x7fc8bb924340>
The above plots show that we are able to measure the cumulative SFH and age-metallicity relation simultaneously and that they are statistically consistent with the input model. We will now look at the SFRs in each log(Age) bin.
[ vcat(exp10.(unique_template_logAge)./1e9, T_max) 1.0.-vcat(cum_sfr_arr,0.0) ]
72×2 Matrix{Float64}: 0.00398107 0.0 0.00446684 3.54675e-5 0.00501187 7.52628e-5 0.00562341 0.000119914 0.00630957 0.000170013 0.00707946 0.000226225 0.00794328 0.000289297 0.00891251 0.000360064 0.01 0.000439466 0.0112202 0.000528556 0.0125893 0.000628517 0.0141254 0.000740675 0.0158489 0.000866519 ⋮ 3.98116 0.29039 4.46694 0.325858 5.01199 0.365654 5.62354 0.410306 6.30972 0.460407 7.07962 0.51662 7.94347 0.579693 8.91271 0.650462 10.0002 0.729865 11.2204 0.818958 12.5895 0.918921 13.7 1.0
# Construct an interpolator that, given a fraction of total stellar mass `x`, will find the lookback time `t`
# at which the fraction of stellar mass formed *more recently* than `t` is `x`.
import Interpolations: extrapolate, interpolate, Gridded, Linear, Flat
# recent_sfh_interp = extrapolate(interpolate((vcat(exp10.(unique_template_logAge)./1e9, T_max),), 1 .- vcat(cum_sfr_arr,0.0), Gridded(Linear())), Flat())
# Knots are the recent-mass-fraction values (increasing, since the cumulative
# SFH decreases with index); values are lookback times [Gyr]. Flat()
# extrapolation clamps queries outside the knot range to the end values.
sfh_quantile = extrapolate(interpolate((1 .- vcat(cum_sfr_arr,0.0),), vcat(exp10.(unique_template_logAge)./1e9, T_max), Gridded(Linear())), Flat());
# log10 of the lookback time [yr] at which 0.1% of the mass had formed.
log10(sfh_quantile(0.001)) + 9
7.247410788638328
# Now plot the SFRs for the metallicity evolution model
fig,ax1 = plt.subplots(figsize=(7,7))
ax1.tick_params(axis="both", which="both", labelsize=16)
# ax1.bar(unique_template_logAge, mdf_sfr_med .* 1e3; width=diff(vcat(unique_template_logAge,max_logAge)), align="edge",
# Bar heights are the MAP SFRs (scaled to 10^-3 M_sun/yr); the commented line
# above would plot the sample medians instead.
# NOTE(review): yerr is centered on the sample *median* while the bars show the
# MAP values, so error bars may not be symmetric about the bar tops -- confirm
# this mixture is intentional.
ax1.bar(unique_template_logAge, mdf_sfr_arr .* 1e3; width=diff(vcat(unique_template_logAge,max_logAge)), align="edge",
yerr = [(mdf_sfr_med .- mdf_sfr_lower) .* 1e3,
(mdf_sfr_upper .- mdf_sfr_med) .* 1e3],
capsize=3, error_kw=Dict("elinewidth"=>1,"capthick"=>1), label="Result")
ax1.bar(unique_template_logAge, sfr_arr .* 1e3; width=diff(vcat(unique_template_logAge,max_logAge)), align="edge", label="Input SFRs", alpha=0.5)
ax1.set_xlabel("log(Age [yr])", fontsize=20)
ax1.set_ylabel(L"SFR [$10^{-3}$ M$_\odot$ / yr]", fontsize=20)
ax1.set_ylim([0.0, 9.0]) # ax1.set_ylim([0.0, ax1.get_ylim()[2]])
# Draw vertical lines to denote times at which `x` fraction of stellar mass was formed more recently
for q in ("0.01", "0.001")
# Convert the mass-fraction quantile (lookback time in Gyr) to log10(age [yr]).
laq = log10(sfh_quantile(parse(Float64, q))) + 9
println(laq)
ax1.axvline(laq, ls="--", c="red")
# ax1.text( log10(sfh_quantile(qn)) + 9, ax1.get_ylim()[2] - 1, L"$\leftarrow$ \n $ %$q \text{M}_*$", transform=ax1.transData, ha="right", va="top")
# Annotate each line with the mass fraction, boxed and offset to the left.
t = ax1.text(laq - 0.02, ax1.get_ylim()[2] - 0.5, "\$\\leftarrow\$ \n \$ $q \\, \\text{M}_*\$", transform=ax1.transData, ha="right", va="top", fontsize=20)
t.set_bbox(Dict("facecolor"=>"red", "alpha"=>0.25, "edgecolor"=>"red", "lw"=>2, "boxstyle"=>"round"))
end
8.149038152771372 7.247410788638328
PyObject <matplotlib.legend.Legend object at 0x7fc8bb711660>
It is important to note that there can be strong covariances between the SFRs (or stellar mass coefficients) of neighboring time bins, as their templates can be very similar. We can calculate the correlation matrix of our posterior samples to examine these correlations.
import Statistics: cor
# Correlation matrix of the five youngest stellar-mass coefficients across the
# posterior sample; dims=2 treats rows as variables and columns as
# observations. @view avoids copying the 5-row slice before the reduction.
cor(@view(mdf_sample[1:5,:]); dims=2)
5×5 Matrix{Float64}: 1.0 -0.39021 -0.204723 -0.106157 -0.0689796 -0.39021 1.0 -0.0561793 -0.0479354 -0.0201329 -0.204723 -0.0561793 1.0 -0.0555755 -0.0372933 -0.106157 -0.0479354 -0.0555755 1.0 -0.0290286 -0.0689796 -0.0201329 -0.0372933 -0.0290286 1.0
# Young stellar pops show minimal correlation
# Visualize the correlation matrix of the ten youngest coefficient rows.
plt.imshow(cor(mdf_sample[1:10,:]; dims=2))
plt.colorbar()
PyObject <matplotlib.colorbar.Colorbar object at 0x7fc8bb5aa3b0>
# Old stellar pops show greater correlations
# Same visualization for the oldest coefficient rows, excluding the trailing
# three rows (α, β, σ); clim pins the color scale to the full [-1, 1] range.
plt.imshow(cor(mdf_sample[end-20:end-3,:]; dims=2), clim=(-1,1))
plt.colorbar()
PyObject <matplotlib.colorbar.Colorbar object at 0x7fc8bb663160>
It can also be instructive to look at corner plots of the samples to examine their covariances:
# Corner plot of a subset of old-age stellar-mass coefficients (scaled by 1e3)
# to visualize covariances between neighboring time bins.
fig=plt.figure(figsize=(7,7))
corner.corner( permutedims(@view mdf_sample[end-20:end-10,:]) .* 1e3,
# corner.corner( permutedims(@view mdf_sfr_sample[1:5,:]) .* 1e3,
fig=fig,
quantiles=[0.16,0.5,0.84],
max_n_ticks=0, # turn ticks off, illustrating shape only
show_titles=false,
title_kwargs=Dict("fontsize"=>17),
label_kwargs=Dict("fontsize"=>25))
fig.subplots_adjust(hspace=0.0, wspace=0.0)
We can also look at the overall mass-weighted metallicity distribution function that results from the combination of our star formation history and our age-metallicity relation.
# Make integrated MDF plot
import StarFormationHistories: mdf_amr
# Mass-weighted metallicity distribution function implied by the fitted
# coefficients on the template [M/H] grid.
MDF_x, MDF = mdf_amr(mdf_coeffs, template_logAge, template_MH)
fig, ax1 = plt.subplots()
# NOTE(review): the bar width is negative (bars extend left of each MDF_x with
# align="edge") and is taken from |MDF_x[2] - MDF_x[3]| rather than the first
# pair -- this assumes uniform [M/H] spacing; confirm intentional.
ax1.bar(MDF_x, MDF, -abs(MDF_x[2] - MDF_x[3]), align="edge", linewidth=1, edgecolor="k")
ax1.set_xlabel("[M/H]")
ax1.set_ylabel("Probability")
PyObject Text(27.219312263257578, 0.5, 'Probability')
Overall the resulting fit is very consistent with the true values. We do, however, see that the SFRs at recent times (small log(Age)) are slightly higher than expected. This is in large part due to our grid being equally-spaced in log(Age) with a very fine time spacing ($\Delta$log(Age)$=0.05$). As such, the most recent time bins contain very little stellar mass. The total stellar mass for all bins with logAge < 7
is
# Equivalent to sum(mdf_coeffs[template_logAge .<= maximum(unique_template_logAge[1:5])])
# sum(mdf_result.map.μ[1:5] .* template_norm)
# Total fitted stellar mass (physical units) in all bins younger than logAge = 7.
# NOTE(review): this uses `unique_logAge` whereas the surrounding code uses
# `unique_template_logAge`; presumably both hold the same grid -- confirm
# against where `unique_logAge` is defined.
sum(mdf_result.map.μ[findall(<(7), unique_logAge)] .* template_norm)
10550.929920025494
which is of order 0.1% of the stellar mass of the system.
sum(mdf_result.map.μ[findall(<(7), unique_logAge)] .* template_norm) / stellar_mass * 100
0.10550929920025494
It can sometimes be convenient to normalize the templates to uniform SFR, rather than uniform stellar mass per template. Below we illustrate this gives the same SFH result as above, where we used templates normalized to uniform stellar mass per template.
# Compute the time span [yr] covered by each logAge bin.
template_δt = diff(vcat(exp10.(unique_template_logAge), T_max * 1e9)) # exp10(max_logAge)))
sfr_norm = 1e-3 # Star formation rate [M_sun / yr] to normalize templates to
# Rescale every template from uniform stellar mass (`template_norm`) to
# uniform SFR. Since SFR = stellar mass / δt, the multiplicative factor is
# A = sfr_norm * δt / template_norm, constant for all templates sharing an age.
scale = Dict(age => sfr_norm * δt / template_norm
             for (age, δt) in zip(unique_template_logAge, template_δt))
sfr_templates = deepcopy(templates)
for (tmpl, age) in zip(sfr_templates, template_logAge)
    tmpl .*= scale[age]
end
# And now fit
# Same simultaneous SFH + AMR fit as before but with SFR-normalized templates;
# the coefficient starting guess is simply 1 for every age bin.
mdf_sfr_result = fit_templates_mdf(sfr_templates, data, template_logAge, template_MH, T_max;
x0=vcat(fill(1,length(unique_template_logAge)),
[mdf_α, mdf_β, mdf_σ]))
(map = StarFormationHistories.LogTransformMDFResult{Vector{Float64}, Vector{Float64}, Matrix{Float64}, Optim.MultivariateOptimizationResults{Optim.BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Nothing, Nothing, Optim.Flat}, Vector{Float64}, Float64, Float64, Vector{Optim.OptimizationState{Float64, Optim.BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Nothing, Nothing, Optim.Flat}}}, Bool, @NamedTuple{f_limit_reached::Bool, g_limit_reached::Bool, h_limit_reached::Bool, time_limit::Bool, callback::Bool, f_increased::Bool}}}([7.222472459574194, 2.468830957881689, 1.6466094032999312, 1.192079167196366, 1.0539452244539713, 1.127099765328633, 1.0111838145831034, 1.0178323302411438, 1.3339936890048765, 1.0247774780082528 … 1.0614556179282688, 0.44939662075002235, 0.7772500064301425, 0.774617631886647, 0.7667816322341894, 0.7514894036582785, 0.7126647098470569, 0.10138399732854053, -1.8854585348213766, 0.11351451057638239], [2.8250481623609063, 2.2619095688462068, 1.5435897165692978, 1.14218857311559, 1.0150312596409026, 1.0784499093364994, 0.9779813628585388, 0.9784190675836162, 1.1961202024155706, 0.9597032665257027 … 0.38514708924629004, 0.24079096918352622, 0.1278396776116526, 0.11757235160627454, 0.08189279732559253, 0.05278488582806115, 0.03817665968940224, 0.0005007502358561787, 0.005460049291052889, 0.0024818534810153773], [0.15299594606269112 -0.2063178963463079 … 1.5489043614109188e-5 -0.0001867647501861481; -0.20631789634630773 0.8393976838211493 … -6.4449415922932055e-6 -0.00015645029359665825; … ; 1.5489043614110428e-5 -6.444941592291868e-6 … 2.981213826072716e-5 -2.049108913639282e-5; -0.00018676475018614426 -0.0001564502935966578 … -2.0491089136315077e-5 0.00047802379534603935], * Status: success * Candidate solution Final objective value: 1.490959e+03 * Found with Algorithm: BFGS * Convergence measures |x - x'| = 5.79e-11 ≰ 0.0e+00 |x - x'|/|x'| = 2.53e-11 ≰ 
0.0e+00 |f(x) - f(x')| = 4.55e-13 ≰ 0.0e+00 |f(x) - f(x')|/|f(x')| = 3.05e-16 ≰ 0.0e+00 |g(x)| = 6.03e-09 ≤ 1.0e-08 * Work counters Seconds run: 3 (vs limit Inf) Iterations: 122 f(x) calls: 323 ∇f(x) calls: 323 ), mle = StarFormationHistories.LogTransformMDFResult{Vector{Float64}, Vector{Float64}, Matrix{Float64}, Optim.MultivariateOptimizationResults{Optim.BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Nothing, Nothing, Optim.Flat}, Vector{Float64}, Float64, Float64, Vector{Optim.OptimizationState{Float64, Optim.BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Nothing, Nothing, Optim.Flat}}}, Bool, @NamedTuple{f_limit_reached::Bool, g_limit_reached::Bool, h_limit_reached::Bool, time_limit::Bool, callback::Bool, f_increased::Bool}}}([13.364465558456837, 3.4552713108476626e-53, 9.540288048155317e-65, 6.7256021917873845e-130, 3.072032363021138e-132, 9.396022222634695e-46, 3.243469016347785e-62, 4.012098716579499e-52, 6.644464250823199, 2.242490967704015e-16 … 1.5383780190178578, 0.18267335668391074, 0.8311273670793279, 0.7647568772907527, 0.7748018423558825, 0.7527873905410589, 0.7128503908999005, 0.10133019711012366, -1.8851099765568422, 0.11354675283382741], [2.3107897703715286, 1.0986791195486064e-43, 3.668446507640307e-55, 5.218009960507355e-120, 2.425579273399506e-122, 2.5218325476097392e-36, 1.1926382189066543e-52, 1.2383815748568264e-42, 2.2846836249424176, 2.1366304548057905e-7 … 0.338576276793382, 0.2546819908440283, 0.13156283687737386, 0.11453205081852054, 0.08131693455278355, 0.052660846596762956, 0.04200603566841636, 0.000528750761340089, 0.005844070268220909, 0.002494865413033796], [0.029896316348573638 -1.7078552184837636e7 … -1.5143328440465426e-5 -9.580525865584503e-5; -1.707855218483764e7 1.011061173777915e19 … 32249.07835344585 -1.1005093891764833e6; … ; -1.5143328440362529e-5 32249.07835344584 … 3.41531572999036e-5 -2.6695652086533063e-5; 
-9.580525865591514e-5 -1.1005093891764835e6 … -2.669565208645928e-5 0.00048277503805102675], * Status: success (objective increased between iterations) * Candidate solution Final objective value: 1.461988e+03 * Found with Algorithm: BFGS * Convergence measures |x - x'| = 6.93e+00 ≰ 0.0e+00 |x - x'|/|x'| = 2.29e-02 ≰ 0.0e+00 |f(x) - f(x')| = 2.27e-13 ≰ 0.0e+00 |f(x) - f(x')|/|f(x')| = 1.56e-16 ≰ 0.0e+00 |g(x)| = 6.03e-09 ≤ 1.0e-08 * Work counters Seconds run: 4 (vs limit Inf) Iterations: 138 f(x) calls: 402 ∇f(x) calls: 402 ))
# Sample the Gaussian posterior approximation for the SFR-normalized fit and
# rescale the SFR coefficients to physical units; the commented code below
# shows what `rand(mdf_sfr_result.map, N)` does internally.
# # This is effectively what rand(mdf_sfr_result.map, 10000) is doing.
# # Generate sample from the inverse Hessian matrix
# mdf_sfr_dist = MvNormal(Optim.minimizer(mdf_sfr_result.map.result),
# Hermitian(Optim.trace(mdf_sfr_result.map.result)[end].metadata["~inv(H)"]))
# mdf_sfr_sample = rand(mdf_sfr_dist, 10000)
# # Transform the variables, noting that the SFR coefficients `mdf_sfr_sample[begin:end-3,:]`, α `mdf_sfr_sample[end-2,:]`,
# # and σ `mdf_sfr_sample[end,:]` are fit with logarithmic transformations so we have to transform them back.
# # β is optimized directly, without a transformation, as it is allowed to be negative.
# @views mdf_sfr_sample[begin:end-2,:] .= exp.(mdf_sfr_sample[begin:end-2,:]) .* sfr_norm
# @views mdf_sfr_sample[end,:] .= exp.(mdf_sfr_sample[end,:]);
mdf_sfr_sample = rand(mdf_sfr_result.map, 10000)
@views mdf_sfr_sample[begin:end-3,:] *= sfr_norm # Correct the SFR coefficients for the template normalization
71×10000 Matrix{Float64}: 0.014119 0.00498275 0.00912795 … 0.0025779 0.0115282 0.00147263 0.00546521 0.00101365 0.00156416 0.0033645 0.000838398 0.00301883 0.002098 0.00492418 0.00119213 0.000326765 0.00221172 0.00210687 0.0101222 0.000841247 0.00173096 0.00297255 0.000603937 0.00452283 0.000723277 0.00255772 0.00600708 0.00102689 … 0.00309843 0.000682239 0.00112329 0.00217053 0.00266275 0.000997668 0.00267296 0.00214535 0.0033102 0.000447465 0.00176839 0.0012357 0.000271859 0.000194402 0.000485646 0.00195109 0.000204272 0.000749874 0.00225216 0.000731541 0.00364299 0.0025928 0.000937123 0.00105515 0.00108799 … 0.000102818 0.00127776 0.00153389 0.000424915 0.000361359 0.000153056 0.000704236 0.000678687 0.000312023 0.000198507 0.000103068 0.000485186 ⋮ ⋱ 0.000351858 0.00105834 0.000642025 0.000507256 0.000871406 0.00149486 0.000588927 0.000655072 … 0.00114862 0.00047633 0.000306447 0.000574588 0.000803473 0.000564577 0.00136934 0.00147384 0.000854081 0.00104109 0.000688945 0.000881082 9.41651e-5 0.00104247 0.000207967 0.000746678 0.000235776 0.00230675 0.000871546 0.00159173 0.0010083 0.00146578 0.00017393 0.000407772 0.000290916 … 0.000483488 0.000353837 0.000862812 0.000856684 0.000749756 0.000768668 0.000687389 0.000803549 0.000729406 0.000810721 0.000707933 0.000902153 0.000778658 0.000758133 0.00082588 0.000846178 0.000693 0.000746285 0.000745703 0.000708325 0.000724871 0.000809393 0.0007027 0.000707212 0.000749937 … 0.000746109 0.000685981
# Per-bin 16th/50th/84th percentiles of the SFR coefficients across the posterior samples
mdf_rows = eachrow(@view mdf_sfr_sample[begin:end-3, :])
mdf_sfr_norm_lower = map(r -> quantile(r, 0.16), mdf_rows)
mdf_sfr_norm_med = map(median, mdf_rows)
mdf_sfr_norm_upper = map(r -> quantile(r, 0.84), mdf_rows);
# Two-panel SFR comparison: left panel uses the SFR-normalized quantiles computed above,
# right panel uses the stellar-mass-normalized quantiles from the previous section.
fig,axs = plt.subplots(nrows=1,ncols=2,sharex=true,sharey=true,figsize=(14,7))
fig.subplots_adjust(hspace=0.0,wspace=0.1)
# Bars at the left edge of each logAge bin; widths span to the next bin edge (max_logAge for the last).
axs[1].bar(unique_template_logAge, mdf_sfr_norm_med .* 1e3; width=diff(vcat(unique_template_logAge,max_logAge)), align="edge",
# axs[1].bar(unique_template_logAge, mdf_sfr_result.map.μ[begin:end-3] .* 1e3 .* sfr_norm; width=diff(vcat(unique_template_logAge,max_logAge)), align="edge",
# axs[1].bar(unique_template_logAge, mdf_sfr_result.mle.μ[begin:end-3] .* 1e3 .* sfr_norm; width=diff(vcat(unique_template_logAge,max_logAge)), align="edge",
# Asymmetric error bars from the 16th/84th percentile spread about the median.
yerr = [(mdf_sfr_norm_med .- mdf_sfr_norm_lower) .* 1e3,
(mdf_sfr_norm_upper .- mdf_sfr_norm_med) .* 1e3],
capsize=3, error_kw=Dict("elinewidth"=>1,"capthick"=>1), label="Result, SFR Normalized")
axs[1].bar(unique_template_logAge, sfr_arr .* 1e3; width=diff(vcat(unique_template_logAge,max_logAge)), align="edge", label="Input SFRs", alpha=0.5)
axs[1].set_xlabel("log(Age [yr])")
axs[1].set_ylabel(L"SFR [$10^{-3}$ M$_\odot$ / yr]")
# Clamp y-axis to start at 0 while keeping the auto-computed upper limit.
axs[1].set_ylim([0.0, axs[1].get_ylim()[2]])
axs[1].legend()
axs[2].bar(unique_template_logAge, mdf_sfr_med .* 1e3; width=diff(vcat(unique_template_logAge,max_logAge)), align="edge",
yerr = [(mdf_sfr_med .- mdf_sfr_lower) .* 1e3,
(mdf_sfr_upper .- mdf_sfr_med) .* 1e3],
capsize=3, error_kw=Dict("elinewidth"=>1,"capthick"=>1), label="Result, Stellar Mass Normalized")
axs[2].bar(unique_template_logAge, sfr_arr .* 1e3; width=diff(vcat(unique_template_logAge,max_logAge)), align="edge", label="Input SFRs", alpha=0.5)
axs[2].legend()
axs[2].set_xlabel("log(Age [yr])")
PyObject Text(0.5, 29.371800193718016, 'log(Age [yr])')
We will now explore the methods offered to fit SFHs without constraints on the AMR. These methods are most useful when dealing with simpler stellar populations (e.g., globular clusters / ultra-faint dwarf galaxies).
Again, the best way to fit this SFH would be to use fit_templates_mdf
as above, because it uses the same hierarchical model as we used to generate the SFH (fitting stellar masses per bin in logage, and a mean metallicity relation with Gaussian scatter). For example purposes we will show how to use some of the simpler fitting methods that fit per-template stellar masses here. These methods are generally most useful when you have a smaller set of templates that you want to use to fit the SFH. For this purpose, we will create a smaller grid of templates centered around the real age-metallicity relation; we will accept any isochrone within ± 0.2 dex of this metallicity evolution guess.
# Alternative guesses: new_α, new_β, new_σ = 0.15, -2.25, 0.2 # -0.15, -0.2, 0.2
# Here we adopt the true AMR parameters as the center of our metallicity guess.
new_α, new_β, new_σ = α, β, σ
# Guessed mean [M/H] as a function of log10(age [yr]); linear in lookback time.
function met_func(logAge)
    lookback = T_max - exp10(logAge) / 1e9 # Gyr since T_max
    return new_α * lookback + new_β
end
# The true input age-metallicity relation, for comparison.
function met_func_true(logAge)
    lookback = T_max - exp10(logAge) / 1e9
    return α * lookback + β
end
# Compare the guessed mean-metallicity evolution against the true input relation
# over the full logAge range of the isochrone grid.
laxplot = 6.6:0.01:10.2
fig,ax1=plt.subplots()
ax1.plot(exp10.(laxplot)./1e9, met_func.(laxplot),label="Guess",zorder=0)
ax1.plot(exp10.(laxplot)./1e9, met_func_true.(laxplot),label="Input",ls="--",zorder=1)
# Reversed x-limits so lookback time increases to the left
ax1.set_xlim([13.0,-0.1])
ax1.set_xlabel("Lookback Time [Gyr]")
ax1.set_ylabel(L"$\langle$[M/H]$\rangle$")
ax1.legend(loc="upper left")
PyObject <matplotlib.legend.Legend object at 0x7fc8b9bea4d0>
Now create the limited template set.
# Build the restricted template set: for each unique logAge, keep only the isochrones
# whose [M/H] lies within ±new_σ dex of the guessed mean-metallicity relation.
# Single-threaded with push!, easier with this example
free_templates = Vector{Matrix{Float64}}(undef,0)
free_template_logAge = Vector{Float64}(undef,0)
free_template_MH = Vector{Float64}(undef,0)
for logage in unique_logAge
    mean_met = met_func(logage) # guessed mean [M/H] at this age
    for mh in unique_MH
        # If the current mh is more than `new_σ` dex away from mean, skip template
        abs(mh - mean_met) > new_σ && continue
        # Row indices of this isochrone; drop the last point here once, rather than
        # re-slicing each extracted column, to avoid repeated intermediate allocations.
        # The last star is chopped off because it's the ~30 mag artifact that PARSEC appends.
        local good = findall( (table.MH .== mh) .& (table.logAge .== logage) )[begin:end-1]
        local m_ini = table.Mini[good]
        local iso_mags = [table.F090W[good], table.F150W[good]]
        push!(free_templates, partial_cmd_smooth( m_ini, iso_mags, [F090W_error, F150W_error], 2, [1,2], imf, [F090W_complete, F150W_complete];
                                                  dmod=distmod, normalize_value=template_norm, edges=edges).weights )
        push!(free_template_logAge, logage)
        push!(free_template_MH, mh)
    end
end
# `templates` has 1846 entries, free_templates has ~10% of that.
length(free_templates)
142
Let's start off by just obtaining the maximum likelihood estimate (MLE) via fit_templates_lbfgsb
. We'll construct the initial guess vector x0
with construct_x0
.
import StarFormationHistories: fit_templates_lbfgsb, construct_x0
# MLE fit of the per-template coefficients via L-BFGS-B; the initial guess is built
# by construct_x0 from the template ages and the adopted template normalization.
lbfgsb_result = fit_templates_lbfgsb(free_templates, data; x0=construct_x0(free_template_logAge, T_max; normalize_value=template_norm))
RUNNING THE L-BFGS-B CODE * * * Machine precision = 2.220D-16 N = 142 M = 10 * * * Tit = total number of iterations Tnf = total number of function evaluations Tnint = total number of segments explored during Cauchy searches Skip = number of BFGS updates skipped Nact = number of active bounds at final generalized Cauchy point Projg = norm of the final projected gradient F = final function value * * * N Tit Tnf Tnint Skip Nact Projg F 142 4318 4425 4827 0 43 9.301D-06 1.738D+03 CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL Total User time 1.728E+00 seconds.
(1737.5617669589449, [0.0, 6.309221240969905, 0.3774519040929603, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 … 0.0, 392.6911829473131, 1172.9042840660654, 139.87837936686887, 648.3252367693378, 300.29780211058176, 372.29568656728304, 514.5829920180571, 465.4210516330318, 325.8750298842604])
# Calculate the cumulative SFH, SFRs, <[M/H]>(logAge) from the above result
# lbfgsb_result[2] holds the fitted coefficients; rescale by template_norm to get stellar masses.
free_coeffs = lbfgsb_result[2] .* template_norm
unique_free_template_logAge, free_cum_sfr_arr, free_sfr_arr, free_mean_mh_arr = calculate_cum_sfr(free_coeffs, free_template_logAge, free_template_MH, T_max)
([6.6, 6.65, 6.7, 6.75, 6.8, 6.85, 6.9, 6.95, 7.0, 7.05 … 9.65001, 9.70001, 9.75001, 9.80001, 9.85001, 9.90001, 9.95001, 10.00001, 10.05001, 10.10001], [1.0000000000000002, 0.9993739680443234, 0.999336515410366, 0.999336515410366, 0.999336515410366, 0.999336515410366, 0.9990821198597614, 0.9990821198597614, 0.9990821198597614, 0.9986779938595978 … 0.6456013189136356, 0.6456013189136356, 0.6161223056788163, 0.5212635956705851, 0.5017426707122999, 0.4298692956046791, 0.39090454546385706, 0.2606437942905101, 0.16651675173671107, 0.07851628822506478], [0.012988237983546648, 0.0006925260292140784, 0.0, 0.0, 0.0, 0.0029679955575222135, 0.0, 0.0, 0.003337878528053956, 0.00013656437982336654 … 0.0, 0.00048579906817496435, 0.0013932240083237553, 0.00025553131347383313, 0.000838516812506095, 0.00040514980204774565, 0.00120713909004342, 0.0007774243672446133, 0.000647781729964104, 0.0007125866141560192], [-0.5000166309399301, -0.60001623283198, -0.60001623283198, -0.60001623283198, -0.60001623283198, -0.5000166309399301, -0.5000166309399301, -0.5000166309399301, -0.5000166309399301, -0.5000166309399301 … -0.8000106239270766, -1.0389870945459982, -1.042833704046062, -1.2000131046997948, -1.291372201894493, -1.2000131046997948, -1.389359303307148, -1.5683548786699604, -1.6419879799478247, -1.7588319600316495])
# Compare the L-BFGS-B fit against the input SFH: cumulative SFH (left) and
# mean-metallicity evolution (right), both as a function of lookback time.
fig,axs=plt.subplots(nrows=1,ncols=2,sharex=true,sharey=false,figsize=(10,5))
fig.subplots_adjust(hspace=0.0,wspace=0.35)
axs[1].plot( exp10.(unique_free_template_logAge)./1e9, free_cum_sfr_arr, label="LBFGSB Fit" )
axs[1].plot( exp10.(unique_template_logAge)./1e9, cum_sfr_arr, label="Input SFH", ls="--" )
# Reversed x-limits so lookback time increases to the left
axs[1].set_xlim([13.0,-0.1])
axs[1].set_ylim([0.0,1.1])
axs[1].set_xlabel("Lookback Time [Gyr]")
axs[1].set_ylabel("Cumulative SF")
axs[1].legend()
axs[2].plot( exp10.(unique_free_template_logAge)./1e9, free_mean_mh_arr, label="LBFGSB Fit" )
axs[2].plot( exp10.(unique_template_logAge)./1e9, mean_mh_arr, label="Input SFH", ls="--" )
axs[2].set_xlabel("Lookback Time [Gyr]")
axs[2].set_ylabel(L"$\langle$[M/H]$\rangle$")
PyObject Text(457.3650849792141, 0.5, '$\\langle$[M/H]$\\rangle$')
We can also use fit_templates
, which uses BFGS to build a dense estimate of the inverse Hessian. This function is slower than fit_templates_lbfgsb
above, but allows for estimation of random uncertainties.
import StarFormationHistories: fit_templates
# BFGS fit of the same templates; slower than L-BFGS-B but builds a dense
# inverse-Hessian estimate used below for uncertainty estimation.
bfgs_result = fit_templates(free_templates, data; x0=construct_x0(free_template_logAge, T_max; normalize_value=template_norm))
(map = StarFormationHistories.LogTransformFTResult{Vector{Float64}, Vector{Float64}, Matrix{Float64}, Optim.MultivariateOptimizationResults{Optim.BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Nothing, Nothing, Optim.Flat}, Vector{Float64}, Float64, Float64, Vector{Optim.OptimizationState{Float64, Optim.BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Nothing, Nothing, Optim.Flat}}}, Bool, @NamedTuple{f_limit_reached::Bool, g_limit_reached::Bool, h_limit_reached::Bool, time_limit::Bool, callback::Bool, f_increased::Bool}}}([0.6711597289310539, 2.7869798992616235, 0.6617579206189061, 0.8963766472321459, 0.49741326178208983, 0.6892953484325856, 0.4681920518786531, 0.5467529941647936, 0.5016994229523505, 0.4978142925508428 … 139.93393540620644, 203.8815721476802, 1077.116942224919, 239.99636631493289, 632.9659868554295, 333.89285764119256, 374.43089138085855, 516.1480680566715, 465.7771354059109, 328.51005657615923], [0.6398255866894301, 1.1277520383716344, 0.6306876934351933, 0.838298779219725, 0.48701329897373424, 0.6635250814782753, 0.4600502491491314, 0.5368197508581075, 0.4925265992378701, 0.48971778657192533 … 108.15321604843322, 126.88620363217886, 138.31621718821037, 142.68448609575, 69.45807213962969, 90.02224511542695, 37.90008092617126, 42.65475340429022, 39.00501498012438, 36.80885377132398], [0.9088065055460887 -0.11568958147759868 … 0.00028763446350277554 -0.0008085589057496431; -0.11568958147760107 0.1637418031594777 … -0.00046785775607621326 -0.0006903196421389435; … ; 0.0002876344635257862 -0.00046785775606924304 … 0.007012688802010941 -0.003992840437537446; -0.0008085589057456627 -0.0006903196421354701 … -0.003992840437114647 0.012554726302817983], * Status: success * Candidate solution Final objective value: 1.424759e+03 * Found with Algorithm: BFGS * Convergence measures |x - x'| = 6.17e-10 ≰ 0.0e+00 |x - x'|/|x'| = 8.83e-11 ≰ 0.0e+00 |f(x) - f(x')| = 
0.00e+00 ≤ 0.0e+00 |f(x) - f(x')|/|f(x')| = 0.00e+00 ≤ 0.0e+00 |g(x)| = 7.76e-09 ≤ 1.0e-08 * Work counters Seconds run: 0 (vs limit Inf) Iterations: 377 f(x) calls: 1093 ∇f(x) calls: 1093 ), mle = StarFormationHistories.LogTransformFTResult{Vector{Float64}, Vector{Float64}, Matrix{Float64}, Optim.MultivariateOptimizationResults{Optim.BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Nothing, Nothing, Optim.Flat}, Vector{Float64}, Float64, Float64, Vector{Optim.OptimizationState{Float64, Optim.BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Nothing, Nothing, Optim.Flat}}}, Bool, @NamedTuple{f_limit_reached::Bool, g_limit_reached::Bool, h_limit_reached::Bool, time_limit::Bool, callback::Bool, f_increased::Bool}}}([2.2689843676955086e-43, 6.3144518028626475, 0.37484712491649863, 1.917031400992052e-16, 1.7229419394882513e-46, 2.419433153293893e-9, 3.260794512432584e-54, 6.441893295746726e-36, 2.7780620869584285e-38, 7.09544917078298e-51 … 1.7755342272965846e-11, 393.50852661045354, 1173.576491563623, 138.4681278720116, 648.3035374785914, 300.95383010320063, 372.2570399654038, 514.6704462783881, 465.32304717158456, 325.8090482151131], [2.01014420467909e-37, 1.6480564060429816, 2.357563990956066, 6.244273141105193e-11, 1.628056956909957e-40, 0.0004170367984826, 3.607313296680295e-48, 4.6756867028343156e-30, 2.1501983331866163e-32, 7.361900346321837e-45 … 4.845169729781572e-6, 261.5289522140592, 164.35299943798051, 230.5386769935891, 72.44450426176901, 94.61175663275384, 40.87886225135171, 45.27283955194375, 38.93380670968161, 36.44965668179279], [7.848585988452405e11 587.1697507411828 … 44.6542929316753 47.677106455954416; 587.1697507410009 0.06811971746062678 … -0.00016488119018180365 -0.00043672157797538434; … ; 44.65429293172184 -0.00016488119018174884 … 0.007000750661122181 -0.0038622766169505995; 47.6771064559804 -0.0004367215779751659 … -0.0038622766169505653 
0.012515857269240947], * Status: success (objective increased between iterations) * Candidate solution Final objective value: 1.737581e+03 * Found with Algorithm: BFGS * Convergence measures |x - x'| = 2.75e-02 ≰ 0.0e+00 |x - x'|/|x'| = 1.83e-04 ≰ 0.0e+00 |f(x) - f(x')| = 4.55e-13 ≰ 0.0e+00 |f(x) - f(x')|/|f(x')| = 2.62e-16 ≰ 0.0e+00 |g(x)| = 3.60e-09 ≤ 1.0e-08 * Work counters Seconds run: 0 (vs limit Inf) Iterations: 219 f(x) calls: 683 ∇f(x) calls: 683 ))
bfgs_result.mle
contains information for the maximum likelihood estimate; this is most similar to what is returned by fit_templates_lbfgsb
but not exactly equal due to differences in the internal convergence criteria of the fitting methods.
# Element-wise difference between the BFGS MLE coefficients and the L-BFGS-B coefficients.
bfgs_result.mle.μ .- lbfgsb_result[2]
142-element Vector{Float64}: 2.2689843676955086e-43 0.00523056189274218 -0.00260477917646168 1.917031400992052e-16 1.7229419394882513e-46 2.419433153293893e-9 3.260794512432584e-54 6.441893295746726e-36 2.7780620869584285e-38 7.09544917078298e-51 5.395976462961892e-41 0.0012950217413494336 1.1177945431752153e-63 ⋮ 0.30551439383293655 -1.2308142769772132 1.7755342272965846e-11 0.8173436631404343 0.6722074975575651 -1.4102514948572775 -0.021699290746482802 0.6560279926188741 -0.038646601879236187 0.08745426033101467 -0.09800446144726038 -0.06598166914733383
Of additional note are the uncertainty estimates, available as bfgs_result.map.σ
and bfgs_result.mle.σ
. These are essentially standard errors derived from the diagonal of the covariance matrix of the fitting parameters. The bfgs_result.map
result contains the maximum a posteriori result and is often very comparable to the mode of posterior samples obtained via Hamiltonian Monte Carlo (provided by hmc_sample
) or MCMC (provided by mcmc_sample
). We don't generally recommend the use of bfgs_result.mle.σ
, but provide it for completeness.
# Display the maximum a posteriori parameter estimates.
bfgs_result.map.μ
142-element Vector{Float64}: 0.6711597289310539 2.7869798992616235 0.6617579206189061 0.8963766472321459 0.49741326178208983 0.6892953484325856 0.4681920518786531 0.5467529941647936 0.5016994229523505 0.4978142925508428 0.4975510458750742 0.637892213308068 0.45948306349106915 ⋮ 462.6826840944507 172.96276963209834 139.93393540620644 203.8815721476802 1077.116942224919 239.99636631493289 632.9659868554295 333.89285764119256 374.43089138085855 516.1480680566715 465.7771354059109 328.51005657615923
# Display the standard errors on the MAP parameters (derived from the diagonal
# of the covariance matrix, as noted above).
bfgs_result.map.σ
142-element Vector{Float64}: 0.6398255866894301 1.1277520383716344 0.6306876934351933 0.838298779219725 0.48701329897373424 0.6635250814782753 0.4600502491491314 0.5368197508581075 0.4925265992378701 0.48971778657192533 0.48985142249933283 0.6216224821768226 0.45352326044880564 ⋮ 178.2950428855977 129.1008281466361 108.15321604843322 126.88620363217886 138.31621718821037 142.68448609575 69.45807213962969 90.02224511542695 37.90008092617126 42.65475340429022 39.00501498012438 36.80885377132398
We'll now draw samples from the posterior using the same approximation technique as we did in the previous section.
# Draw 10,000 samples from the MAP result's posterior approximation and rescale
# by the template normalization to get stellar-mass coefficients.
hessian_sample = rand(bfgs_result.map, 10000) .* template_norm
# The above is effectively doing
# hessian_dist = Distributions.MvNormal(Optim.minimizer(bfgs_result.map.result),
#                                       LinearAlgebra.Hermitian(bfgs_result.map.invH))
# hessian_sample = exp.(rand(hessian_dist, 10000)) .* template_norm
# Calculate the cumulative SFH for each sample
# and find the 1-σ quantile range for both the cumulative SFH and the SFRs
hessian_cum_sfr = Vector{Vector{Float64}}(undef,0)
hessian_sfr = Vector{Vector{Float64}}(undef,0)
for x in eachcol(hessian_sample)
    # Only the cumulative SFH and SFR vectors are needed; the ages and <[M/H]> are discarded.
    _, hessian_1, hessian_2, _ = calculate_cum_sfr(x, free_template_logAge, free_template_MH, T_max)
    push!(hessian_cum_sfr, hessian_1)
    push!(hessian_sfr, hessian_2)
end
# Stack the per-sample vectors into (nbins × nsamples) matrices for row-wise statistics.
hessian_cum_sfr = reduce(hcat, hessian_cum_sfr)
hessian_sfr = reduce(hcat, hessian_sfr)
# Per-bin 1-σ credible interval (16th/84th percentiles) and median across samples
hess_cum_rows = eachrow(hessian_cum_sfr)
hess_sfr_rows = eachrow(hessian_sfr)
hessian_cum_lower = map(r -> quantile(r, 0.16), hess_cum_rows)
hessian_cum_med = map(median, hess_cum_rows)
hessian_cum_upper = map(r -> quantile(r, 0.84), hess_cum_rows)
hessian_sfr_lower = map(r -> quantile(r, 0.16), hess_sfr_rows)
hessian_sfr_med = map(median, hess_sfr_rows)
hessian_sfr_upper = map(r -> quantile(r, 0.84), hess_sfr_rows)
71-element Vector{Float64}: 0.011087344594003522 0.006959549089189999 0.004865645659841066 0.003781668854806038 0.0032689437725653557 0.003349323723534193 0.00249648632362979 0.0025462691387392873 0.0027453128516842277 0.002229191878752969 0.0017757246765453716 0.0018005384189252965 0.0017210896124459144 ⋮ 0.0011731718093661045 0.001274485831937575 0.0004984981287895128 0.0008517482604043832 0.0014894730848585404 0.0010665858680499842 0.0010824492341152312 0.000636179377288191 0.0013852804917540817 0.0008620774386505356 0.0006907097150881543 0.0007549629107942181
# Now plot cumulative SFH and MH evolution
# Left: cumulative SFH with the 1-σ band from the posterior samples; right: <[M/H]> evolution.
fig,axs=plt.subplots(nrows=1,ncols=2,sharex=true,sharey=false,figsize=(10,6))
fig.subplots_adjust(hspace=0.0,wspace=0.35)
# axs[1].plot( exp10.(unique_free_template_logAge)./1e9, cum_sfr_arr, label="Result" )
axs[1].plot( exp10.(unique_free_template_logAge)./1e9, hessian_cum_med, label="Result" )
axs[1].plot( exp10.(unique_template_logAge)./1e9, cum_sfr_arr, label="Input SFH", ls="--" )
# Shaded 16th-84th percentile band from the sampled cumulative SFHs.
axs[1].fill_between( exp10.(unique_free_template_logAge)./1e9, hessian_cum_lower, hessian_cum_upper, alpha=0.3, fc="k")
axs[1].set_xlim([13.0,-0.1])
axs[1].set_ylim([0.0,1.1])
axs[1].set_xlabel("Lookback Time [Gyr]")
axs[1].set_ylabel("Cumulative SF")
axs[1].legend()
axs[2].plot( exp10.(unique_free_template_logAge)./1e9, free_mean_mh_arr, label="Result" )
axs[2].plot( exp10.(unique_template_logAge)./1e9, mean_mh_arr, label="Input", ls="--" )
# axs[2].plot( exp10.(unique_template_logAge)./1e9, met_func.(unique_template_logAge), c="k", label="Guess",ls="--")
axs[2].set_xlabel("Lookback Time [Gyr]")
axs[2].set_ylabel(L"$\langle$[M/H]$\rangle$")
axs[2].legend()
PyObject <matplotlib.legend.Legend object at 0x7fc8bb984700>
We can alternatively use the Measurements.jl
package to estimate the uncertainties on the SFRs assuming that the bfgs_result.map.σ
are Gaussian and uncorrelated. In general we expect some correlation in the star formation rates between age bins (as shown above in the constrained metallicity case), but this approach can be useful for quick diagnostics.
import Measurements: uncertainty, ±
# Propagate bfgs_result.map.σ through calculate_cum_sfr assuming Gaussian, uncorrelated
# errors; extract the SFR element ([3]) and convert to units of 10^-3 M_sun/yr.
hessian_sfrerr = uncertainty.(calculate_cum_sfr(bfgs_result.map.μ .± bfgs_result.map.σ,
free_template_logAge, free_template_MH, T_max; normalize_value=template_norm)[3]) .* 1e3
71-element Vector{Float64}: 2.6692199225573705 1.9247389480268136 1.3458984684494002 1.030343473158807 0.9021542800518171 0.9161985142653941 0.6901358672296908 0.688549708655883 0.7590019432769353 0.6048350997836405 0.48129006079513115 0.4903793281266798 0.4778889098375988 ⋮ 0.22101589024353976 0.1761334479232895 0.13774915711373256 0.23229273530879294 0.3018139978468668 0.2997877393446496 0.254823171209302 0.17201456290374892 0.18272979172660722 0.09318314157157727 0.041676983975795695 0.04829636501758965
# SFR bar plot with Measurements.jl-propagated (uncorrelated) error bars.
fig,ax1 = plt.subplots(figsize=(7,7))
# ax1.bar(unique_free_template_logAge[begin:end-1], free_sfr_arr[begin:end-1] .* 1e3; width=diff(unique_free_template_logAge), align="edge", yerr = hessian_sfrerr[begin:end-1], capsize=3, error_kw=Dict("elinewidth"=>1,"capthick"=>1), label="LBFGS Fit")
ax1.bar(unique_free_template_logAge, hessian_sfr_med .* 1e3; width=diff(vcat(unique_template_logAge,max_logAge)), align="edge", yerr = hessian_sfrerr, capsize=3, error_kw=Dict("elinewidth"=>1,"capthick"=>1), label="BFGS Fit")
ax1.bar(unique_template_logAge, sfr_arr .* 1e3; width=diff(vcat(unique_template_logAge,max_logAge)), align="edge", label="Input SFRs", alpha=0.5)
ax1.set_xlabel("log(Age [yr])")
ax1.set_ylabel(L"SFR [$10^{-3}$ M$_\odot$ / yr]")
# Clamp y-axis to start at 0 while keeping the auto-computed upper limit.
ax1.set_ylim([0.0, ax1.get_ylim()[2]])
ax1.legend()
PyObject <matplotlib.legend.Legend object at 0x7fc8bb50b0d0>
Overall the performance here is not bad, but inferior to the constrained AMR fit we performed previously.
If we make the same plot with the hessian_sfr_lower
and hessian_sfr_upper
vectors, which utilize the samples from the full variance-covariance matrix, we see that the uncertainties look very similar across most of the plot, but are not identical due to the inclusion of covariance between parameters.
# Same SFR plot, but with error bars from the sample quantiles, which include
# the covariance between parameters.
fig,ax1 = plt.subplots(figsize=(7,7))
ax1.bar(unique_free_template_logAge, hessian_sfr_med .* 1e3; width=diff(vcat(unique_template_logAge,max_logAge)), align="edge",
# Asymmetric error bars from the 16th/84th percentile spread about the median.
yerr = [(hessian_sfr_med .- hessian_sfr_lower) .* 1e3,
(hessian_sfr_upper .- hessian_sfr_med) .* 1e3],
capsize=3, error_kw=Dict("elinewidth"=>1,"capthick"=>1), label="BFGS Fit")
ax1.bar(unique_template_logAge, sfr_arr .* 1e3; width=diff(vcat(unique_template_logAge,max_logAge)), align="edge", label="Input SFH", alpha=0.5)
ax1.set_xlabel("log(Age [yr])")
ax1.set_ylabel(L"SFR [$10^{-3}$ M$_\odot$ / yr]")
ax1.set_ylim([0.0, ax1.get_ylim()[2]])
ax1.legend()
PyObject <matplotlib.legend.Legend object at 0x7fc8bbb74f40>
The unconstrained fitting methods are not optimal for MCMC. Since so many of the fitting coefficients are 0, our transition acceptance probabilities are extremely low, giving us fairly poor convergence unless we run very long chains.
import StarFormationHistories: mcmc_sample
import Distributions: MvNormal
import LinearAlgebra: I
nwalkers = 1000
nsteps = 2000
burn_idx = 200 # Number of (thinned) steps to discard as burn-in below
# Walker initial positions sampled from a multivariate normal distribution with means equal to the BFGS MLE
# and identity covariance matrix. The `max` call makes sure none of the coefficients have initial conditions less than 0.
mcmc_x0 = [max.(0.0, rand(MvNormal(bfgs_result.mle.μ, I))) for _ in 1:nwalkers]
mcmc_result = mcmc_sample(free_templates, data, mcmc_x0, nwalkers, nsteps; nthin=5); # nthin=5 to only save every 5 steps
┌ Warning: ProgressMeter by default refresh meters with additional information in IJulia via `IJulia.clear_output`, which clears all outputs in the cell. │ - To prevent this behaviour, do `ProgressMeter.ijulia_behavior(:append)`. │ - To disable this warning message, do `ProgressMeter.ijulia_behavior(:clear)`. └ @ ProgressMeter ~/.julia/packages/ProgressMeter/dMfiC/src/ProgressMeter.jl:594 emcee, niter=2000000, nwalkers=1000: 100%|█████████████████████████| Time: 0:00:32 accept_ratio_mean: 0.00904 accept_ratio_std: 0.00468 accept_ratio_outliers: 44 burnin_phase: false
Note the very low transition acceptance ratio of <1%. If we plot the trace of the chains we can see that they have not reached their stationary distribution even after 2000 steps. MCMC can work well when considering fewer templates, or when fewer of the fitting coefficients are zero. However, this application is better suited to Hamiltonian Monte Carlo.
import StatsBase: fit, Histogram, normalize
# Left: trace of the last parameter for every walker; right: histogram of the
# post-burn-in samples of that parameter.
fig, (ax0, ax1) = plt.subplots(nrows=1,ncols=2,sharex=false,sharey=false,figsize=(10,6))
fig.subplots_adjust(hspace=0.0,wspace=0.2)
# Mark the adopted burn-in cutoff on the trace plot.
ax0.axvline(burn_idx)
for i in 1:size(mcmc_result,3)
# One semi-transparent trace per walker for the final parameter.
ax0.plot(first(axes(mcmc_result)), mcmc_result[:,end,i], alpha=0.1, rasterized=true)
end
ax1_hist = normalize(fit(Histogram, vec(mcmc_result[burn_idx:end,end,:]); nbins=100, closed=:left); mode=:probability) #:pdf, :density, :probability, :none
ax1.bar(first(ax1_hist.edges)[2:end], ax1_hist.weights, width=step(first(ax1_hist.edges)), ec="k"); # fc="k"
Let us now compare to hmc_sample
, which utilizes the No-U-Turn sampler variant of Hamiltonian Monte Carlo implemented in DynamicHMC.jl
to sample from the posterior. This is a more rigorous way to obtain posterior samples than using the approximation of fit_templates
and can be more efficient than mcmc_sample
for high-dimensional problems or when many of the coefficients are 0. However, we will show that the approximation of fit_templates
still compares favorably with this more accurate method.
import DynamicHMC
import StarFormationHistories: hmc_sample
# Draw 1000 posterior samples with the NUTS sampler (warmup runs automatically first).
mc_result = hmc_sample(free_templates, data, 1000; reporter=DynamicHMC.ProgressMeterReport());
Warmup: 100%|███████████████████████████████████████████| Time: 0:00:36 Warmup: 100%|███████████████████████████████████████████| Time: 0:00:13 Warmup: 100%|███████████████████████████████████████████| Time: 0:00:18 Warmup: 100%|███████████████████████████████████████████| Time: 0:00:02 MCMC: 100%|█████████████████████████████████████████████| Time: 0:00:47
We can check the summary statistics to ensure the samples have a high acceptance rate and are converging properly (turning % should be ~100%).
# Summarize the NUTS tree statistics: acceptance rates, termination criteria, tree depths.
DynamicHMC.Diagnostics.summarize_tree_statistics(mc_result.tree_statistics)
Hamiltonian Monte Carlo sample of length 1000 acceptance rate mean: 0.92, 5/25/50/75/95%: 0.73 0.88 0.95 0.98 1.0 termination: divergence => 0%, max_depth => 0%, turning => 100% depth: 0 => 0%, 1 => 0%, 2 => 0%, 3 => 0%, 4 => 0%, 5 => 0%, 6 => 0%, 7 => 100%
The parameters optimized by fit_templates
and hmc_sample
are actually the logarithms of the coefficients, not the coefficients themselves, so we need to transform the samples here.
# Samples are in log-coefficient space; exponentiate and rescale by the template
# normalization to obtain stellar-mass coefficients.
mc_matrix = exp.(mc_result.posterior_matrix) .* template_norm
142×1000 Matrix{Float64}: 96.3725 421.121 … 73.2206 63.7291 2239.11 4076.72 1295.11 1880.44 66.0671 893.285 205.897 2233.54 2667.05 69.2351 2283.2 682.311 409.332 139.381 296.755 745.094 718.071 62.1846 … 1288.15 328.395 31.3383 1314.45 382.899 451.583 979.63 29.2092 755.307 217.589 188.283 376.428 814.598 172.491 126.696 1122.53 606.123 133.953 105.018 1196.48 … 174.893 994.785 1340.08 128.147 344.691 686.584 89.1731 823.166 201.944 812.642 ⋮ ⋱ 3.88147e5 2.82917e5 … 4.06886e5 2.71327e5 1.94571e5 1.63269e5 8782.75 2.54566e5 1.65551e5 1.15106e5 1.58252e5 3.27765e5 3.00481e5 93507.6 3.19237e5 2.75725e5 1.12478e6 1.03898e6 1.07204e6 1.15493e6 1.45525e5 4.16955e5 … 2.0602e5 23108.4 4.30326e5 8.29407e5 5.8902e5 5.38183e5 4.35577e5 1.68287e5 3.9299e5 4.07504e5 3.98381e5 3.47668e5 3.68688e5 3.67446e5 5.26463e5 491665.0 4.6532e5 5.84517e5 413999.0 5.47612e5 … 4.50939e5 5.03966e5 402115.0 2.63546e5 3.81213e5 3.21163e5
Calculate statistics from the MC samples.
# Calculate the cumulative SFH for each sample in the chain
# and find the 1-σ range
mc_cum_sfr = Vector{Vector{Float64}}(undef,0)
mc_sfr = Vector{Vector{Float64}}(undef,0)
for x in eachcol(mc_matrix)
    # Only the cumulative SFH and SFR vectors are needed; the ages and <[M/H]> are discarded.
    _, mc_1, mc_2, _ = calculate_cum_sfr(x, free_template_logAge, free_template_MH, T_max)
    push!(mc_cum_sfr, mc_1)
    push!(mc_sfr, mc_2)
end
# Stack the per-sample vectors into (nbins × nsamples) matrices for row-wise statistics.
mc_cum_sfr = reduce(hcat, mc_cum_sfr)
mc_sfr = reduce(hcat, mc_sfr)
# Calculate quantiles (16th/50th/84th percentiles per time bin)
mc_cum_lower = quantile.(eachrow(mc_cum_sfr), 0.16)
mc_cum_med = median.(eachrow(mc_cum_sfr))
mc_cum_upper = quantile.(eachrow(mc_cum_sfr), 0.84)
mc_sfr_lower = quantile.(eachrow(mc_sfr), 0.16)
mc_sfr_med = median.(eachrow(mc_sfr))
mc_sfr_upper = quantile.(eachrow(mc_sfr), 0.84)
71-element Vector{Float64}: 0.00912687264156707 0.004790017218165018 0.003329436130689708 0.0024272611543099056 0.002064780977697943 0.002183505582629635 0.0017044864398530755 0.0016733540459101148 0.0017417804049954448 0.0013784508728543429 0.0011399449910505755 0.0011888058743396179 0.0011599682308532113 ⋮ 0.0011200944160527624 0.0012330504803870226 0.00035610367957806844 0.0006518043827883646 0.001369033471843747 0.0008943038856546006 0.0009242184964425634 0.0004982006905727934 0.0013359256244834384 0.000840998544992007 0.0006874898663405812 0.0007556701845643837
And we'll plot the cumulative SFHs:
# Cumulative SFH comparison: HMC samples (left) vs. the BFGS approximation (right),
# each with its 1-σ band and the input SFH for reference.
fig,axs=plt.subplots(nrows=1,ncols=2,sharex=true,sharey=true,figsize=(10,6))
fig.subplots_adjust(hspace=0.0,wspace=0.1)
axs[1].plot( exp10.(unique_free_template_logAge)./1e9, mc_cum_med, label="HMC Median" )
axs[1].plot( exp10.(unique_template_logAge)./1e9, cum_sfr_arr, label="Input SFH", ls="--" )
axs[1].fill_between( exp10.(unique_free_template_logAge)./1e9, mc_cum_lower, mc_cum_upper, alpha=0.3, fc="k")
axs[1].set_xlim([13.0,-0.1])
axs[1].set_ylim([0.0,1.1])
axs[1].set_xlabel("Lookback Time [Gyr]")
axs[1].set_ylabel("Cumulative SF")
axs[1].legend()
axs[2].plot( exp10.(unique_free_template_logAge)./1e9, hessian_cum_med, label="BFGS Result" )
axs[2].plot( exp10.(unique_template_logAge)./1e9, cum_sfr_arr, label="Input SFH", ls="--" )
axs[2].fill_between( exp10.(unique_free_template_logAge)./1e9, hessian_cum_lower, hessian_cum_upper, alpha=0.3, fc="k")
axs[2].set_xlabel("Lookback Time [Gyr]")
axs[2].legend()
PyObject <matplotlib.legend.Legend object at 0x7fc8b9e7be20>
The differences are more obvious in the SFRs,
# SFR comparison: HMC medians with sample quantile errors (left) vs. the BFGS
# approximation (right), both against the input SFRs.
fig,axs = plt.subplots(nrows=1,ncols=2,sharex=true,sharey=true,figsize=(14,7))
fig.subplots_adjust(hspace=0.0,wspace=0.1)
axs[1].bar(unique_free_template_logAge, mc_sfr_med .* 1e3; width=diff(vcat(unique_template_logAge,max_logAge)), align="edge",
# Asymmetric error bars from the 16th/84th percentile spread about the median.
yerr = [(mc_sfr_med .- mc_sfr_lower) .* 1e3,
(mc_sfr_upper .- mc_sfr_med) .* 1e3],
capsize=3, error_kw=Dict("elinewidth"=>1,"capthick"=>1), label="HMC Result")
axs[1].bar(unique_template_logAge, sfr_arr .* 1e3; width=diff(vcat(unique_template_logAge,max_logAge)), align="edge", label="Input SFRs", alpha=0.5)
axs[1].set_xlabel("log(Age [yr])")
axs[1].set_ylabel(L"SFR [$10^{-3}$ M$_\odot$ / yr]")
axs[1].legend()
axs[2].bar(unique_free_template_logAge, hessian_sfr_med .* 1e3; width=diff(vcat(unique_template_logAge,max_logAge)), align="edge",
yerr = [(hessian_sfr_med .- hessian_sfr_lower) .* 1e3,
(hessian_sfr_upper .- hessian_sfr_med) .* 1e3],
capsize=3, error_kw=Dict("elinewidth"=>1,"capthick"=>1), label="BFGS Fit")
axs[2].bar(unique_template_logAge, sfr_arr .* 1e3; width=diff(vcat(unique_template_logAge,max_logAge)), align="edge", label="Input SFRs", alpha=0.5)
axs[2].legend()
axs[2].set_xlabel("log(Age [yr])")
PyObject Text(0.5, 29.371800193718016, 'log(Age [yr])')
# _, _, tmp_sfr, _ = calculate_cum_sfr(bfgs_result.map.μ .* template_norm, free_template_logAge, free_template_MH)
# Overlay the HMC and BFGS SFR medians directly on one axis for comparison.
fig,ax1 = plt.subplots(figsize=(7,7))
ax1.bar(unique_free_template_logAge, mc_sfr_med .* 1e3; width=diff(vcat(unique_template_logAge,max_logAge)), align="edge", label="HMC Result")
# ax1.bar(unique_free_template_logAge, hessian_sfr_med .* 1e3; width=diff(vcat(unique_template_logAge,max_logAge)), align="edge", label="BFGS Fit", alpha=0.5)
ax1.bar(unique_free_template_logAge, hessian_sfr_med .* 1e3; width=diff(vcat(unique_template_logAge,max_logAge)), align="edge",
yerr = [(hessian_sfr_med .- hessian_sfr_lower) .* 1e3,
(hessian_sfr_upper .- hessian_sfr_med) .* 1e3],
capsize=3, error_kw=Dict("elinewidth"=>1,"capthick"=>1), label="BFGS Fit", alpha=0.5)
# ax1.bar(unique_free_template_logAge, tmp_sfr .* 1e3; width=diff(vcat(unique_template_logAge,max_logAge)), align="edge", label="BFGS Fit", alpha=0.5)
ax1.set_xlabel("log(Age [yr])")
ax1.set_ylabel(L"SFR [$10^{-3}$ M$_\odot$ / yr]")
ax1.set_ylim([0.0, ax1.get_ylim()[2]])
ax1.legend()
PyObject <matplotlib.legend.Legend object at 0x7fc8b9ea2590>
Thus the approximations we made above when deriving the uncertainties from the BFGS results — namely, that the fitting variables
θ = log(coeffs)
are normally distributed in the vicinity of the maximum a posteriori — are not fully correct, but they are good enough in most cases that they can produce very similar posterior samples to the full HMC analysis at a fraction of the computational cost (a few seconds compared to minutes or hours).
# Free memory held by the large sample matrices now that the example is complete.
GC.gc() # Explicit garbage collection when finished