Uncertainty propagation

This notebook demonstrates uncertainty propagation with PyCO2SYS.

In this example, we imagine that we've measured dissolved inorganic carbon (DIC) and Total scale pH for four seawater samples. The DIC and pH values are different from each other, but all other conditions are the same for the samples. We want to know the seawater pCO2 in the samples.

The instructions follow the docs for the Pythonic pyco2.sys interface.

The normal way to run PyCO2SYS (without uncertainties) is as follows:

In [ ]:
import numpy as np
import PyCO2SYS as pyco2

# Set up CO2SYS inputs
pyco2_kws = dict(
    par1 = np.array([2150.1, 2175.3, 2199.8, 2220.4]),  # dissolved inorganic carbon in μmol/kg
    par2 = np.array([ 8.121,  8.082,  8.041,  8.001]),  # pH on the Total scale
    par1_type = 2,
    par2_type = 3,
    salinity = 33.1,
    temperature = 25.0,
    temperature_out = 25.0,
    pressure = 0.0,
    pressure_out = 0.0,
    total_silicate = 5.0,
    total_phosphate = 1.0,
    total_ammonia = 0.5,
    opt_pH_scale = 1,
    opt_k_carbonic = 12,
)
# Run PyCO2SYS and print out the pCO2 values
results = pyco2.sys(**pyco2_kws)
print("pCO2: ", results["pCO2"])

Uncertainties from measurements

Constant uncertainties

First, we want to know how the uncertainties in our DIC and pH measurements propagate into calculated pCO2. Let's assume that the uncertainties in DIC and pH are (1) independent from each other and (2) constant for all measurements, with a one-standard-deviation uncertainty of 3 μmol/kg for DIC, and 0.005 for pH.

We need to assemble:

  1. A list of parameters that we want to propagate uncertainties into: uncertainty_into.
  2. A dict of parameters that we want to propagate uncertaintes from, and their uncertainties: uncertainty_from.

For both of these, the entries in the list (or keys in the dict) come from the corresponding fields in the results dict.

The uncertainties must be independent from each other and given in terms of standard deviations for the maths to work!

To do the uncertainty propogation, we need to include uncertainty_into and uncertainty_from as kwargs to pyco2.sys (along with all the normal kwargs for the calculations):

In [ ]:
# Which parameter(s) do we want to propagate uncertainties into?
uncertainty_into = ["pCO2"]

# Define measurement uncertainties
uncertainty_from = {
    "par1": 3,     # uncertainty in dissolved inorganic carbon in μmol/kg
    "par2": 0.005, # uncertainty in pH
}

# Propagate and print out results
results = pyco2.sys(**pyco2_kws,
    uncertainty_into=uncertainty_into, uncertainty_from=uncertainty_from, 
)
print("Uncertainty in pCO2 from par1 (i.e. DIC):     ", results["u_pCO2__par1"])
print("Uncertainty in pCO2 from par2 (i.e. pH):      ", results["u_pCO2__par2"])
print("Total uncertainty in pCO2 from par1 and par2: ", results["u_pCO2"])

The results dict now contains additional keys corresponding to the elements of uncertainty_into. The values with keys "u_<result>" are the total uncertainty in the corresponding result. For example, results["u_pCO2"] contains the total uncertainty in calculated pCO2 at input conditions (i.e. "pCO2").

The component of uncertainty from each argument is also provided. For example, result["u_pCO2__par1"] contains the uncertainty in calculated pCO2 at input conditions (i.e. "pCO2") resulting from uncertainty in par1 (here, DIC).

We can see from the results above that the uncertainty in pH dominates the overall uncertainty in pCO2.

Different uncertainties for different samples

Now imagine that the first pH measurement was done with a more accurate method than the others, so its uncertainty is smaller. We can account for this by using an array instead of a single value in the uncertainty_from dict.

You could also use this approach if "par1" and/or "par2" contained a mixture of different parameter types, with different uncertainties.

In [ ]:
# Redefine measurement uncertainties
uncertainty_from = {
    "par1": 3,
    "par2": np.array([0.001, 0.005, 0.005, 0.005]),
}

# Re-propagate and print out results
results = pyco2.sys(**pyco2_kws,
    uncertainty_into=uncertainty_into, uncertainty_from=uncertainty_from, 
)
print("Uncertainty in pCO2 from par1 (i.e. DIC):     ", results["u_pCO2__par1"])
print("Uncertainty in pCO2 from par2 (i.e. pH):      ", results["u_pCO2__par2"])
print("Total uncertainty in pCO2 from par1 and par2: ", results["u_pCO2"])

The lower uncertainty in the first pH measurement has significantly reduced the total uncertainty in calculated pCO2.

Other function inputs

We can include any of the non-settings arguments of pyco2.sys by extending the uncertainties_from dict in the same way. For example, to also get the uncertainty in aragonite saturation state, and to also propagate uncertainties in the temperature (of 0.05 °C) and salinity (of 0.002) values:

In [ ]:
# Redefine which parameter(s) we want to propagate uncertainties into
uncertainty_into = ["pCO2", "saturation_aragonite"]

# Redefine measurement uncertainties
uncertainty_from = {
    "par1": 3,
    "par2": np.array([0.001, 0.005, 0.005, 0.005]),
    "temperature": 0.05,
    "salinity": 0.002,
}

# Re-propagate and print out results
results = pyco2.sys(**pyco2_kws,
    uncertainty_into=uncertainty_into, uncertainty_from=uncertainty_from, 
)
print("Uncertainty in pCO2 from temperature:                 ", results["u_pCO2__temperature"])
print("Uncertainty in Omega-aragonite from salinity:         ", results["u_saturation_aragonite__salinity"])
print("Total uncertainty in Omega-aragonite from all inputs: ", results["u_saturation_aragonite"])

Uncertainties from internal parameters

We can also determine how uncertainties in PyCO2SYS's internally evaluated parameters (totals from salinity and equilibrium constants) propagate through to the calculated results. The approach is very similar to that used above: we just need to find the name of the variable we want to propagate uncertainty from in the results dict.

Absolute uncertainty known

First, imagine that we know the absolute uncertainty in an equilibrium constant. Let's say we think the uncertainty in K1 (first dissociation constant of carbonic acid) is on the order of 10-8 (as a standard deviation). The approach is now virtually identical to that used above:

In [ ]:
# Add K1 uncertainty to the dict
uncertainty_from.update({"k_carbonic_1": 1e-8})

# Re-propagate and print out results
results = pyco2.sys(**pyco2_kws,
    uncertainty_into=uncertainty_into, uncertainty_from=uncertainty_from, 
)
print("Uncertainty in pCO2 from K1:               ", results["u_pCO2__k_carbonic_1"])
print("Total uncertainty in pCO2 from all inputs: ", results["u_pCO2"])
print("Uncertainty in Omega-aragonite from K1:               ", results["u_saturation_aragonite__k_carbonic_1"])
print("Total uncertainty in Omega-aragonite from all inputs: ", results["u_saturation_aragonite"])

Above we see that this uncertainty in K1 would dominate the total uncertainty in pCO2, but be less important for the aragonite saturation state.

Fractional uncertainty known

If you know the fractional uncertainty but not an absolute value, then you could just extract the relevant parameter values from the results dict and multiply through. For example, if we thought the uncertainty in K2 (second dissociation constant of carbonic acid) was on the order of 5%:

In [ ]:
# Get K2 values from the results
k2 = results["k_carbonic_2"]

# Add K2 uncertainty to the dict
uncertainty_from.update({"k_carbonic_2": k2 * 0.05})

# Re-propagate and print out results
results = pyco2.sys(**pyco2_kws,
    uncertainty_into=uncertainty_into, uncertainty_from=uncertainty_from, 
)
print("Uncertainty in pCO2 from K2:               ", results["u_pCO2__k_carbonic_2"])
print("Total uncertainty in pCO2 from all inputs: ", results["u_pCO2"])
print("Uncertainty in Omega-aragonite from K2:               ", results["u_saturation_aragonite__k_carbonic_2"])
print("Total uncertainty in Omega-aragonite from all inputs: ", results["u_saturation_aragonite"])

In this example, we can see that the K2 uncertainty has a much bigger influence on aragonite saturation state than K1 did.

Uncertainty in pK values, not K

If uncertainties in equilibrium constants are given in terms of pK values rather than K we can simply append p to the start of the corresponding parameter name in uncertainties_from:

In [ ]:
# Redefine measurement uncertainties
uncertainty_from = {
    "par1": 3,
    "par2": np.array([0.001, 0.005, 0.005, 0.005]),
    "temperature": 0.05,
    "salinity": 0.002,
    "pk_carbonic_1": 0.0075,
    "pk_carbonic_2": 0.015,
}

# Re-propagate and print out results
results = pyco2.sys(**pyco2_kws,
    uncertainty_into=uncertainty_into, uncertainty_from=uncertainty_from, 
)
print("Total uncertainty in pCO2 from all inputs: ", results["u_pCO2"])
print("Total uncertainty in Omega-aragonite from all inputs: ", results["u_saturation_aragonite"])

Uncertainties of Orr et al. (2018)

The uncertainties in equilibrium constants suggested by Orr et al. (2018) are available as a dict under pyco2.uncertainty_OEDG18:

In [ ]:
# Propagate OEDG18 uncertainties and print out results
results = pyco2.sys(**pyco2_kws,
    uncertainty_into=["saturation_aragonite"], uncertainty_from=pyco2.uncertainty_OEDG18, 
)
print("Total uncertainty in Omega-aragonite from all inputs: ", results["u_saturation_aragonite"])

Covarying uncertainties

If your uncertainties are covarying then you need to assemble a variance-covariance matrix of the uncertainties and a Jacobian matrix for the parameters that you are interested in. The maths of how to do this are beyond the scope of this tutorial, but you can use PyCO2SYS to get all the derivatives that you need. Instead of using the uncertainty_from and uncertainty_into kwargs, we use grads_wrt (w.r.t. = with respect to) and grads_of instead:

In [ ]:
# Get raw derivatives
results = pyco2.sys(**pyco2_kws,
    grads_of=uncertainty_into, grads_wrt=uncertainty_from, 
)

# Print out example results
print("d(pCO2)/d(par1) =", results["d_pCO2__d_par1"])
print("d(saturation_aragonite)/d(pk_carbonic_1) =", results["d_saturation_aragonite__d_pk_carbonic_1"])

The derivative of each result in grads_of with respect to each argument in grads_wrt is stored in the results dict under the key "d_<result>__d_<argument>".