The following notebook shows how to adjust parameters to modify debiasers.
For in-depth information about the bias adjustment methods and their usage refer to the documentation that can be found under - API reference -> debias-module.
import numpy as np
import matplotlib.pyplot as plt
We start by reading in and preprocessing some data. For an explanation of the steps please refer to the 01 Getting started notebook.
import numpy as np
def get_data(variable, data_path = "testing_data/"):
# Load in the data
data = np.load(f"{data_path}{variable}.npz", allow_pickle = True)
# Return arrays
return data["obs"], data["cm_hist"], data["cm_future"], data["time_obs"], data["time_cm_hist"], data["time_cm_future"]
We work with daily mean near-surface temperature ('tas') and precipitation flux ('pr') in this notebook. Lets get the testing data for these two variables:
tas_obs, tas_cm_hist, tas_cm_future, tas_time_obs, tas_time_cm_hist, tas_time_cm_future = get_data("tas")
pr_obs, pr_cm_hist, pr_cm_future, pr_time_obs, pr_time_cm_hist, pr_time_cm_future = get_data("tas")
As shown in the 01 Getting Started notebook, each debiaser is a subclass of the abstract Debiaser
class. This provides a unified interface for initialising and applying debiasers. Let's read in some temperature data:
We can initialise a debiaser for "tas"
by using the from_variable
method. If we want to apply ISIMIP and Equidistant-CDF-Matching (ECDFM) we can write:
from ibicus.debias import ECDFM, ISIMIP, DeltaChange
tas_debiaser_ECDFM = ECDFM.from_variable("tas")
tas_debiaser_ISIMIP = ISIMIP.from_variable("tas")
Applying the debiasers works using the apply
-classmethod.
tas_ECDFM = tas_debiaser_ECDFM.apply(tas_obs, tas_cm_hist, tas_cm_future)
37%|█████████████████████████████████████████████████████████████▌ | 84/225 [00:46<01:15, 1.87it/s]/home/jakobwes/anaconda3/lib/python3.9/site-packages/scipy/stats/_continuous_distns.py:684: RuntimeWarning: invalid value encountered in sqrt sk = 2*(b-a)*np.sqrt(a + b + 1) / (a + b + 2) / np.sqrt(a*b) 39%|███████████████████████████████████████████████████████████████▊ | 87/225 [00:48<01:13, 1.87it/s]/home/jakobwes/anaconda3/lib/python3.9/site-packages/scipy/stats/_continuous_distns.py:684: RuntimeWarning: invalid value encountered in sqrt sk = 2*(b-a)*np.sqrt(a + b + 1) / (a + b + 2) / np.sqrt(a*b) 39%|████████████████████████████████████████████████████████████████▌ | 88/225 [00:48<01:12, 1.90it/s]/home/jakobwes/anaconda3/lib/python3.9/site-packages/scipy/stats/_continuous_distns.py:684: RuntimeWarning: invalid value encountered in sqrt sk = 2*(b-a)*np.sqrt(a + b + 1) / (a + b + 2) / np.sqrt(a*b) 41%|████████████████████████████████████████████████████████████████████▏ | 93/225 [00:51<01:06, 1.99it/s]/home/jakobwes/anaconda3/lib/python3.9/site-packages/scipy/stats/_continuous_distns.py:684: RuntimeWarning: invalid value encountered in sqrt sk = 2*(b-a)*np.sqrt(a + b + 1) / (a + b + 2) / np.sqrt(a*b) 51%|███████████████████████████████████████████████████████████████████████████████████▊ | 115/225 [01:03<01:03, 1.72it/s]/home/jakobwes/anaconda3/lib/python3.9/site-packages/scipy/stats/_continuous_distns.py:684: RuntimeWarning: invalid value encountered in sqrt sk = 2*(b-a)*np.sqrt(a + b + 1) / (a + b + 2) / np.sqrt(a*b) 57%|█████████████████████████████████████████████████████████████████████████████████████████████▎ | 128/225 [01:10<00:49, 1.95it/s]/home/jakobwes/anaconda3/lib/python3.9/site-packages/scipy/stats/_continuous_distns.py:684: RuntimeWarning: invalid value encountered in sqrt sk = 2*(b-a)*np.sqrt(a + b + 1) / (a + b + 2) / np.sqrt(a*b) 62%|█████████████████████████████████████████████████████████████████████████████████████████████████████▎ | 139/225 [01:17<00:53, 1.59it/s]/home/jakobwes/anaconda3/lib/python3.9/site-packages/scipy/stats/_continuous_distns.py:684: RuntimeWarning: invalid value encountered in sqrt sk = 2*(b-a)*np.sqrt(a + b + 1) / (a + b + 2) / np.sqrt(a*b) 66%|███████████████████████████████████████████████████████████████████████████████████████████████████████████▉ | 148/225 [01:23<00:56, 1.36it/s]/home/jakobwes/anaconda3/lib/python3.9/site-packages/scipy/stats/_continuous_distns.py:684: RuntimeWarning: invalid value encountered in sqrt sk = 2*(b-a)*np.sqrt(a + b + 1) / (a + b + 2) / np.sqrt(a*b) 68%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████▊ | 152/225 [01:26<00:48, 1.50it/s]/home/jakobwes/anaconda3/lib/python3.9/site-packages/scipy/stats/_continuous_distns.py:684: RuntimeWarning: invalid value encountered in sqrt sk = 2*(b-a)*np.sqrt(a + b + 1) / (a + b + 2) / np.sqrt(a*b) 72%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎ | 161/225 [01:31<00:35, 1.79it/s]/home/jakobwes/anaconda3/lib/python3.9/site-packages/scipy/stats/_continuous_distns.py:684: RuntimeWarning: invalid value encountered in sqrt sk = 2*(b-a)*np.sqrt(a + b + 1) / (a + b + 2) / np.sqrt(a*b) 75%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍ | 168/225 [01:35<00:31, 1.84it/s]/home/jakobwes/anaconda3/lib/python3.9/site-packages/scipy/stats/_continuous_distns.py:684: RuntimeWarning: invalid value encountered in sqrt sk = 2*(b-a)*np.sqrt(a + b + 1) / (a + b + 2) / np.sqrt(a*b) 76%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉ | 170/225 [01:36<00:28, 1.95it/s]/home/jakobwes/anaconda3/lib/python3.9/site-packages/scipy/stats/_continuous_distns.py:684: RuntimeWarning: invalid value encountered in sqrt sk = 2*(b-a)*np.sqrt(a + b + 1) / (a + b + 2) / np.sqrt(a*b) 78%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎ | 176/225 [01:39<00:27, 1.81it/s]/home/jakobwes/anaconda3/lib/python3.9/site-packages/scipy/stats/_continuous_distns.py:684: RuntimeWarning: invalid value encountered in sqrt sk = 2*(b-a)*np.sqrt(a + b + 1) / (a + b + 2) / np.sqrt(a*b) 79%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████ | 177/225 [01:40<00:25, 1.85it/s]/home/jakobwes/anaconda3/lib/python3.9/site-packages/scipy/stats/_continuous_distns.py:684: RuntimeWarning: invalid value encountered in sqrt sk = 2*(b-a)*np.sqrt(a + b + 1) / (a + b + 2) / np.sqrt(a*b) 81%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋ | 182/225 [01:42<00:22, 1.91it/s]/home/jakobwes/anaconda3/lib/python3.9/site-packages/scipy/stats/_continuous_distns.py:684: RuntimeWarning: invalid value encountered in sqrt sk = 2*(b-a)*np.sqrt(a + b + 1) / (a + b + 2) / np.sqrt(a*b) 82%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊ | 185/225 [01:44<00:19, 2.10it/s]/home/jakobwes/anaconda3/lib/python3.9/site-packages/scipy/stats/_continuous_distns.py:684: RuntimeWarning: invalid value encountered in sqrt sk = 2*(b-a)*np.sqrt(a + b + 1) / (a + b + 2) / np.sqrt(a*b) 84%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████ | 188/225 [01:45<00:19, 1.94it/s]/home/jakobwes/anaconda3/lib/python3.9/site-packages/scipy/stats/_continuous_distns.py:684: RuntimeWarning: invalid value encountered in sqrt sk = 2*(b-a)*np.sqrt(a + b + 1) / (a + b + 2) / np.sqrt(a*b) 88%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌ | 197/225 [01:50<00:13, 2.13it/s]/home/jakobwes/anaconda3/lib/python3.9/site-packages/scipy/optimize/_minpack_py.py:175: RuntimeWarning: The iteration is not making good progress, as measured by the improvement from the last ten iterations. warnings.warn(msg, RuntimeWarning) 88%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████ | 199/225 [01:51<00:11, 2.29it/s]/home/jakobwes/anaconda3/lib/python3.9/site-packages/scipy/optimize/_minpack_py.py:175: RuntimeWarning: The iteration is not making good progress, as measured by the improvement from the last ten iterations. warnings.warn(msg, RuntimeWarning) 89%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊ | 200/225 [01:51<00:11, 2.23it/s]/home/jakobwes/anaconda3/lib/python3.9/site-packages/scipy/stats/_continuous_distns.py:684: RuntimeWarning: invalid value encountered in sqrt sk = 2*(b-a)*np.sqrt(a + b + 1) / (a + b + 2) / np.sqrt(a*b) 91%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋ | 204/225 [01:53<00:10, 2.04it/s]/home/jakobwes/anaconda3/lib/python3.9/site-packages/scipy/stats/_continuous_distns.py:684: RuntimeWarning: invalid value encountered in sqrt sk = 2*(b-a)*np.sqrt(a + b + 1) / (a + b + 2) / np.sqrt(a*b) 93%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎ | 209/225 [01:56<00:07, 2.11it/s]/home/jakobwes/anaconda3/lib/python3.9/site-packages/scipy/stats/_continuous_distns.py:684: RuntimeWarning: invalid value encountered in sqrt sk = 2*(b-a)*np.sqrt(a + b + 1) / (a + b + 2) / np.sqrt(a*b) 95%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉ | 214/225 [01:58<00:04, 2.32it/s]/home/jakobwes/anaconda3/lib/python3.9/site-packages/scipy/stats/_continuous_distns.py:684: RuntimeWarning: invalid value encountered in sqrt sk = 2*(b-a)*np.sqrt(a + b + 1) / (a + b + 2) / np.sqrt(a*b) 96%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍ | 216/225 [01:58<00:03, 2.64it/s]/home/jakobwes/anaconda3/lib/python3.9/site-packages/scipy/stats/_continuous_distns.py:684: RuntimeWarning: invalid value encountered in sqrt sk = 2*(b-a)*np.sqrt(a + b + 1) / (a + b + 2) / np.sqrt(a*b) 99%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊ | 222/225 [02:01<00:01, 2.03it/s]/home/jakobwes/anaconda3/lib/python3.9/site-packages/scipy/stats/_continuous_distns.py:684: RuntimeWarning: invalid value encountered in sqrt sk = 2*(b-a)*np.sqrt(a + b + 1) / (a + b + 2) / np.sqrt(a*b) 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 225/225 [02:03<00:00, 1.82it/s]
Some debiasers like ISIMIP require additional arguments in the apply-method like the dates:
tas_ISIMIP = tas_debiaser_ISIMIP.apply(tas_obs, tas_cm_hist, tas_cm_future, time_obs = tas_time_obs, time_cm_hist = tas_time_cm_hist, time_cm_future =tas_time_cm_future)
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 225/225 [10:18<00:00, 2.75s/it]
We can compare the bias corrected output using the evaluation framework. In this notebook, we limit ourselves to plotting the distribution of the debiased data at location [1,1]:
plt.hist(tas_ECDFM[:, 1, 1], bins="auto", alpha = 0.5, label = "ECDFM")
plt.hist(tas_ISIMIP[:, 1, 1], bins="auto", alpha = 0.5, label = "ISIMIP")
plt.legend()
plt.show()
We see that the data distributions produced by ECDFM and ISIMIP do differ. For info on how to evaluate this have a look at our notebook on evaluation
Every debiaser that is part of the package consists of a set of parameters controlling the behavior as well as a from_variable
and apply
-method.
Let's have a look at the tas-ECDFM debiaser defined above using the from_variable
method.
tas_debiaser_ECDFM
ECDFM(variable='Daily mean near-surface air temperature', reasonable_physical_range=[100, 400], distribution=<scipy.stats._continuous_distns.beta_gen object at 0x7f017faa6340>, cdf_threshold=1e-10)
We see that two parameters are set: a variable referring to the variable we are debiasing, as well as a distribution. These parameters fully determine the setting of this particular debiaser, equidistant-CDF-matching.
Using the from_variable
method, a debiaser is initialized using the default parameters associated with this variable for this particular debiaser. The same outcome can be achieved by directly setting the required parameters, without using the from_variable
method:
import scipy.stats
tas_debiaser_ECDFM_v2 = ECDFM(distribution=scipy.stats.beta)
And we can see that these debiasers are absolutely identical:
tas_debiaser_ECDFM == tas_debiaser_ECDFM_v2
True
Learning: if we set all the parameters needed manually to the default parameters of a specific variable, the debiaser will be equivalent to a debiaser initialized using the from_variable
method.
Have a look at the documentation of each bias adjustment method to find out which parameters need to be set for that specific debiaser.
The table below gives an overview of which variables currently have default setting for which debiasers - 'experimental default settings' are marked with brackets around the x.
Variable | LinearScaling |
DeltaChange |
QuantileMapping |
ScaledDistributionMapping |
CDFt |
ECDFM |
QuantileDeltaMapping |
ISIMIP |
---|---|---|---|---|---|---|---|---|
hurs | (x) | (x) | (x) | (x) | (x) | (x) | x | |
pr | x | x | x | x | x | x | x | x |
prsnratio | x | |||||||
psl | (x) | (x) | (x) | (x) | (x) | (x) | x | |
rlds | (x) | (x) | (x) | (x) | (x) | (x) | x | |
rsds | (x) | (x) | (x) | x | ||||
sfcWind | (x) | (x) | (x) | (x) | (x) | (x) | x | |
tas | x | x | x | x | x | x | x | x |
tasmin | x | x | (x) | (x) | x | (x) | (x) | |
tasmax | x | x | (x) | (x) | x | (x) | (x) | |
tasrange | (x) | x | ||||||
tasskew | (x) | x |
If we try to initialize a debiaser that does not exist for a certain variable using the from_variable
method, we are basically asking for error message:
debiaser3 = ISIMIP.from_variable("tasmin")
--------------------------------------------------------------------------- ValueError Traceback (most recent call last) /tmp/ipykernel_49752/2091483589.py in <module> ----> 1 debiaser3 = ISIMIP.from_variable("tasmin") ~/anaconda3/lib/python3.9/site-packages/ibicus/debias/_isimip.py in from_variable(cls, variable, **kwargs) 341 @classmethod 342 def from_variable(cls, variable: Union[str, Variable], **kwargs): --> 343 return super()._from_variable( 344 cls, 345 variable=variable, ~/anaconda3/lib/python3.9/site-packages/ibicus/debias/_debiaser.py in _from_variable(child_class, variable, default_settings_variable, experimental_default_setting_variable, default_settings_general, **kwargs) 137 ] 138 else: --> 139 raise ValueError( 140 f"Unfortunately currently no default settings exist for the variable {variable} in the debiaser {child_class.__name__}. You can set the required class parameters manually by using the class constructor." 141 ) ValueError: Unfortunately currently no default settings exist for the variable tasmin in the debiaser ISIMIP. You can set the required class parameters manually by using the class constructor.
ISIMIP instead offers the option to debias tasrange
and tasskew
and calculate tasmin
from those.
Some bias adjustment methods offer and additional for_precipitation
method to initialise it for precipitation (pr
). Precipitation methods can be a bit more complicated and sometimes require the specification of a threshold under which precipitation is assumed to be zero. The for_precipitation
-method is there to facilitate the choice of method.
For example we can initialise a QuantileMapping
debiaser with a precipitation gamma hurdle model. A hurdle model is a two step model where precipitation occurrence is modelled binomially and then a gamma distribution is assumed for the amounts. An alternative model is the censored model where all precipitation amounts under a threshold (so also all dry days) are assumed censored, so labeled 'not known' to the model.
Let's initialize both:
from ibicus.debias import QuantileMapping
# Initialise debiaser
pr_debiaser_QM_hurdle = QuantileMapping.for_precipitation(model_type = "hurdle")
pr_debiaser_QM_censored = QuantileMapping.for_precipitation(model_type = "censored", censoring_threshold = 0.1/86400)
The bias adjustment method is applied through the apply-function. Lets initialise and apply a DeltaChange
debiaser for "tas"
:
tas_debiaser_DC = DeltaChange.from_variable("tas")
tas_debiased_DC = tas_debiaser_DC.apply(tas_obs, tas_cm_hist, tas_cm_future)
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 225/225 [00:00<00:00, 4552.27it/s]
As you can see, the apply function needs three numpy arrays of data in the form:
All three are assumed to be 3d-numpy arrays where the first dimension corresponds to time and the other two to spatial locations. The locations in obs, cm_hist and cm_future need to be the same and observational data has to be regridded to match the scale of the climate model grid prior to applying the bias adjustment.
Besides obs, cm_hist and cm_future some debiasers might also require additional information like the dates to which observations and climate model values correspond, for example to apply the debiaser in a running window-mode. We have already seen an example in the first chapter of this notebook, when we initialized and applied ISIMIP. Dates datasets are arrays of dates in one of several formats:
tas_time_obs
array([cftime.DatetimeGregorian(1979, 1, 1, 0, 0, 0, 0, has_year_zero=False), cftime.DatetimeGregorian(1979, 1, 2, 0, 0, 0, 0, has_year_zero=False), cftime.DatetimeGregorian(1979, 1, 3, 0, 0, 0, 0, has_year_zero=False), ..., cftime.DatetimeGregorian(2005, 12, 29, 0, 0, 0, 0, has_year_zero=False), cftime.DatetimeGregorian(2005, 12, 30, 0, 0, 0, 0, has_year_zero=False), cftime.DatetimeGregorian(2005, 12, 31, 0, 0, 0, 0, has_year_zero=False)], dtype=object)
ISIMIP also runs without having been passed date arguments, by inference:
tas_ISIMIP_nodates = tas_debiaser_ISIMIP.apply(tas_obs, tas_cm_hist, tas_cm_future)
0%| | 0/225 [00:00<?, ?it/s]/home/jakobwes/anaconda3/lib/python3.9/site-packages/ibicus/debias/_debiaser.py:387: UserWarning: ISIMIP runs without time-information for at least one of obs, cm_hist or cm_future. This information is inferred, assuming the first observation is on a January 1st. Observations are chunked according to the assumed time information. This might lead to slight numerical differences to the run with time information, however the debiasing is not fundamentally changed. return func(obs, cm_hist, cm_future, **kwargs) 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 225/225 [09:36<00:00, 2.56s/it]
debiaser = ISIMIP.from_variable("tas")
tas_ISIMIP_nodates = debiaser.apply(tas_obs, tas_cm_hist, tas_cm_future)
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 225/225 [09:23<00:00, 2.50s/it]
As seen above it is possible to initialise a debiaser either using the from_variable
-method or the class-constructor. Using the class constructor, it is possible to extent debiasers to meteorological variables which are currently not covered by the from_variable
-method.
However, in many cases, even when working with variables that have default settings, it will be important to modify the behavior of debiasers to improve its application for a certain variable/region and problem at hand.
For parametric methods, we might want to change the distribution to improve fit eg. for extremes, we might want to use a different trend-capturing method if trends are unrealistic, modified or inflated, or just generally might want to do some additional adjustment the debiaser offers.
Parameters of the debiaser can be modified either by setting them differently in the class constructor or in the from_variable
-method. For example if we decide to apply QuantileMapping
and the climate change trend of the model is not entirely realistic it might be a good idea not to use detrending prior to quantile mapping. This can be done through:
tas_debiaser_QM_v1 = QuantileMapping.from_variable("tas", detrending = "no_detrending")
or:
tas_debiaser_QM_v2 = QuantileMapping(distribution = scipy.stats.norm, detrending = "no_detrending")
tas_debiaser_QM_v1 == tas_debiaser_QM_v2
True
It is also possible to directly modify the class attribute:
tas_debiaser_QM_v3 = QuantileMapping.from_variable("tas")
tas_debiaser_QM_v3.detrending = "no_detrending"
tas_debiaser_QM_v1 == tas_debiaser_QM_v3
True
However, no validation is applied to the new arguments, so this method is generally not recommended.
How to know which parameters to change: Some debiasers such as ISIMIP have many parameters to control the behavior, but oftentimes only a few are central. The documentation for each of the debiasers provides an indication of the central and required parameters and how they modify the debiasing behavior.