#!/usr/bin/env python # coding: utf-8 # # Showing the consistency of the implementation of ISIMIP within ibicus with the reference implementation # This notebook is meant to demonstrate the consistency of the ibicus implementation of ISIMIP3BASD v3.0.1 (most recent version, in the following referred to as ISIMIP) with the reference implementation provided by Lange 2022: # # - Lange, S. (2022). ISIMIP3BASD (3.0.1) [Computer software]. Zenodo. https://doi.org/10.5281/ZENODO.6758997 # # The code in the ibicus implementation is based upon [Lange 2019](https://doi.org/10.5194/gmd-12-3055-2019) and the [ISIMIP3b bias adjustment fact sheet](https://www.isimip.org/documents/413/ISIMIP3b_bias_adjustment_fact_sheet_Gnsz7CO.pdf). More details of the implementation of ISIMIP within ibicus can be found in the [documentation](https://ibicus.readthedocs.io/en/latest/reference/debias.html#package-name-debias-isimip-class). # # Numerous checks were conducted during and after the development process to ensure consistency. This notebook provides a final validation and demonstration of this consistency on the testing data published with the ISIMIP reference implementation. # # The structure of this notebook is as follows. First the ISIMIP reference implementation is pulled from https://zenodo.org/record/6758997/files/isimip3basd-master.tar.gz and run on the testing data. In a second step debiasers are initialised using ibicus and run for all variables included in ISIMIP. Consistency is demonstrated by plotting and comparing the values computed by the reference implementation and the ibicus implementation. # # # ## Contents # 1. [ISIMIP3BASD v3.0.1 reference implementation](#bullet-1) # # 2. [ibicus implementation](#bullet-2) # # 2.1. [Helpers](#bullet-21) # # 2.2. [tas](#bullet-22) # # 2.3. [pr](#bullet-22) # # 2.4. [ps / psl](#bullet-22) # # 2.5. [rlds](#bullet-22) # # 2.6. [sfcWind](#bullet-22) # # 2.7. [tasrange](#bullet-22) # # 2.8. [tasskew](#bullet-22) # # 2.9. [hurs](#bullet-22) # # 2.10. [rsds](#bullet-22) # # 2.11. [prsnratio](#bullet-22) # # 3. [Summary](#bullet-3) # # # **References:** # # - Lange, S. (2019). Trend-preserving bias adjustment and statistical downscaling with ISIMIP3BASD (v1.0). In Geoscientific Model Development (Vol. 12, Issue 7, pp. 3055–3070). Copernicus GmbH. https://doi.org/10.5194/gmd-12-3055-2019 # # In[1]: import iris import numpy as np import scipy.stats from cf_units import num2date import matplotlib.pyplot as plt # ## 1. ISIMIP3BASD v3.0.1 reference implementation # We pull v3.0.1 of the ISIMIP reference implementation -- the latest version and the version implemented in ibicus and unpack the code: # In[3]: get_ipython().system('wget https://zenodo.org/record/6758997/files/isimip3basd-master.tar.gz -c') get_ipython().system('tar -xf isimip3basd-master.tar.gz') # We can now let the ISIMIP reference code run through. # # The command below is the standard call for ISIMIP debiasing provided `application_example.sh` inside the unpacked `isimip3basd-master.tar.gz`. Three small modifications were made: # - The paths to `bias_adjustment.py` and to the data (in `-o`, `-s`, `-f` and `-b`) were adjusted. # - A flag `--n-quantiles 15000` was added to ensure a high number of quantiles for comparison: this is slower but more exact. With a lower number of quantiles differences between the ibicus implementation and the ISIMIP reference get bigger, due to the ISIMIP linear interpolation being not exact. Those are not directly used, but the number gets reduced to a high isimip default. # - A flag `2> isimip_output.txt` was added to write `stderr` to file. # In[4]: get_ipython().system('python -u isimip3basd-master/code/bias_adjustment.py --n-quantiles 15000 --n-processes 5 --randomization-seed 0 --step-size 1 -v hurs,pr,prsnratio,ps,rlds,rsds,sfcWind,tas,tasrange,tasskew --lower-bound 0,0,0,,,0,0,,0,0 --lower-threshold .01,.0000011574,.0001,,,.0001,.01,,.01,.0001 --upper-bound 100,,1,,,1,,,,1 --upper-threshold 99.99,,.9999,,,.9999,,,,.9999 --distribution ,gamma,,normal,normal,,weibull,normal,weibull, -t bounded,mixed,bounded,additive,additive,bounded,mixed,additive,mixed,bounded --unconditional-ccs-transfer 1,,,,,,,,, --trendless-bound-frequency 1,,,,,,,,, -d ,,,1,1,,,1,, -w 0,0,0,0,0,15,0,0,0,0 --if-all-invalid-use ,,0.,,,,,,, -o isimip3basd-master/data/hurs_obs-hist_coarse_1979-2014.nc,isimip3basd-master/data/pr_obs-hist_coarse_1979-2014.nc,isimip3basd-master/data/prsnratio_obs-hist_coarse_1979-2014.nc,isimip3basd-master/data/ps_obs-hist_coarse_1979-2014.nc,isimip3basd-master/data/rlds_obs-hist_coarse_1979-2014.nc,isimip3basd-master/data/rsds_obs-hist_coarse_1979-2014.nc,isimip3basd-master/data/sfcWind_obs-hist_coarse_1979-2014.nc,isimip3basd-master/data/tas_obs-hist_coarse_1979-2014.nc,isimip3basd-master/data/tasrange_obs-hist_coarse_1979-2014.nc,isimip3basd-master/data/tasskew_obs-hist_coarse_1979-2014.nc -s isimip3basd-master/data/hurs_sim-hist_coarse_1979-2014.nc,isimip3basd-master/data/pr_sim-hist_coarse_1979-2014.nc,isimip3basd-master/data/prsnratio_sim-hist_coarse_1979-2014.nc,isimip3basd-master/data/ps_sim-hist_coarse_1979-2014.nc,isimip3basd-master/data/rlds_sim-hist_coarse_1979-2014.nc,isimip3basd-master/data/rsds_sim-hist_coarse_1979-2014.nc,isimip3basd-master/data/sfcWind_sim-hist_coarse_1979-2014.nc,isimip3basd-master/data/tas_sim-hist_coarse_1979-2014.nc,isimip3basd-master/data/tasrange_sim-hist_coarse_1979-2014.nc,isimip3basd-master/data/tasskew_sim-hist_coarse_1979-2014.nc -f isimip3basd-master/data/hurs_sim-fut_coarse_2065-2100.nc,isimip3basd-master/data/pr_sim-fut_coarse_2065-2100.nc,isimip3basd-master/data/prsnratio_sim-fut_coarse_2065-2100.nc,isimip3basd-master/data/ps_sim-fut_coarse_2065-2100.nc,isimip3basd-master/data/rlds_sim-fut_coarse_2065-2100.nc,isimip3basd-master/data/rsds_sim-fut_coarse_2065-2100.nc,isimip3basd-master/data/sfcWind_sim-fut_coarse_2065-2100.nc,isimip3basd-master/data/tas_sim-fut_coarse_2065-2100.nc,isimip3basd-master/data/tasrange_sim-fut_coarse_2065-2100.nc,isimip3basd-master/data/tasskew_sim-fut_coarse_2065-2100.nc -b isimip3basd-master/data/hurs_sim-fut-basd_coarse_2065-2100.nc,isimip3basd-master/data/pr_sim-fut-basd_coarse_2065-2100.nc,isimip3basd-master/data/prsnratio_sim-fut-basd_coarse_2065-2100.nc,isimip3basd-master/data/ps_sim-fut-basd_coarse_2065-2100.nc,isimip3basd-master/data/rlds_sim-fut-basd_coarse_2065-2100.nc,isimip3basd-master/data/rsds_sim-fut-basd_coarse_2065-2100.nc,isimip3basd-master/data/sfcWind_sim-fut-basd_coarse_2065-2100.nc,isimip3basd-master/data/tas_sim-fut-basd_coarse_2065-2100.nc,isimip3basd-master/data/tasrange_sim-fut-basd_coarse_2065-2100.nc,isimip3basd-master/data/tasskew_sim-fut-basd_coarse_2065-2100.nc 2> isimip_output.txt') # A number of warnings are generated and written to `isimip_output.txt` indicating that the number of quantiles get reduced to a high ISIMIP default. This is not a problem for the values. # ## 2. ibicus implementation # After having executed the reference implementation we can now come to the ibicus implementation. Let' s import it: # In[5]: from ibicus.debias import ISIMIP # ### 2.1. Helpers # We first define some helpers to read in the ISIMIP-testing and debiased data and gets the dates (last coordinate). Those are needed for the debiasing below. # In[6]: # Given an iris-cube this returns the dates stored in the last time-dimension def get_dates(x): time_dimension = x.coords()[2] dates = time_dimension.units.num2date(time_dimension.points) return dates get_dates = np.vectorize(get_dates) # This reads in the testing-data from ISIMIP stored in isimip3basd-master/data def read_in_and_preprocess_isimip_testing_data_with_dates(variable, isimip_data_path = "isimip3basd-master/data/"): # Load in data obs = iris.load_cube(isimip_data_path+variable+"_obs-hist_coarse_1979-2014.nc") cm_hist = iris.load_cube(isimip_data_path+variable+"_sim-hist_coarse_1979-2014.nc") cm_future = iris.load_cube(isimip_data_path+variable+"_sim-fut_coarse_2065-2100.nc") # Extract dates dates = { "time_obs": get_dates(obs), "time_cm_hist": get_dates(cm_hist), "time_cm_future": get_dates(cm_future) } # Convert to np.array (from masked-array) obs = obs.data cm_hist = cm_hist.data cm_future = cm_future.data # Move time to first axis (our convention) obs = np.moveaxis(obs, -1, 0) cm_hist = np.moveaxis(cm_hist, -1, 0) cm_future = np.moveaxis(cm_future, -1, 0) return obs, cm_hist, cm_future, dates def read_in_debiased_testing_data(variable, isimip_data_path = "isimip3basd-master/data/"): # Load in data debiased_data = iris.load_cube(isimip_data_path+variable+"_sim-fut-basd_coarse_2065-2100.nc") # Move time to first axis (our convention) debiased_data = np.array(debiased_data.data) debiased_data = np.moveaxis(debiased_data, -1, 0) return debiased_data # ### 2.2. tas # Run the ibicus ISIMIP implementation: # In[8]: variable = "tas" obs, cm_hist, cm_future, dates = read_in_and_preprocess_isimip_testing_data_with_dates(variable) debiaser = ISIMIP.from_variable(variable) debiased_values = debiaser.apply(obs, cm_hist, cm_future, **dates) # Compare: # In[9]: debiased_values_isimip = read_in_debiased_testing_data(variable) pct_agreement = np.sum(np.isclose(debiased_values,debiased_values_isimip))/debiased_values.size print("Percentage agreement between the ibicus ISIMIP and the reference implementation is %s %%."%(pct_agreement*100)) # We can plot the ibicus and reference implementation over time at one location: # In[10]: min_time = 0 max_time = 100 lat = 1 lon = 1 time = np.arange(min_time, max_time) plt.plot(time, debiased_values_isimip[min_time:max_time, lat, lon], label = "ISIMIP reference implementation") plt.plot(time, debiased_values[min_time:max_time, lat, lon], label = "ibicus implementation", linestyle = ":") plt.plot(time, cm_future[min_time:max_time, lat, lon], label = "Raw CM") plt.legend() plt.show() # We see that the dotted ibicus implementation line lies right above the ISIMIP implementation line. # # We can also plot the values against eachother: # In[11]: min_all_values = np.min([np.min(debiased_values), np.min(debiased_values_isimip)]) max_all_values = np.max([np.max(debiased_values), np.max(debiased_values_isimip)]) plt.scatter(debiased_values_isimip.flatten(), debiased_values.flatten()) plt.plot([min_all_values, max_all_values], [min_all_values, max_all_values]) plt.show() # A linear regression also shows that the reference implementation values' and the ibicus ones are consistent: # In[12]: scipy.stats.linregress(debiased_values_isimip.flatten(), debiased_values.flatten()) # **We conclude that tas is reproduced well by the ibicus implementation of ISIMIP.** # ### 2.3. pr # Run the ibicus ISIMIP implementation: # In[13]: variable = "pr" obs, cm_hist, cm_future, dates = read_in_and_preprocess_isimip_testing_data_with_dates(variable) debiaser = ISIMIP.from_variable(variable) debiased_values = debiaser.apply(obs, cm_hist, cm_future, **dates) # Compare: # In[14]: debiased_values_isimip = read_in_debiased_testing_data(variable) pct_agreement = np.sum(np.abs(debiased_values-debiased_values_isimip) < 1e-6)/debiased_values.size print("%s %% of all values of the ibicus implementation are within 1e-6 distance of the reference implementation."%(pct_agreement*100)) # Plot over time: # In[15]: min_time = 0 max_time = 100 lat = 1 lon = 1 time = np.arange(min_time, max_time) plt.plot(time, debiased_values_isimip[min_time:max_time, lat, lon], label = "ISIMIP reference implementation") plt.plot(time, debiased_values[min_time:max_time, lat, lon], label = "ibicus implementation", linestyle = ":") plt.plot(time, cm_future[min_time:max_time, lat, lon], label = "Raw CM") plt.legend() plt.show() # Values of reference implementation against the ibicus one: # In[16]: min_all_values = np.min([np.min(debiased_values), np.min(debiased_values_isimip)]) max_all_values = np.max([np.max(debiased_values), np.max(debiased_values_isimip)]) plt.scatter(debiased_values_isimip.flatten(), debiased_values.flatten()) plt.plot([min_all_values, max_all_values], [min_all_values, max_all_values]) plt.show() # It seems that pr is reproduced quite well and the values of the reference and ibicus implementation are in agreement. This can also be checked through a linear regression: # In[17]: scipy.stats.linregress(debiased_values_isimip.flatten(), debiased_values.flatten()) # **pr is reproduced well by ibicus. Some slight differences larger than floating point error exist. This is due to:** # # - Randomization: pr includes some randomization between the defined bound and threshold. This can lead to differences. # # - The references implementation of nonparametric quantile mapping (preceding the parametric one in step 6), which uses linear interpolation, is inexact and differs from the ibicus implementation of nonparametric quantile mapping. This creates some differences. These decrease with the number of quantiles increasing, however they are slightly bigger than floating point error. # # - Accumulation of floating point errors in calculations. Especially floating point errors in the computation of quantiles can lead to slight numerical differences (larger than floating point) if those quantiles are mapped back to values. Similarly the distribution fits in step 6 are just slightly different (within floating point accuracy), meaning that the same values are mapped to slighty different ones (with difference potentially larger than floating point error) when transformed using an (inverse) CDF. # # # ### 2.4. ps / psl # Run the ibicus ISIMIP implementation: # In[18]: variable = "ps" obs, cm_hist, cm_future, dates = read_in_and_preprocess_isimip_testing_data_with_dates(variable) debiaser = ISIMIP.from_variable(variable) debiased_values = debiaser.apply(obs, cm_hist, cm_future, **dates) # Compare: # In[19]: debiased_values_isimip = read_in_debiased_testing_data("ps") pct_agreement = np.sum(np.isclose(debiased_values,debiased_values_isimip))/debiased_values.size print("Percentage agreement is %s %% between the ibicus and the reference implementation of ISIMIP."%(pct_agreement*100)) # Plot over time at location [1,1]: # In[20]: min_time = 0 max_time = 100 lat = 1 lon = 1 time = np.arange(min_time, max_time) plt.plot(time, debiased_values_isimip[min_time:max_time, lat, lon], label = "ISIMIP") plt.plot(time, debiased_values[min_time:max_time, lat, lon], label = "ibicus implementation", linestyle = ":") plt.plot(time, cm_future[min_time:max_time, lat, lon], label = "raw cm") plt.legend() plt.show() # Values of reference implementation against the ibicus one: # In[21]: min_all_values = np.min([np.min(debiased_values), np.min(debiased_values_isimip)]) max_all_values = np.max([np.max(debiased_values), np.max(debiased_values_isimip)]) plt.scatter(debiased_values_isimip.flatten(), debiased_values.flatten()) plt.plot([min_all_values, max_all_values], [min_all_values, max_all_values]) plt.show() # A linear regression also shows that the reference implementation values' and the ibicus ones are consistent: # In[22]: scipy.stats.linregress(debiased_values_isimip.flatten(), debiased_values.flatten()) # **ps / psl is well reproduced by ibicus.** # ### 2.5. rlds # Run the ibicus ISIMIP implementation: # In[23]: variable = "rlds" obs, cm_hist, cm_future, dates = read_in_and_preprocess_isimip_testing_data_with_dates(variable) debiaser = ISIMIP.from_variable(variable) debiased_values = debiaser.apply(obs, cm_hist, cm_future, **dates) # Compare: # In[24]: debiased_values_isimip = read_in_debiased_testing_data(variable) pct_agreement = np.sum(np.isclose(debiased_values,debiased_values_isimip))/debiased_values.size print("Percentage agreement is %s %% between the ibicus and the reference implementation of ISIMIP."%(pct_agreement*100)) # We can plot the values over time at location [1,1]: # In[25]: min_time = 0 max_time = 100 lat = 1 lon = 1 time = np.arange(min_time, max_time) plt.plot(time, debiased_values_isimip[min_time:max_time, lat, lon], label = "ISIMIP") plt.plot(time, debiased_values[min_time:max_time, lat, lon], label = "ibicus implementation", linestyle = ":") plt.plot(time, cm_future[min_time:max_time, lat, lon], label = "raw cm") plt.legend() plt.show() # Values of reference implementation against the ibicus one: # In[26]: min_all_values = np.min([np.min(debiased_values), np.min(debiased_values_isimip)]) max_all_values = np.max([np.max(debiased_values), np.max(debiased_values_isimip)]) plt.scatter(debiased_values_isimip.flatten(), debiased_values.flatten()) plt.plot([min_all_values, max_all_values], [min_all_values, max_all_values]) plt.show() # A linear regression also shows that the reference implementation values' and the ibicus ones are consistent: # In[27]: scipy.stats.linregress(debiased_values_isimip.flatten(), debiased_values.flatten()) # **rlds is well reproduced by ibicus.** # ### 2.6. sfcWind # Run the ibicus ISIMIP implementation: # In[28]: variable = "sfcWind" obs, cm_hist, cm_future, dates = read_in_and_preprocess_isimip_testing_data_with_dates(variable) debiaser = ISIMIP.from_variable(variable) debiased_values = debiaser.apply(obs, cm_hist, cm_future, **dates) # Compare: # In[29]: debiased_values_isimip = read_in_debiased_testing_data(variable) pct_agreement = np.sum(np.abs(debiased_values -debiased_values_isimip) < 1e-3)/debiased_values.size max_deviation = np.max(np.abs(debiased_values-debiased_values_isimip)) print(f"{pct_agreement*100}% of all values are within 1e-3. There is a maximum deviation of {max_deviation} which is {100*max_deviation/np.mean(debiased_values_isimip)}% of the average value") # We can plot the values over time at location [1,1]: # In[30]: min_time = 0 max_time = 100 lat = 1 lon = 1 time = np.arange(min_time, max_time) plt.plot(time, debiased_values_isimip[min_time:max_time, lat, lon], label = "ISIMIP") plt.plot(time, debiased_values[min_time:max_time, lat, lon], label = "ibicus implementation", linestyle = ":") plt.plot(time, cm_future[min_time:max_time, lat, lon], label = "raw cm") plt.legend() plt.show() # Values of reference implementation against ibicus ones: # In[31]: min_all_values = np.min([np.min(debiased_values), np.min(debiased_values_isimip)]) max_all_values = np.max([np.max(debiased_values), np.max(debiased_values_isimip)]) plt.scatter(debiased_values_isimip.flatten(), debiased_values.flatten()) plt.plot([min_all_values, max_all_values], [min_all_values, max_all_values]) plt.show() # A linear regression also shows that the reference implementation values' and the ibicus ones are consistent: # In[32]: scipy.stats.linregress(debiased_values_isimip.flatten(), debiased_values.flatten()) # **sfcWind is reproduced well by ibicus. Some differences larger than floating point error exist. This is due to:** # # - Randomization: sfcWind includes some randomization between lower bound and threshold. This can lead to differences. # # - The references implementation of nonparametric quantile mapping (preceding the parametric one in step 6), which uses linear interpolation, is inexact and differs from the ibicus ones. This creates some differences. These decrease with the number of quantiles increasing, however they are slightly bigger than floating point error. # # - Accumulation of floating point errors in calculations. Especially floating point errors in the computation of quantiles can lead to slight numerical differences (larger than floating point) if those quantiles are mapped back to values. Similarly the distribution fits in step 6 are just slightly different (within floating point accuracy), meaning that the same values are mapped to slighty different ones (with difference potentially larger than floating point error) if transformed with an (inverse) CDF. # # # ### 2.7. tasrange # Run the ibicus ISIMIP implementation: # In[33]: variable = "tasrange" obs, cm_hist, cm_future, dates = read_in_and_preprocess_isimip_testing_data_with_dates(variable) debiaser = ISIMIP.from_variable(variable) debiased_values = debiaser.apply(obs, cm_hist, cm_future, **dates) # Compare: # In[34]: debiased_values_isimip = read_in_debiased_testing_data(variable) pct_agreement = np.sum(np.abs(debiased_values -debiased_values_isimip) < 1e-3)/debiased_values.size max_deviation = np.max(np.abs(debiased_values-debiased_values_isimip)) print(f"{pct_agreement*100}% of all values are within 1e-3. There is a maximum deviation of {max_deviation} which is {100*max_deviation/np.mean(debiased_values_isimip)}% of the average value") # We can plot the values over time at location [1,1]: # In[35]: min_time = 0 max_time = 100 lat = 1 lon = 1 time = np.arange(min_time, max_time) plt.plot(time, debiased_values_isimip[min_time:max_time, lat, lon], label = "ISIMIP") plt.plot(time, debiased_values[min_time:max_time, lat, lon], label = "ibicus implementation", linestyle = ":") plt.plot(time, cm_future[min_time:max_time, lat, lon], label = "raw cm") plt.legend() plt.show() # Values of reference implementation against the ibicus ones: # In[36]: min_all_values = np.min([np.min(debiased_values), np.min(debiased_values_isimip)]) max_all_values = np.max([np.max(debiased_values), np.max(debiased_values_isimip)]) plt.scatter(debiased_values_isimip.flatten(), debiased_values.flatten()) plt.plot([min_all_values, max_all_values], [min_all_values, max_all_values]) plt.show() # A linear regression also shows that the reference implementation values' and the ibicus ones are consistent: # In[37]: scipy.stats.linregress(debiased_values_isimip.flatten(), debiased_values.flatten()) # **ibicus can reproduce tasrange well. Some differences larger than floating point error exist. This is due to:** # # - Randomization: tasrange includes some randomization between bound and threshold. This can lead to differences. # # - The references implementation of nonparametric quantile mapping (preceding the parametric one in step 6), which uses linear interpolation, is inexact and differs from the ibicus ones. This creates some differences. These decrease with the number of quantiles increasing, however they are slightly bigger than floating point error. # # - Accumulation of floating point errors in calculations. Especially floating point errors in the computation of quantiles can lead to slight numerical differences (larger than floating point) if those quantiles are mapped back to values. Similarly the distribution fits in step 6 are just slightly different (within floating point accuracy), meaning that the same values are mapped to slighty different ones (with difference potentially larger than floating point error) if transformed with an (inverse) CDF. # # # ### 2.8. tasskew # Run the ibicus ISIMIP implementation: # In[38]: variable = "tasskew" obs, cm_hist, cm_future, dates = read_in_and_preprocess_isimip_testing_data_with_dates(variable) debiaser = ISIMIP.from_variable(variable) debiased_values = debiaser.apply(obs, cm_hist, cm_future, **dates) # Compare: # In[39]: debiased_values_isimip = read_in_debiased_testing_data(variable) pct_agreement = np.sum(np.abs(debiased_values -debiased_values_isimip) < 1e-3)/debiased_values.size max_deviation = np.max(np.abs(debiased_values-debiased_values_isimip)) print(f"{pct_agreement*100}% of all values are within 1e-3. There is a maximum deviation of {max_deviation} which is {100*max_deviation/np.mean(debiased_values_isimip)}% of the average value") # We can plot the values over time at location [1,1]: # In[40]: min_time = 0 max_time = 100 lat = 1 lon = 1 time = np.arange(min_time, max_time) plt.plot(time, debiased_values_isimip[min_time:max_time, lat, lon], label = "ISIMIP") plt.plot(time, debiased_values[min_time:max_time, lat, lon], label = "ibicus implementation", linestyle = ":") plt.plot(time, cm_future[min_time:max_time, lat, lon], label = "raw cm") plt.legend() plt.show() # Values of reference implementation against ibicus ones: # In[41]: min_all_values = np.min([np.min(debiased_values), np.min(debiased_values_isimip)]) max_all_values = np.max([np.max(debiased_values), np.max(debiased_values_isimip)]) plt.scatter(debiased_values_isimip.flatten(), debiased_values.flatten()) plt.plot([min_all_values, max_all_values], [min_all_values, max_all_values]) plt.show() # Linear Regression: # In[42]: scipy.stats.linregress(debiased_values_isimip.flatten(), debiased_values.flatten()) # **tasskew is reproduced well by ibicus. Some differences larger than floating point error exist. This is due to:** # # - Randomization: tasrange includes some randomization between both upper and lower bound and threshold. This can lead to differences. # # - The references implementation of nonparametric quantile mapping using linear interpolation is inexact and differs from the ibicus ones. This creates some differences. These decrease with the number of quantiles increasing, however they are slightly bigger than floating point error. # # - Accumulation of floating point errors in calculations. Especially floating point errors in the computation of quantiles can lead to slight numerical differences (larger than floating point) if those quantiles are mapped back to values. # # # ### 2.9. hurs # Run the ibicus ISIMIP implementation: # In[43]: variable = "hurs" obs, cm_hist, cm_future, dates = read_in_and_preprocess_isimip_testing_data_with_dates(variable) debiaser = ISIMIP.from_variable(variable) debiased_values = debiaser.apply(obs, cm_hist, cm_future, **dates) # Compare: # In[44]: debiased_values_isimip = read_in_debiased_testing_data(variable) pct_agreement = np.sum(np.abs(debiased_values -debiased_values_isimip) < 1e-2)/debiased_values.size max_deviation = np.max(np.abs(debiased_values-debiased_values_isimip)) print(f"{pct_agreement*100}% of all values are within 1e-2 with a maximum deviation of {max_deviation} which is {np.round(100*max_deviation/np.mean(debiased_values_isimip), 4)}% of the average value") # We can plot the values over time at location [1,1]: # In[45]: min_time = 0 max_time = 100 lat = 1 lon = 1 time = np.arange(min_time, max_time) plt.plot(time, debiased_values_isimip[min_time:max_time, lat, lon], label = "ISIMIP") plt.plot(time, debiased_values[min_time:max_time, lat, lon], label = "ibicus implementation", linestyle = ":") plt.plot(time, cm_future[min_time:max_time, lat, lon], label = "raw cm") plt.legend() plt.show() # Values of reference implementation against ibicus ones: # In[46]: min_all_values = np.min([np.min(debiased_values), np.min(debiased_values_isimip)]) max_all_values = np.max([np.max(debiased_values), np.max(debiased_values_isimip)]) plt.scatter(debiased_values_isimip.flatten(), debiased_values.flatten()) plt.plot([min_all_values, max_all_values], [min_all_values, max_all_values]) plt.show() # Linear Regression: # In[47]: scipy.stats.linregress(debiased_values_isimip.flatten(), debiased_values.flatten()) # **hurs is reproduced well by ibicus. Some differences larger than floating point error exist. This is due to:** # # - Randomization: as other variables hurs includes some randomization between both upper and lower bound and threshold. This can lead to differences. # # - The references implementation of nonparametric quantile mapping, which uses linear interpolation, is inexact and differs from the ibicus ones. This creates some differences. These decrease with the number of quantiles increasing, however they are slightly bigger than floating point error. # # - Accumulation of floating point errors in calculations. Especially floating point errors in the computation of quantiles can lead to slight numerical differences (larger than floating point) if those quantiles are mapped back to values. # # # ### 2.10. rsds # Run the ibicus ISIMIP implementation: # In[48]: variable = "rsds" obs, cm_hist, cm_future, dates = read_in_and_preprocess_isimip_testing_data_with_dates(variable) debiaser = ISIMIP.from_variable(variable) debiased_values = debiaser.apply(obs, cm_hist, cm_future, **dates) # Compare: # In[49]: debiased_values_isimip = read_in_debiased_testing_data(variable) pct_agreement = np.sum(np.abs(debiased_values -debiased_values_isimip) < 0.1)/debiased_values.size max_deviation = np.max(np.abs(debiased_values-debiased_values_isimip)) print(f"{pct_agreement*100}% of all values are within 0.1 with a maximum deviation of {max_deviation} which is {100*max_deviation/np.mean(debiased_values_isimip)}% of the average value") # We can plot the values over time at location [1,1]: # In[50]: min_time = 0 max_time = 100 lat = 1 lon = 1 time = np.arange(min_time, max_time) plt.plot(time, debiased_values_isimip[min_time:max_time, lat, lon], label = "ISIMIP") plt.plot(time, debiased_values[min_time:max_time, lat, lon], label = "ibicus implementation", linestyle = ":") plt.plot(time, cm_future[min_time:max_time, lat, lon], label = "raw cm") plt.legend() plt.show() # Values of reference implementation against ibicus ones: # In[51]: min_all_values = np.min([np.min(debiased_values), np.min(debiased_values_isimip)]) max_all_values = np.max([np.max(debiased_values), np.max(debiased_values_isimip)]) plt.scatter(debiased_values_isimip.flatten(), debiased_values.flatten()) plt.plot([min_all_values, max_all_values], [min_all_values, max_all_values]) plt.show() # Linear Regression: # In[52]: scipy.stats.linregress(debiased_values_isimip.flatten(), debiased_values.flatten()) # **ibicus reproduces rsds well. Some differences larger than floating point error exist. This is due to:** # # - Randomization: rsds includes some randomization between both upper and lower bound and threshold. This can lead to differences. # # - The references implementation of nonparametric quantile mapping, which uses linear interpolation, is inexact and differs from the ibicus ones. This creates some differences. These decrease with the number of quantiles increasing, however they are slightly bigger than floating point error. # # - Accumulation of floating point errors in calculations. Especially floating point errors in the computation of quantiles can lead to slight numerical differences (larger than floating point) if those quantiles are mapped back to values. # # # ### 2.11. prsnratio # Run the ibicus ISIMIP implementation: # In[53]: variable = "prsnratio" obs, cm_hist, cm_future, dates = read_in_and_preprocess_isimip_testing_data_with_dates(variable) debiaser = ISIMIP.from_variable(variable) debiased_values = debiaser.apply(obs, cm_hist, cm_future, **dates) # Compare: # In[54]: debiased_values_isimip = read_in_debiased_testing_data(variable) pct_agreement = np.sum(np.isclose(debiased_values,debiased_values_isimip))/debiased_values.size max_deviation = np.max(np.abs(debiased_values-debiased_values_isimip)) print("Percentage agreement is %s with max deviation %s"%(pct_agreement*100, max_deviation)) # We can plot the values over time at location [1,1]: # In[55]: min_time = 0 max_time = 100 lat = 1 lon = 1 time = np.arange(min_time, max_time) plt.plot(time, debiased_values_isimip[min_time:max_time, lat, lon], label = "ISIMIP") plt.plot(time, debiased_values[min_time:max_time, lat, lon], label = "ibicus implementation", linestyle = ":") plt.plot(time, cm_future[min_time:max_time, lat, lon], label = "raw cm") plt.legend() plt.show() # Values of reference implementation against ibicus ones: # In[56]: min_all_values = np.min([np.min(debiased_values), np.min(debiased_values_isimip)]) max_all_values = np.max([np.max(debiased_values), np.max(debiased_values_isimip)]) plt.scatter(debiased_values_isimip.flatten(), debiased_values.flatten()) plt.plot([min_all_values, max_all_values], [min_all_values, max_all_values]) plt.show() # Linear Regression: # In[57]: scipy.stats.linregress(debiased_values_isimip.flatten(), debiased_values.flatten()) # prsnratio includes a large amount of randomization: the metric is not defined on any day without rain (given that it is the ratio of prsn over pr) and in step 2 of the ISIMIP algorithm these values where the metric is not defined are filled with draws from all other available values in a given window (in all three: obs, cm_hist, cm_future). This randomization and random draws are not exactly reproducible using ibicus and they influence all other computed values: the transfer of the climate change signal in step 5, the computation which values are set to bounds and the nonparametric quantile mapping in step 6. However the values of ibicus and the reference implementation nevertheless seem in decent agreement and special care was taken to ensure that the outputs of all steps not including randomzation do agree. # ## 3. Summary # # # The results of the ibicus and reference implementation of ISIMIP agree. Some smaller differences do exist -- at points slightly larger than floating point error, in particular for variables that involve randomization. However, these differences are within the small numerical differences that are to be expected in two different implementations of an algorithm involving a large variety of steps. # # The results above also only serve as last "sanity check" and demonstration for the ibicus implementation of ISIMIP. The authors of the software package took special care that the results do also agree in all intermediate steps. Differences seen in the outputs here were therefore shown to be due to the differences named above (in the variables).