This notebook demonstrates ModelSkill on a larger dataset containing more than 9 million satellite track observation points.
Note: requires running download.ipynb first!
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import modelskill as ms
from matplotlib_inline.backend_inline import set_matplotlib_formats
set_matplotlib_formats('png')
Run the download.ipynb first
fn = '../data/SW_gwm_3a_extracted_2018.dfs0'
mr = ms.model_result(fn, name='GWM', item='Sign. Wave Height', gtype="track")
mr
<TrackModelResult> 'GWM' (n_points: 9141126)
o1 = ms.TrackObservation('../data/altimetry_3a_2018_filter1.dfs0', item=2, name='3a')
cmp = ms.match(o1, mr)
cmp.sel(end='2018-1-15').skill()
observation | n | bias | rmse | urmse | mae | cc | si | r2
---|---|---|---|---|---|---|---|---
3a | 372356 | -0.475229 | 0.633093 | 0.418287 | 0.510757 | 0.940399 | 0.124015 | 0.72003
cmp.skill()
observation | n | bias | rmse | urmse | mae | cc | si | r2
---|---|---|---|---|---|---|---|---
3a | 9105388 | -0.489389 | 0.646012 | 0.421699 | 0.520279 | 0.943032 | 0.122967 | 0.720777
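The bias, rmse and urmse columns above are linked: with the conventional definitions (residual = model minus observation, urmse = RMSE of the bias-corrected residual), rmse² = bias² + urmse². The tabulated values are consistent with this, e.g. 0.489389² + 0.421699² ≈ 0.646012². The following is a minimal sketch on synthetic data, assuming those conventional definitions; it does not call ModelSkill, and the arrays `obs` and `mod` are hypothetical stand-ins for the matched values.

```python
import numpy as np

# Synthetic stand-ins for matched observation/model values (not the notebook data).
rng = np.random.default_rng(0)
obs = rng.uniform(0.5, 6.0, size=1000)
mod = obs - 0.5 + rng.normal(0.0, 0.4, size=1000)  # model underestimates by about 0.5 m

residual = mod - obs
bias = residual.mean()                              # mean error
rmse = np.sqrt(np.mean(residual**2))                # root-mean-square error
urmse = np.sqrt(np.mean((residual - bias) ** 2))    # RMSE after removing the bias

# The squared RMSE splits into a systematic and an unsystematic part.
print(bias, rmse, urmse, np.isclose(rmse**2, bias**2 + urmse**2))
```

Read this way, the table says that a large share of the error is a systematic underestimation rather than scatter.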
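Gridded skill with 1-degree bins. The longitude and latitude bin edges are passed explicitly, and n_min=20 sets the minimum number of observations per cell.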
gs = cmp.gridded_skill(metrics=['bias'], bins=(np.arange(-180,180,1), np.arange(-90,90,1)), n_min=20)
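To make the idea of a gridded metric concrete, here is a rough sketch of per-cell bias computed with plain pandas on synthetic points. It is not ModelSkill's implementation; the column names (`obs`, `mod`, `xbin`, `ybin`) and the coarse 30-degree cells are assumptions chosen for illustration. One detail worth knowing when building edges: `np.arange` excludes its stop value, so `np.arange(-180, 180, 1)` ends at 179. The sketch uses stop values of 181 and 91 so that the last edges sit at 180 and 90.

```python
import numpy as np
import pandas as pd

# Synthetic track points (hypothetical values, not the notebook data).
rng = np.random.default_rng(1)
n = 5000
df = pd.DataFrame({
    "x": rng.uniform(-180, 180, n),
    "y": rng.uniform(-90, 90, n),
    "obs": rng.uniform(0.5, 6.0, n),
})
df["mod"] = df["obs"] - 0.5  # constant underestimation, so every cell has bias -0.5

# Bin edges including the upper limits 180 and 90.
x_edges = np.arange(-180, 181, 30)
y_edges = np.arange(-90, 91, 30)

df["xbin"] = pd.cut(df["x"], x_edges, include_lowest=True)
df["ybin"] = pd.cut(df["y"], y_edges, include_lowest=True)

# Count and mean residual per grid cell.
cells = (
    df.assign(res=df["mod"] - df["obs"])
      .groupby(["ybin", "xbin"], observed=True)["res"]
      .agg(n="count", bias="mean")
)

# Mask cells with too few points, mirroring the role of n_min.
n_min = 20
cells.loc[cells["n"] < n_min, "bias"] = np.nan
print(cells.head())
```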
Add attrs and plot
type(gs)
modelskill.skill_grid.SkillGrid
gs.bias.data.attrs = dict(long_name="Bias of significant wave height, Hm0",units="m")
gs.n.data.attrs = dict(long_name="N of significant wave height",units="-")
fig, axes = plt.subplots(ncols=1, nrows=2, figsize = (8, 10))
gs.n.plot(ax=axes[0])
gs.bias.plot(ax=axes[1]);
Convert the comparer data to a DataFrame (all_df), derive a custom category column from it, and add that column back to the comparer's data.
all_df = cmp.data.to_dataframe()
mean_val = all_df[['GWM','Observation']].mean(axis=1)
all_df['val_cat'] = pd.cut(mean_val, [0,2,5,np.inf], right=False, labels=["Hm0[m]=[0, 2)","Hm0[m]=[2, 5)","Hm0[m]=[5, inf)"])
all_df.head()
time | x | y | Observation | z | GWM | val_cat
---|---|---|---|---|---|---
2018-01-01 00:00:00 | -33.706020 | 23.181158 | 2.611 | NaN | 2.292599 | Hm0[m]=[2, 5)
2018-01-01 00:00:01 | -33.720741 | 23.240074 | 2.608 | NaN | 2.292612 | Hm0[m]=[2, 5)
2018-01-01 00:00:02 | -33.735474 | 23.298990 | 2.518 | NaN | 2.292624 | Hm0[m]=[2, 5)
2018-01-01 00:00:03 | -33.750214 | 23.357904 | 2.729 | NaN | 2.292637 | Hm0[m]=[2, 5)
2018-01-01 00:00:04 | -33.764965 | 23.416819 | 2.593 | NaN | 2.292650 | Hm0[m]=[2, 5)
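The category labels describe half-open intervals that are closed on the left. `pd.cut` closes intervals on the right by default (`right=True`) and on the left with `right=False`, so values that fall exactly on an edge can land in different categories. A small sketch with a few hand-picked values (hypothetical, not taken from the dataset) shows the difference:

```python
import numpy as np
import pandas as pd

edges = [0, 2, 5, np.inf]
labels = ["Hm0[m]=[0, 2)", "Hm0[m]=[2, 5)", "Hm0[m]=[5, inf)"]
values = pd.Series([0.0, 1.9, 2.0, 4.9, 5.0, 7.3])

# Default: intervals (0, 2], (2, 5], (5, inf]; a value of exactly 0 gets no category.
right_closed = pd.cut(values, edges, labels=labels)

# right=False: intervals [0, 2), [2, 5), [5, inf), matching the label text.
left_closed = pd.cut(values, edges, labels=labels, right=False)

print(pd.DataFrame({"value": values,
                    "right=True": right_closed,
                    "right=False": left_closed}))
```

For continuous wave heights, exact edge hits are rare, so the choice mostly matters for keeping labels and binning consistent.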
cmp.data["val_cat"] = all_df["val_cat"]
cmp.data
<xarray.Dataset>
Dimensions:      (time: 9105388)
Coordinates:
  * time         (time) datetime64[ns] 2018-01-01 ... 2018-12-30T23:59:59
    x            (time) float64 -33.71 -33.72 -33.74 ... 153.7 153.7 153.7
    y            (time) float64 23.18 23.24 23.3 23.36 ... 23.19 23.13 23.07
    z            float64 nan
Data variables:
    Observation  (time) float64 2.611 2.608 2.518 2.729 ... 3.572 3.505 3.364
    GWM          (time) float64 2.293 2.293 2.293 2.293 ... 2.693 2.693 2.693
    val_cat      (time) object 'Hm0[m]=[2, 5)' ... 'Hm0[m]=[2, 5)'
Attributes:
    gtype:               track
    modelskill_version:  1.0.dev23
    weight:              1.0
    name:                3a
gs = cmp.gridded_skill(by=["val_cat"], metrics=["bias"], bins=(np.arange(-180,180,5), np.arange(-90,90,5)), n_min=20)
gs.data['bias'].attrs = dict(long_name="Bias of significant wave height, Hm0", units="m")
gs.data['n'].attrs = dict(long_name="N of significant wave height", units="-")
gs.data['val_cat'].attrs = dict(long_name="Range of sign. wave height, Hm0", units="m")
gs.n.plot(figsize=(12,4));
gs.bias.plot(figsize=(12,4));