Metocean track comparison

Comparing a MIKE 21 HD dfsu model result with satellite track observations of surface elevation.

This notebook also includes gridded spatial skill assessments.

In [1]:

import pandas as pd
import matplotlib.pyplot as plt
from matplotlib_inline.backend_inline import set_matplotlib_formats
import modelskill as ms
set_matplotlib_formats('png')

Extract track data

In [2]:

fn = '../tests/testdata/NorthSeaHD_and_windspeed.dfsu'
mr = ms.model_result(fn, name='HD', item=0)
mr

Out[2]:

<DfsuModelResult> 'HD'

In this case, the track observations are stored in a csv file, which we can read in using pandas. Any file format that can be read into a pandas dataframe can be used here.

In [3]:

fn = '../tests/testdata/altimetry_NorthSea_20171027.csv'
df = pd.read_csv(fn, index_col=0, parse_dates=True)

In [4]:

df.head()

Out[4]:

	lon	lat	surface_elevation	significant_wave_height	wind_speed
date
2017-10-26 04:37:37	8.757272	53.926136	1.6449	0.426	6.100000
2017-10-26 04:37:54	8.221631	54.948459	1.1200	1.634	9.030000
2017-10-26 04:37:55	8.189390	55.008547	1.0882	1.717	9.370000
2017-10-26 04:37:56	8.157065	55.068627	1.0309	1.869	9.559999
2017-10-26 04:37:58	8.124656	55.128700	1.0369	1.939	9.980000

In [5]:

mr.quantity

Out[5]:

Quantity(name='Surface Elevation', unit='m')

In [6]:

# The dataframe carries no metadata about which quantity it contains,
# so we add it manually to match the model result's quantity.
o1 = ms.TrackObservation(
    df,
    item="surface_elevation",
    name='alti',
    quantity=ms.Quantity(name="Surface Elevation", unit="meter"),
)
o1

C:\Users\jem\Source\modelskill\modelskill\timeseries\_track.py:135: UserWarning: Removed 22 duplicate timestamps with keep=first
  warnings.warn(

Out[6]:

TrackObservation: alti, n=1093
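The warning above reports that rows with repeated timestamps were dropped with keep=first. A minimal pure-Python sketch of that rule follows, using made-up rows and a hypothetical helper `drop_duplicate_times`; it illustrates the idea only and is not modelskill's implementation.

```python
from datetime import datetime

# Made-up track rows: (timestamp, lon, lat, surface_elevation).
rows = [
    (datetime(2017, 10, 26, 4, 37, 37), 8.76, 53.93, 1.64),
    (datetime(2017, 10, 26, 4, 37, 54), 8.22, 54.95, 1.12),
    (datetime(2017, 10, 26, 4, 37, 54), 8.21, 54.96, 1.10),  # repeated timestamp
    (datetime(2017, 10, 26, 4, 37, 55), 8.19, 55.01, 1.09),
]


def drop_duplicate_times(rows):
    """Keep only the first row seen for each timestamp (hypothetical helper)."""
    seen = set()
    kept = []
    for row in rows:
        timestamp = row[0]
        if timestamp in seen:
            continue  # a later row with the same timestamp is discarded
        seen.add(timestamp)
        kept.append(row)
    return kept


unique_rows = drop_duplicate_times(rows)
n_removed = len(rows) - len(unique_rows)
print(f"Removed {n_removed} duplicate timestamps, {len(unique_rows)} rows left")
```

On a pandas dataframe indexed by time, the same selection can be written as `df[~df.index.duplicated(keep="first")]`.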

In [7]:

ms.plotting.spatial_overview(o1, mr)

Out[7]:

<Axes: title={'center': 'Spatial coverage'}>

In [8]:

cmp = ms.match(o1, mr)
cmp

Out[8]:

<Comparer>
Quantity: Surface Elevation [meter]
Observation: alti, n_points=532
 Model: HD, rmse=0.115

In [9]:

cmp.data

Out[9]:

<xarray.Dataset>
Dimensions:      (time: 532)
Coordinates:
  * time         (time) datetime64[ns] 2017-10-27T10:45:19 ... 2017-10-29T13:...
    x            (time) float64 1.262 1.231 1.2 1.168 ... 6.908 6.971 7.034
    y            (time) float64 55.3 55.24 55.18 55.13 ... 55.24 55.28 55.32
    z            float64 nan
Data variables:
    Observation  (time) float64 0.3778 0.4375 0.4489 ... 0.8562 0.8368 0.8218
    HD           (time) float32 0.3699 0.356 0.3559 ... 0.7068 0.7068 0.685
Attributes:
    gtype:               track
    modelskill_version:  1.0.dev23
    weight:              1.0
    name:                alti
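As a rough illustration of what the reported skill numbers measure, the sketch below computes bias and RMSE for the first three matched pairs displayed above. It assumes the usual definitions with residuals taken as model minus observation; the full-track value (rmse=0.115) uses all 532 points, so the numbers here will differ.

```python
import math

# First three matched pairs as displayed in cmp.data above (rounded values).
observation = [0.3778, 0.4375, 0.4489]
model = [0.3699, 0.356, 0.3559]

# Residuals are taken as model minus observation (assumed sign convention).
residuals = [m - o for m, o in zip(model, observation)]

bias = sum(residuals) / len(residuals)
rmse = math.sqrt(sum(r * r for r in residuals) / len(residuals))

print(f"bias={bias:.4f}, rmse={rmse:.4f}")
```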

In [10]:

cmp.plot.scatter();

Extract track from dfs0

Here the model result is a dfs0 file containing a track that has already been extracted from the model.

In [11]:

fn = '../tests/testdata/NorthSeaHD_extracted_track.dfs0'
mr = ms.TrackModelResult(fn, name='HD', item=2)  # explicitly define type as Track
mr.data

In [12]:

mr

Out[12]:

<TrackModelResult> 'HD' (n_points: 1093)

In [13]:

fn = '../tests/testdata/altimetry_NorthSea_20171027.csv'
df = pd.read_csv(fn, index_col=0, parse_dates=True)
o1 = ms.TrackObservation(df, item=2, name='alti')

C:\Users\jem\Source\modelskill\modelskill\timeseries\_track.py:135: UserWarning: Removed 22 duplicate timestamps with keep=first
  warnings.warn(

In [14]:

o1.data

In [15]:

cmp = ms.match(o1, mr)

In [16]:

cmp.plot.scatter();

Gridded skill

Load the model result and the observation, then match them to extract model values along the track.

In [17]:

fn = '../tests/testdata/NorthSeaHD_and_windspeed.dfsu'
mr = ms.model_result(fn, name='HD', item=0)
fn = '../tests/testdata/altimetry_NorthSea_20171027.csv'
df = pd.read_csv(fn, index_col=0, parse_dates=True)
o1 = ms.TrackObservation(df, item=2, name='alti')
cmp = ms.match(o1, mr)
cmp

C:\Users\jem\Source\modelskill\modelskill\timeseries\_track.py:135: UserWarning: Removed 22 duplicate timestamps with keep=first
  warnings.warn(

Out[17]:

<Comparer>
Quantity:  []
Observation: alti, n_points=532
 Model: HD, rmse=0.115

Compute metrics binned on a regular spatial grid; the result holds the values as an xarray Dataset.

In [18]:

gs = cmp.gridded_skill(metrics=['bias'])

In [19]:

gs['n'].data
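Conceptually, gridded skill assigns each matched point to a cell of a regular lon/lat grid and aggregates the residuals per cell. The sketch below shows that idea in pure Python. The helper `grid_bias`, the 1-degree cell size, and the points are all assumptions made for illustration; this is not modelskill's implementation.

```python
import math
from collections import defaultdict

# Made-up matched points: (lon, lat, observation, model).
points = [
    (1.2, 55.3, 0.38, 0.37),
    (1.7, 55.1, 0.44, 0.36),
    (2.4, 55.6, 0.45, 0.40),
    (6.9, 55.2, 0.86, 0.71),
]


def grid_bias(points, cell_size=1.0):
    """Return {(x_bin, y_bin): (n, bias)} on a regular grid (hypothetical helper)."""
    residuals = defaultdict(list)
    for lon, lat, obs, mod in points:
        # Index of the cell containing the point; cell edges are multiples of cell_size.
        cell = (math.floor(lon / cell_size), math.floor(lat / cell_size))
        residuals[cell].append(mod - obs)
    return {cell: (len(r), sum(r) / len(r)) for cell, r in residuals.items()}


for cell, (n, bias) in sorted(grid_bias(points).items()):
    print(f"cell={cell} n={n} bias={bias:+.3f}")
```

Each cell carries both the count `n` and the metric, which mirrors the `n` and `bias` fields used in this notebook.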

Plot using xarray - convenient methods coming soon!

In [20]:

fig, axes = plt.subplots(ncols=2, nrows=1, figsize=(10, 5))
gs.n.plot(ax=axes[0])
gs.bias.plot(ax=axes[1]);

Minimum number of observations

In [21]:

gs = cmp.gridded_skill(metrics=['bias'], n_min=25)
fig, axes = plt.subplots(ncols=2, nrows=1, figsize=(10, 5))
gs.n.plot(ax=axes[0])
gs.bias.plot(ax=axes[1]);
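The effect of `n_min` can be pictured as a mask over the per-cell results: cells with too few points keep their count but lose their metric value. The sketch below assumes such cells are reported as NaN and uses a hypothetical helper `mask_sparse_cells` with made-up cell values.

```python
import math

# Made-up per-cell results: cell index -> (number of points, bias).
cells = {(1, 55): (40, -0.05), (2, 55): (12, 0.10), (6, 55): (31, -0.15)}


def mask_sparse_cells(cells, n_min):
    """Replace the bias by NaN where a cell has fewer than n_min points."""
    return {
        cell: (n, bias if n >= n_min else math.nan)
        for cell, (n, bias) in cells.items()
    }


masked = mask_sparse_cells(cells, n_min=25)
for cell, (n, bias) in masked.items():
    print(cell, n, bias)
```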

Multiple bins - gridded skill for water level categories

Get data from comparer as dataframe and add a water level category as a new column.

In [22]:

dftmp = cmp.data.to_dataframe()
dftmp["wl category"] = 'high'
dftmp.loc[dftmp['HD']<0, "wl category"] = 'low'
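The categorisation rule above labels a point 'low' when the model value is negative and 'high' otherwise, so a model value of exactly zero counts as 'high'. The sketch below applies the same rule to made-up pairs and computes a bias per category, assuming bias is the mean of model minus observation; `wl_category` is a hypothetical helper.

```python
from collections import defaultdict

# Made-up matched points: (observation, model).
pairs = [(0.38, 0.37), (-0.10, -0.22), (0.05, 0.0), (-0.30, -0.25)]


def wl_category(model_value):
    """Same rule as the cell above: negative model values are 'low', the rest 'high'."""
    return "low" if model_value < 0 else "high"


residuals = defaultdict(list)
for obs, mod in pairs:
    residuals[wl_category(mod)].append(mod - obs)

bias_by_category = {cat: sum(r) / len(r) for cat, r in residuals.items()}
print(bias_by_category)
```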

Add the "wl category" to the comparer's data structure.

In [23]:

cmp.data["wl category"] = dftmp["wl category"]
cmp.data

Now aggregate the data by the new column (and x and y):

In [24]:

gs = cmp.gridded_skill(by=['wl category'], metrics=['bias'], n_min=5)
gs

In [25]:

gs.bias.plot();

Multiple observations

Create a synthetic second observation (the original shifted down by 0.2 m) and match it to the model.

In [26]:

import warnings

# Register the filter before creating the observation; the pattern is a regex
# matched from the start of the message, hence the leading ".*".
warnings.filterwarnings('ignore', message=".*duplicate")

df2 = df.copy()
df2['surface_elevation'] = df2['surface_elevation'] - 0.2
o2 = ms.TrackObservation(df2, item=2, name='alti2')
cmp2 = ms.match(o2, mr)

Combine the two comparers, compute the gridded skill, add plot attributes, and plot.

In [27]:

cmp = cmp + cmp2
gs = cmp.gridded_skill(metrics=['bias'], n_min=20)
gs.bias.data.attrs = dict(long_name="Bias of surface elevation", units="m")
gs.bias.plot(figsize=(10,5));
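Because the second observation is the first one lowered by 0.2 m, its bias against the same model should be 0.2 m higher, assuming bias is the mean of model minus observation. The sketch below checks that arithmetic on made-up values; the helper `bias` is hypothetical.

```python
# Made-up matched values in metres.
observation = [0.40, 0.55, 0.70]
model = [0.35, 0.50, 0.72]
shift = 0.2  # same offset as used for the second observation above


def bias(model, observation):
    """Mean of model minus observation (assumed sign convention)."""
    return sum(m - o for m, o in zip(model, observation)) / len(model)


shifted_observation = [o - shift for o in observation]
print(f"bias vs original: {bias(model, observation):+.3f}")
print(f"bias vs shifted:  {bias(model, shifted_observation):+.3f}")
```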
