thalassa
(the Greek word for "sea") is a library for large-scale sea level visualizations of unstructured mesh data.
https://github.com/ec-jrc/Thalassa
%load_ext autoreload
%autoreload 2
from __future__ import annotations
import dask
import geoviews as gv
import holoviews as hv
import numcodecs
import numpy as np
import pandas as pd
import shapely
import xarray as xr
from holoviews import opts as hvopts
from holoviews import streams
from holoviews.streams import PointerXY
from holoviews.streams import Tap
hv.extension("bokeh")
import thalassa
from thalassa import api
from thalassa import normalization
from thalassa import utils
# Set some defaults for the visualization of the graphs
hvopts.defaults(
hvopts.Image(
width=800,
height=600,
show_title=True,
tools=["hover"],
active_tools=["pan", "box_zoom"],
cmap="jet",
),
)
COMPRESSOR = numcodecs.Blosc(cname="zstd", clevel=1)
STOFS-3D-Atl is a 3D model that uses SCHISM.
Output data are split into multiple files. More specifically, the sea water elevation is distributed as three netCDF files (a 3-day forecast, one file per day). We will download them for the run of 29 August 2023.
For more info please check here: https://noaa-nos-stofs3d-pds.s3.amazonaws.com/README.html
%%bash
for day in 28 29 30; do
filename=schout_adcirc_202308"${day}".nc
if [[ ! -f "${filename}" ]]; then
echo Downloading netcdf: "${filename}"
wget --quiet https://noaa-nos-stofs3d-pds.s3.amazonaws.com/STOFS-3D-Atl/stofs_3d_atl.20230829/schout_adcirc_202308"${day}".nc -O "${filename}"
else
echo Netcdf already exists: "${filename}"
fi
done
We want to check the maximum elevation per node over the full 72-hour time span, which requires some (lightweight) post-processing.
Since we are post-processing anyway, let's also convert the netCDF files to a Zarr archive in order to take advantage of the faster compression algorithms that Zarr supports.
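The reduction we need is just a maximum over the time axis, which is what `ds.zeta.max("time")` computes below. A toy NumPy sketch of the same operation:

```python
import numpy as np

# toy "zeta" array: 3 time steps x 4 nodes
zeta = np.array([
    [0.1, 0.5, 0.2, 0.0],
    [0.3, 0.4, 0.6, 0.1],
    [0.2, 0.7, 0.1, 0.2],
])

# per-node maximum over the whole time span (axis 0 is time)
zeta_max = zeta.max(axis=0)
# → array([0.3, 0.7, 0.6, 0.2])
```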
ds = xr.open_mfdataset("./schout_adcirc_202308??.nc", coords="minimal", data_vars="minimal", compat="override")
ds
# For this example, let's only keep the depth and the zeta variables
ds = ds[["element", "depth", "zeta"]]
#ds["time"] = pd.DatetimeIndex(ds.time)
# Recalculate zeta_max
ds["zeta_max"] = ds.zeta.max("time")
encoding = {var: {"compressor": COMPRESSOR} for var in ds}
# The default Zarr compression seems to mangle the timestamps. Using ZSTD seems to work fine though.
encoding["time"] = {"compressor": COMPRESSOR}
ds = ds.chunk(dict(time=1, node=200_000))
# Save as a zarr archive in the same directory
store = ds.to_zarr("schout_adcirc_20230829.zarr", mode="w", consolidated=True, encoding=encoding)
!du -h *.nc
!du -hd0 *.zarr
Thalassa supports the output of multiple solvers (e.g. SCHISM and ADCIRC).
In order to support these solvers, thalassa converts their output to what we call the "thalassa schema".
To make a long story short, we just need to use the thalassa.open_dataset() function.
import thalassa
ds = thalassa.open_dataset("schout_adcirc_20230829.zarr")
ds
import thalassa
thalassa.plot(ds, "depth")
thalassa.plot(
ds=ds,
variable="zeta_max",
clim_min=0.5,
clim_max=3,
clabel="meters",
title="Custom title for 'zeta_max'"
)
timestamp = pd.Timestamp(ds.time[16].values)
thalassa.plot(
    ds=ds.sel(time=timestamp),  # or `.isel()`, etc.
variable="zeta",
clim_max=1,
title=f"zeta: {timestamp}",
)
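If you don't know the exact timestamp, xarray's nearest-neighbor selection is handy. This is plain xarray (not thalassa-specific), sketched here on a toy hourly series:

```python
import numpy as np
import pandas as pd
import xarray as xr

# toy hourly series with 5 time steps
da = xr.DataArray(
    np.arange(5.0),
    dims="time",
    coords={"time": pd.date_range("2023-08-29", periods=5, freq="h")},
)

# pick the value closest to an arbitrary timestamp
nearest = da.sel(time="2023-08-29 01:40", method="nearest")
# 01:40 is closest to the 02:00 step, whose value is 2.0
```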
thalassa.plot_mesh(ds)
If we have a specific region of interest (RoI), we can crop the dataset.
Cropping with a big bounding box takes a few seconds, but it only needs to be done once and afterwards everything is snappier!
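Conceptually, cropping keeps the nodes whose coordinates fall inside the box. A toy NumPy sketch with made-up coordinates and the same bounds as the shapely box below (this is only the idea, not thalassa's actual implementation, which also has to subset the mesh connectivity):

```python
import numpy as np

# made-up node coordinates (hypothetical, for illustration only)
lon = np.array([-89.5, -88.0, -87.0, -88.5])
lat = np.array([30.0, 30.5, 29.0, 31.5])

# bounding box bounds: (xmin, ymin, xmax, ymax)
xmin, ymin, xmax, ymax = -89.0, 29.5, -87.5, 31.0
inside = (lon >= xmin) & (lon <= xmax) & (lat >= ymin) & (lat <= ymax)
# → array([False,  True, False, False]): only node 1 falls inside the box
```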
bbox = shapely.box(-89, 29.5, -87.5, 31)
bbox
cds = thalassa.crop(ds, bbox)
cds.dims
thalassa.plot(cds, "zeta_max", show_mesh=True)
thalassa.plot_mesh(ds=cds)
# We can also do the cropping on the fly
thalassa.plot(
ds=ds,
variable="depth",
clim_min=-5,
clim_max=20,
bbox=bbox
)