Julius Busecke1 and Ryan Abernathey2
(1 Princeton University 2 Columbia University)
The Coupled Model Intercomparison Project (CMIP; currently in its 6th iteration) involves contributions from countries around the world, constituting a combined dataset of 20+ petabytes.
Many years of planning go into the inception of the different scenarios, the various model intercomparison projects (MIPs), and the delivery of the output data from the modeling community around the globe. A major challenge for the Earth science community now is to effectively analyze this dataset and answer science questions that improve our understanding of the Earth system in the coming years.
The Pangeo project recently introduced large parts of the CMIP6 data archive into the cloud. This enables, for the first time, centralized, reproducible science on state-of-the-art climate simulations without users needing to maintain large storage systems or a supercomputer.
The data itself, however, still presents significant challenges for analysis, one of which is applying operations across many models. Two of the major hurdles are inconsistent naming/metadata between modeling centers and complex grid layouts, particularly for the ocean components of climate models. Past workflows often involved interpolating/remapping desired variables and writing the results to new files, which adds organizational burden and increases storage requirements.
We will demonstrate two Pangeo tools that enable seamless calculation of common operations like vector calculus operators (grad, curl, div) and weighted averages/integrals across a wide range of CMIP6 models, directly on the data stored in the cloud. cmip6_preprocessing provides numerous tools to unify naming conventions and to reconstruct grid information and metrics (like distances). This information is used by xgcm to enable finite volume analysis on the native model grids. The combination of both tools facilitates fast analysis while ensuring a reproducible and accurate workflow.
from dask.distributed import Client, progress
from dask_gateway import Gateway
gateway = Gateway()
cluster = gateway.new_cluster()
cluster.scale(60)
cluster
client = Client(cluster)
client
In order to load the CMIP data from cloud storage, we use intake-esm, and choose to load all monthly ocean model output (table_id='Omon') of sea surface temperature (variable_id='tos') on the native grid (grid_label='gn') for the 'historical' experiment (experiment_id='historical').
For more details on how to use intake-esm we refer to this EarthCube presentation.
import intake
import warnings
warnings.filterwarnings("ignore")
url = "https://raw.githubusercontent.com/NCAR/intake-esm-datastore/master/catalogs/pangeo-cmip6.json"
col = intake.open_esm_datastore(url)
models = col.df['source_id'].unique()
# we will have to eliminate some models that do not *yet* play nice with cmip6_preprocessing
# ignore_models = ['FGOALS-f3-L','EC-Earth3','EC-Earth3-Veg-LR', 'CESM2-FV2',
# 'GFDL-ESM4', 'MPI-ESM-1-2-HAM', 'MPI-ESM1-2-LR', 'NorCPM1',
# 'GISS-E2-1-H', 'IPSL-CM6A-LR', 'MRI-ESM2-0', 'CESM2-WACCM',
# 'GISS-E2-1-G', 'FIO-ESM-2-0', 'MCM-UA-1-0', 'FGOALS-g3']
# models = [m for m in models if 'AWI' not in m and m not in ignore_models]
# These might change in the future so it is better to explicitly allow the models that work
models = ['TaiESM1','BCC-ESM1','CNRM-ESM2-1','MIROC6','UKESM1-0-LL','NorESM2-LM',
'BCC-CSM2-MR','CanESM5','ACCESS-ESM1-5','E3SM-1-1-ECA','E3SM-1-1',
'MIROC-ES2L','CESM2','CNRM-CM6-1','GFDL-CM4','CAMS-CSM1-0','CAS-ESM2-0',
'CanESM5-CanOE','IITM-ESM','CNRM-CM6-1-HR','ACCESS-CM2','E3SM-1-0',
'EC-Earth3-LR','EC-Earth3-Veg','INM-CM4-8','INM-CM5-0','HadGEM3-GC31-LL',
'HadGEM3-GC31-MM','MPI-ESM1-2-HR','GISS-E2-1-G-CC','GISS-E2-2-G','CESM2-WACCM-FV2',
'NorESM1-F','NorESM2-MM','KACE-1-0-G','GFDL-AM4','NESM3',
'SAM0-UNICON','CIESM','CESM1-1-CAM5-CMIP5','CMCC-CM2-HR4','CMCC-CM2-VHR4',
'EC-Earth3P-HR','EC-Earth3P','ECMWF-IFS-HR','ECMWF-IFS-LR','INM-CM5-H',
'IPSL-CM6A-ATM-HR','HadGEM3-GC31-HM','HadGEM3-GC31-LM','MPI-ESM1-2-XR','MRI-AGCM3-2-H',
'MRI-AGCM3-2-S','GFDL-CM4C192','GFDL-OM4p5B']
cat = col.search(table_id='Omon', grid_label='gn', experiment_id='historical', variable_id='tos', source_id=models)
We have to eliminate some models from this analysis: the AWI models, due to their unstructured native grids, as well as others for various subtleties that have yet to be resolved. These will be addressed in a future update of cmip6_preprocessing. If you do not see your favorite model, please consider raising an issue on github.
cat.df['source_id'].nunique() # This represents the number of models as of 6/14/2020.
# More models might be added in the future. See commented parts in previous code cell.
27
This gives us 27 different models ('source_id's) to work with. Let's load them into a dictionary and inspect them more closely.
# Fix described here: https://github.com/intake/filesystem_spec/issues/317
# Would cause `Name (gs) already in the registry and clobber is False` error somewhere in `intake-esm`
import fsspec
fsspec.filesystem('gs')
ddict = cat.to_dataset_dict(zarr_kwargs={'consolidated':True, 'decode_times':False})
--> The keys in the returned dictionary of datasets are constructed as follows: 'activity_id.institution_id.source_id.experiment_id.table_id.grid_label'
What we ultimately want is to apply an analysis or just a visualization across all these models. So before we jump into that, let's inspect a few of the datasets:
ddict['CMIP.NCAR.CESM2.historical.Omon.gn']
[Output: the CESM2 dataset repr, with dimensions time, nlat (1-384), nlon (1-320), and 11 member_ids (r1i1p1f1 ... r11i1p1f1)]
ddict['CMIP.CNRM-CERFACS.CNRM-CM6-1.historical.Omon.gn']
[Output: the CNRM-CM6-1 dataset repr, with 30 member_ids and no 'vertex' dimension]
ddict['CMIP.CSIRO.ACCESS-ESM1-5.historical.Omon.gn']
[Output: the ACCESS-ESM1-5 dataset repr, with dimensions time, i (0-359), j (0-299), and 3 member_ids]
We can immediately spot problems:
- There is no consistent convention for labeling dimensions (note that the logical 1D dimension in the x-direction is called x, nlon, or i).
- Some models (here: CMIP.CNRM-CERFACS.CNRM-CM6-1.historical.Omon.gn) are missing the 'vertex'/'vertices' dimension, because their cell geometry is given by longitude and latitude 'bounds' centered on the cell faces rather than located at the cell corners.
There are more issues that both make working with the data cumbersome and can quickly introduce errors; for instance, some models give depth in units of cm rather than m. But instead of focusing on problems, here we would like to illustrate how easy analysis across models with different grids can be, using cmip6_preprocessing.
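The dimension harmonization can be pictured as a simple rename mapping, shown here as an illustrative pure-python sketch (the table and function names are hypothetical, not the actual cmip6_preprocessing internals):

```python
# Map each model's dimension names onto one common convention.
# This table is illustrative only, not the package's full mapping.
RENAME = {
    "nlon": "x", "i": "x",   # CESM2-style and ACCESS-style zonal dims
    "nlat": "y", "j": "y",   # ...and their meridional equivalents
}

def harmonize_dims(dims):
    """Return dimension names translated to the common convention."""
    return [RENAME.get(d, d) for d in dims]
```

The real package additionally handles coordinates, bounds/vertices, units, and metadata, but the core idea is this kind of lookup applied uniformly to every model.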
Using cmip6_preprocessing to harmonize different model output
cmip6_preprocessing was born out of the cmip6-hackathon and aims to provide a central package through which these conventions can be harmonized. For convenience we will make use of the preprocess functionality and apply the 'catch-all' function combined_preprocessing to the data before it gets aggregated. For a more detailed description of the corrections applied, we refer to the documentation and examples therein.
from cmip6_preprocessing.preprocessing import combined_preprocessing
ddict_pp = cat.to_dataset_dict(
zarr_kwargs={'consolidated':True, 'decode_times':False},
preprocess=combined_preprocessing
)
--> The keys in the returned dictionary of datasets are constructed as follows: 'activity_id.institution_id.source_id.experiment_id.table_id.grid_label'
Note that all functions in the cmip6_preprocessing.preprocessing module can also be used on 'raw' datasets, e.g. from a local netcdf file, as long as the metadata is intact.
Now let's look at the preprocessed versions of the same datasets from above:
ddict_pp['CMIP.NCAR.CESM2.historical.Omon.gn']
[Output: the preprocessed CESM2 dataset repr, now with dimensions x, y, vertex (4), bnds (2), monotonic 1D x/y coordinates, and 11 member_ids]
ddict_pp['CMIP.CNRM-CERFACS.CNRM-CM6-1.historical.Omon.gn']
[Output: the preprocessed CNRM-CM6-1 dataset repr, with the same harmonized dimensions and 30 member_ids]
ddict_pp['CMIP.CSIRO.ACCESS-ESM1-5.historical.Omon.gn']
[Output: the preprocessed ACCESS-ESM1-5 dataset repr, with the same harmonized dimensions and 3 member_ids]
Much better!
As you can see, they all have consistent dimension names and coordinates. Time to see if this works as advertised by plotting the sea surface temperature (SST) for all models:
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import cartopy.feature as cfeature
fig, axarr = plt.subplots(
ncols=5, nrows=6, figsize=[15, 15], subplot_kw={"projection": ccrs.Robinson(190)}
)
for ax, (k, ds) in zip(axarr.flat, ddict_pp.items()):
# Select a single member for each model
if 'member_id' in ds.dims:
ds = ds.isel(member_id=-1)
# select the first time step
da = ds.tos.isel(time=0)
    # some models have large values instead of nans, so we mask unreasonable values
da = da.where(da < 1e30)
# and plot the resulting 2d array using xarray
pp = da.plot(
ax=ax,
x="lon",
y="lat",
vmin=-1,
vmax=28,
transform=ccrs.PlateCarree(),
infer_intervals=False,
add_colorbar=False,
)
ax.set_title(ds.attrs['source_id'])
ax.add_feature(cfeature.LAND, facecolor='0.6')
ax.coastlines()
fig.subplots_adjust(hspace=0.05, wspace=0.05)
fig.subplots_adjust(right=0.9)
cbar_ax = fig.add_axes([0.92, 0.3, 0.015, 0.4])
fig.colorbar(pp, cax=cbar_ax, label="Sea Surface Temperature [$^\circ C$]")
That was very little code for such a comprehensive figure, even though the models all have different resolutions and grid architectures.
cmip6_preprocessing helps keep the logic needed for simple analysis and visualization to a minimum, and the newest version also facilitates the use of xgcm with the native model grids, making it a powerful tool for more complex analysis across the models.
Finite volume analysis on native model grids with xgcm
Most modern circulation models discretize the partial differential equations needed to simulate the earth system on a logically rectangular grid. This means the grid for a single time step can be represented as a 3-dimensional array of cells. Operations like a derivative are then approximated by a finite difference between neighboring cells, divided by the appropriate distance.
Calculating operators like gradient, curl or divergence usually involves a lot of bookkeeping to make sure the difference is taken between the correct grid points. This often leads users to prefer interpolated data (on regular lon/lat grids), which has to be processed and stored alongside the native grid data, both increasing storage requirements and potentially losing some of the detailed structure of the high-resolution model fields.
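The finite-difference idea can be sketched in a few lines of numpy (a toy 1D example on a uniform grid, for intuition only; real model grids are curvilinear with spatially varying cell distances):

```python
import numpy as np

# toy tracer (e.g. a temperature) on a uniform 1D grid with spacing dx
dx = 0.5
x = np.arange(0.0, 10.0, dx)
tracer = np.sin(x)

# the difference between neighboring cells, divided by the distance
# between them, approximates the derivative...
dtracer_dx = np.diff(tracer) / dx

# ...and the result lives at the cell faces, halfway between centers
x_face = x[:-1] + dx / 2
```

This is exactly the bookkeeping that becomes tedious on a staggered, curvilinear model grid: keeping track of where each differenced quantity lives and which distance to divide by.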
The combination of cmip6_preprocessing and xgcm enables the user to quickly calculate operators like gradient or curl (shown in detail below) on the native model grids, preserving the maximum detail the models provide and avoiding unnecessary storage burden. cmip6_preprocessing parses the detailed grid information for the different models so that xgcm can carry out the computations, without requiring the user to dive into the details of each model grid.
As an example, let us compute and plot the gradient magnitude of the SST:
$ \left|\nabla SST\right| = \sqrt{\left(\frac{\partial SST}{\partial x}\right)^2 + \left(\frac{\partial SST}{\partial y}\right)^2} $
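On a regular Cartesian grid this formula reduces to a few lines of numpy (a toy sketch with a linear temperature field; the point of xgcm is to do the same thing correctly on the models' curvilinear, staggered grids):

```python
import numpy as np

# regular grid with 1 km spacing and a linearly varying toy SST field
dx = dy = 1000.0  # meters
y, x = np.meshgrid(np.arange(50), np.arange(60), indexing="ij")
sst = 2e-5 * (x * dx) + 1e-5 * (y * dy)  # degC, constant gradient

# finite differences along each axis, then the gradient magnitude
dsst_dy, dsst_dx = np.gradient(sst, dy, dx)
grad_mag = np.sqrt(dsst_dx**2 + dsst_dy**2)
```

For this linear field the gradient magnitude is constant everywhere; on a native model grid, the spacings dx and dy vary from cell to cell, which is what the reconstructed grid metrics below provide.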
We will start with an example of just computing the zonal gradient $\frac{\partial SST}{\partial x}$:
First, we need to create a suitable grid, with dimensions for both the cell center (where our tracer is located) and the cell faces. These are needed for xgcm to calculate the finite difference version of the above equation.
The function combine_staggered_grid parses a dataset so that it is compatible with xgcm. It also provides a 'ready-to-use' xgcm Grid object for convenience.
# we pick one of the many models
ds = ddict_pp['CMIP.CNRM-CERFACS.CNRM-CM6-1-HR.historical.Omon.gn']
ds
[Output: the preprocessed CNRM-CM6-1-HR dataset repr, with a single member r1i1p1f2]
from cmip6_preprocessing.grids import combine_staggered_grid
# Now we parse the necessary additional dimensions
grid,ds = combine_staggered_grid(ds)
ds
[Output: the same dataset repr, now including the additional staggered coordinates x_left and y_right]
Note how the new dimensions x_left and y_right have been added; they are associated with the 'eastern' and 'northern' cell faces.
We can now easily calculate the finite difference in the logical X direction:
delta_t_x = grid.diff(ds.tos, 'X')
delta_t_x
[Output: the difference array repr, located on the x_left dimension]
This array is now located on the center of the eastern cell face. But in order to approximate a derivative, we need to divide this difference by an appropriate distance. Usually these distances are provided with the model output, but currently no such data is publicly available for the CMIP models.
To solve this problem, cmip6_preprocessing can recalculate grid distances.
Be aware that this can lead to rather large biases where the grid is strongly warped, usually around the Arctic. More on that at the end of the notebook.
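The reconstruction is essentially spherical geometry: the zonal distance between two points on the same latitude circle follows from the longitude difference and the latitude. A simplified sketch (the package's actual implementation additionally handles staggered positions and grid discontinuities):

```python
import numpy as np

EARTH_RADIUS = 6.371e6  # meters

def zonal_distance(dlon_deg, lat_deg):
    """Arc length (m) along a latitude circle for a longitude difference."""
    return EARTH_RADIUS * np.cos(np.deg2rad(lat_deg)) * np.deg2rad(dlon_deg)
```

At the equator one degree of longitude spans roughly 111 km, while at 60 degrees latitude the same longitude difference spans only half that, which is why recomputed per-cell metrics, rather than constant spacings, are needed. It also hints at why strongly warped grid regions, where neighboring points no longer sit on a common latitude circle, are the problematic case.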
grid, ds = combine_staggered_grid(ds, recalculate_metrics=True)
ds
[Output: the dataset repr, now including reconstructed distance coordinates (e.g. dx_gx) at the various grid positions]
Now there are additional coordinates, representing distances in the x and y direction at different grid locations. We can now calculate the derivative with respect to x.
dt_dx = grid.diff(ds.tos, 'X') / ds.dx_gx
dt_dx
[Output: the zonal derivative repr, located on x_left]
xgcm can handle these cases even more elegantly, by automatically picking the right grid metric, in this case the zonal distance:
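Conceptually, the automatic metric selection is a lookup from an axis and grid position to the matching distance coordinate. A pure-python sketch of that bookkeeping (the mapping and names here are hypothetical, not xgcm's actual internals):

```python
# hypothetical mapping from (axis, grid position) to a distance coordinate
METRICS = {
    ("X", "center"): "dx_t",   # zonal distance at tracer points
    ("X", "left"): "dx_gx",    # zonal distance at the eastern cell faces
    ("Y", "center"): "dy_t",
    ("Y", "right"): "dy_gy",
}

def pick_metric(axis, position):
    """Return the distance coordinate matching a field's grid position."""
    return METRICS[(axis, position)]
```

Because the differenced field ends up on the x_left position, the lookup selects the face distance rather than the tracer-point distance, which is exactly the choice a user would otherwise have to make by hand.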
dt_dx_auto = grid.derivative(ds.tos, 'X')
dt_dx_auto
[Output: the zonal derivative repr, as above]
fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=[13,5])
dt_dx.isel(time=0).sel(x_left=slice(120, 140), y=slice(20,40)).plot(ax=ax1, vmax=1e-4)
dt_dx_auto.isel(time=0).sel(x_left=slice(120, 140), y=slice(20,40)).plot(ax=ax2, vmax=1e-4)
The results are identical, and the user does not have to remember the names of the distance coordinates; xgcm takes care of it. The derivative in the y-direction can be calculated similarly.
Now let's compute the full gradient magnitude for all the models in one loop, averaging over all members and over the first 5 years of the historical period.
fig, axarr = plt.subplots(
ncols=4, nrows=7, figsize=[15, 15], subplot_kw={"projection": ccrs.Robinson(190)}
)
for ax, (k, ds) in zip(axarr.flat, ddict_pp.items()):
# the cmip6_preprocessing magic:
# create an xgcm grid object and dataset with reconstructed grid metrics
grid, ds = combine_staggered_grid(ds, recalculate_metrics=True)
da = ds.tos
    # some models have large values instead of nans, so we mask unreasonable values
da = da.where(da < 1e30)
# calculate the zonal temperature gradient
dt_dx = grid.derivative(da, 'X')
dt_dy = grid.derivative(da, 'Y', boundary='extend')
# these values are now situated on the cell faces, we need to
# interpolate them back to the center to combine them
dt_dx = grid.interp(dt_dx, 'X')
dt_dy = grid.interp(dt_dy, 'Y', boundary='extend')
grad = (dt_dx**2 + dt_dy**2)**0.5
ds['grad'] = grad
# take an average over the first 5 years of the run
ds = ds.isel(time=slice(0,12*5)).mean('time', keep_attrs=True)
# take the average over all available model members
if "member_id" in ds.dims:
ds = ds.mean('member_id', keep_attrs=True)
# and plot the resulting 2d array using xarray
pp = ds.grad.plot(
ax=ax,
x="lon",
y="lat",
vmin=1e-9,
vmax=1e-5,
transform=ccrs.PlateCarree(),
infer_intervals=False,
add_colorbar=False,
)
ax.set_title(ds.attrs['source_id'])
ax.add_feature(cfeature.LAND, facecolor='0.6')
ax.coastlines()
fig.subplots_adjust(hspace=0.05, wspace=0.05)
fig.subplots_adjust(right=0.9)
cbar_ax = fig.add_axes([0.92, 0.3, 0.015, 0.4])
fig.colorbar(pp, cax=cbar_ax, label="Sea Surface Temperature gradient magnitude [$^\circ C/m$]")
Not bad for five additional lines of code!
Features like the western boundary currents, the fronts along the Antarctic Circumpolar Current, and the equatorial upwelling zones are clearly visible and present in most models.
But this was still just operating on a tracer field. What about velocity fields? We can also combine different variables into an xgcm-compatible dataset.
The recommended workflow is to first load the datasets separately for each desired variable:
variables = ['thetao', 'uo', 'vo']
ddict_multi = {var:{} for var in variables}
for var in variables:
cat_var = col.search(table_id='Omon',
grid_label='gn',
experiment_id='historical',
variable_id=var,
source_id=models)
ddict_multi[var] = cat_var.to_dataset_dict(
zarr_kwargs={'consolidated':True, 'decode_times':False},
preprocess=combined_preprocessing
)
--> The keys in the returned dictionary of datasets are constructed as follows: 'activity_id.institution_id.source_id.experiment_id.table_id.grid_label'
CESM2: Unexpected unit (centimeters) for coordinate `lev` detected. Converted to `m`
CESM2-WACCM-FV2: Unexpected unit (centimeters) for coordinate `lev` detected. Converted to `m`
MIROC6: No units found
MIROC-ES2L: No units found
Since we need all three variables for each model, we keep only the keys that appear in every sub-dictionary.
ddict_multi_clean = {var: {} for var in variables}
for k in ddict_multi['thetao']:
    if all([k in ddict_multi[var] for var in variables]) and 'lev' in ddict_multi['thetao'][k].dims:
        for var in variables:
            ddict_multi_clean[var][k] = ddict_multi[var][k]
As before, we can create a complete grid dataset, but now we pass additional datasets to be combined via the other_ds argument. This can be a list of datasets containing different variables; cmip6_preprocessing automatically detects the appropriate grid position for each.
import numpy as np
import xarray as xr
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
from xgcm import Grid

fig, axarr = plt.subplots(
    ncols=4, nrows=6, figsize=[25, 20], subplot_kw={"projection": ccrs.PlateCarree()}
)
for ai, (ax, k) in enumerate(zip(axarr.flat, ddict_multi_clean['thetao'].keys())):
    ds_t = ddict_multi_clean['thetao'][k]  # One tracer should always be the reference dataset!
    ds_u = ddict_multi_clean['uo'][k]
    ds_v = ddict_multi_clean['vo'][k]
    if 'lev' in ds_t:  # show only models with depth coordinates in the vertical
        # align only the intersection of the member_ids
        ds_t, ds_u, ds_v = xr.align(
            ds_t, ds_u, ds_v, join='inner',
            exclude=[di for di in ds_t.dims if di != 'member_id']
        )
        # combine all datasets and create a grid with metrics
        grid, ds = combine_staggered_grid(ds_t, other_ds=[ds_u, ds_v], recalculate_metrics=True)
        # interpolate grid metrics at the corner points (these are not *yet* constructed by cmip6_preprocessing)
        ds.coords['dx_temp'] = grid.interp(ds['dx_gx'], 'Y', boundary='extend')
        ds.coords['dy_temp'] = grid.interp(ds['dy_gy'], 'X')
        ds = ds.chunk({'x': 360})
        # select the surface value
        ds = ds.isel(lev=0)
        # for demonstration purposes select the first member and a single monthly timestep
        ds = ds.isel(time=10)
        if "member_id" in ds.dims:
            ds = ds.isel(member_id=0)
        grid = Grid(
            ds, periodic=['X'],
            metrics={'X': [co for co in ds.coords if 'dx' in co],
                     'Y': [co for co in ds.coords if 'dy' in co]}
        )
        dv_dx = grid.derivative(ds.vo, 'X')
        du_dy = grid.derivative(ds.uo, 'Y', boundary='extend')
        # check the position of the derivatives and interpolate back to the tracer point for plotting
        if any(['x_' in di for di in dv_dx.dims]):
            dv_dx = grid.interp(dv_dx, 'X')
        if any(['y_' in di for di in dv_dx.dims]):
            dv_dx = grid.interp(dv_dx, 'Y', boundary='extend')
        if any(['x_' in di for di in du_dy.dims]):
            du_dy = grid.interp(du_dy, 'X')
        if any(['y_' in di for di in du_dy.dims]):
            du_dy = grid.interp(du_dy, 'Y', boundary='extend')
        curl = dv_dx - du_dy
        ds['curl'] = curl
        # select the North Atlantic
        ds = ds.sel(x=slice(270, 335), y=slice(10, 55))
        # and plot the resulting 2D array using xarray
        pp = ds.curl.plot.contourf(
            levels=51,
            ax=ax,
            x="lon",
            y="lat",
            vmax=1e-5,
            transform=ccrs.PlateCarree(),
            infer_intervals=False,
            add_colorbar=False,
            add_labels=False,
        )
        ax.text(
            0.2, 0.9, ds.attrs['source_id'],
            horizontalalignment='center', verticalalignment='center',
            transform=ax.transAxes, fontsize=10,
        )
        ax.coastlines()
        ax.set_extent([-90, -45, 20, 45], ccrs.PlateCarree())
fig.subplots_adjust(wspace=0.01, hspace=0.01, right=0.9)
cbar_ax = fig.add_axes([0.92, 0.3, 0.015, 0.4])
fig.colorbar(pp, cax=cbar_ax, label="Sea Surface Vorticity [$1/s$]")
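The heavy lifting above is done by grid.derivative. For intuition, here is a minimal finite-difference version of the curl on a uniform grid (pure NumPy, with hypothetical grid spacing; xgcm additionally handles the staggered grid positions, metric-aware spacing, and boundaries that real model grids require):

```python
import numpy as np

dx = dy = 1.0e4  # hypothetical uniform grid spacing [m]
ny, nx = 50, 60
y, x = np.meshgrid(np.arange(ny) * dy, np.arange(nx) * dx, indexing="ij")

# solid-body rotation: u = -omega*y, v = omega*x  ->  curl = 2*omega everywhere
omega = 1e-5
u = -omega * y
v = omega * x

# centered differences in the interior (no boundary handling in this sketch)
dv_dx = (v[:, 2:] - v[:, :-2]) / (2 * dx)
du_dy = (u[2:, :] - u[:-2, :]) / (2 * dy)
curl = dv_dx[1:-1, :] - du_dy[:, 1:-1]

print(curl.mean())  # ≈ 2e-5 for solid-body rotation
```

The test field is chosen so the answer is known analytically, which is a useful sanity check before applying the same operator to real model output.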
For all the examples shown here, we have relied on a simplified reconstruction of the grid metrics (in this case, the distances between different points on the grid). We can check the quality of the reconstruction indirectly by comparing it to the only horizontal grid metric that is saved with the ocean model output: the horizontal grid cell area.
We reconstruct the area naively as the product of the x and y distances centered on the tracer cell, dx_t and dy_t.
ds = ddict_multi_clean['thetao']['CMIP.NOAA-GFDL.GFDL-CM4.historical.Omon.gn']
_, ds = combine_staggered_grid(ds, recalculate_metrics=True)
area_reconstructed = ds.dx_t * ds.dy_t
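To see why a product of centered distances is a reasonable estimate in the first place, here is a self-contained check on an idealized regular latitude-longitude grid (pure NumPy, no CMIP data; the 1-degree spacing is an arbitrary choice for illustration):

```python
import numpy as np

R = 6.371e6                    # Earth radius [m]
dlat = dlon = np.deg2rad(1.0)  # hypothetical 1-degree regular grid
lat = np.deg2rad(np.arange(-89.5, 90.0, 1.0))  # cell-center latitudes

# naive reconstruction: product of the centered x and y distances
dx = R * np.cos(lat) * dlon
dy = R * dlat
area_naive = dx * dy

# exact spherical cell area: R^2 * dlon * (sin(lat_north) - sin(lat_south))
area_exact = R**2 * dlon * (np.sin(lat + dlat / 2) - np.sin(lat - dlat / 2))

rel_err = np.abs(area_naive - area_exact) / area_exact * 100  # percent
print(f"max relative error: {rel_err.max():.5f} %")
```

On this idealized grid the error is uniformly tiny at every latitude; the large high-latitude errors we find for real models below stem from their curvilinear (e.g. tripolar) grid geometry, which this idealization does not capture.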
Now let's load the actual area and compare the two.
import intake

url = "https://raw.githubusercontent.com/NCAR/intake-esm-datastore/master/catalogs/pangeo-cmip6.json"
col = intake.open_esm_datastore(url)
cat = col.search(table_id='Ofx', variable_id='areacello', source_id='GFDL-CM4', grid_label='gn')  # switch to `grid_label='gr'` for the regridded file
ddict = cat.to_dataset_dict(zarr_kwargs={'consolidated': True}, preprocess=combined_preprocessing)
_, ds_area = ddict.popitem()
area = ds_area.areacello.isel(member_id=0).squeeze()
--> The keys in the returned dictionary of datasets are constructed as follows: 'activity_id.institution_id.source_id.experiment_id.table_id.grid_label'
fig, (ax1, ax2, ax3) = plt.subplots(ncols=3, figsize=[20,5])
area_reconstructed.plot(ax=ax1, vmax=1e9)
area.plot(ax=ax2, vmax=1e9)
((area_reconstructed - area) / area * 100).plot(ax=ax3, vmax=0.5)
ax1.set_title('reconstructed area')
ax2.set_title('saved area')
ax3.set_title('difference in %')
You can see that for this particular model (GFDL-CM4), the reconstruction reproduces the grid cell area with less than 0.5% error between approximately 60°S and 60°N. North of that, the grid geometry becomes vastly more complicated, and the simple reconstruction fails. The region over which the reconstruction is accurate varies from model to model, so caution should always be exercised when analyzing data with reconstructed metrics.
At present, however, this is the only way to run these kinds of analyses, since the original grid metrics are not routinely provided with the CMIP6 output. To truly reproduce analyses such as budgets, the community requires the native model geometry.
We have shown that, using cmip6_preprocessing and xgcm, users can analyze complex native ocean model grids without the need to interpolate data or keep track of the intricacies of individual model grids. These tools already enable users to calculate common operators like the gradient and curl, weighted averages, and more, in a 'grid-agnostic' way, with decent precision outside of the polar regions.
Depending on the availability of more grid metric output, and building on these tools, complex analyses like various budgets could become a matter of a few lines and be calculated across all models in the CMIP6 archive.
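As a sketch of what "a few lines" could look like, here is an area-weighted mean using xarray's built-in weighted aggregation, with synthetic stand-in data (the names tos and areacello merely mimic CMIP conventions; no real model output is used):

```python
import numpy as np
import xarray as xr

# synthetic stand-ins for model output: a 2D tracer and a matching cell-area metric
rng = np.random.default_rng(0)
tos = xr.DataArray(rng.normal(15.0, 5.0, size=(4, 8)), dims=["y", "x"], name="tos")
areacello = xr.DataArray(
    np.cos(np.deg2rad(np.linspace(-60, 60, 4)))[:, None] * np.ones((4, 8)),
    dims=["y", "x"], name="areacello",
)

# area-weighted mean: one line once the metric is available
weighted_mean = tos.weighted(areacello).mean(dim=["y", "x"])

# equivalent manual computation, for comparison
manual = (tos * areacello).sum() / areacello.sum()
print(float(weighted_mean), float(manual))
```

With reconstructed (or, ideally, model-provided) cell areas attached to each dataset, the same one-liner applies unchanged across every model in the archive.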
By combining generalizable and reproducible analysis with the publicly available CMIP6 data, more users will be able to analyze the data efficiently, leading to faster understanding and synthesis of the vast amount of data provided by modern climate modeling.
Julius Busecke's contributions to this work are motivated and developed as part of an ongoing project together with his postdoc advisor Prof. Laure Resplandy. This project investigates the influence of equatorial dynamics on possible expansions of Oxygen Minimum Zones in the Pacific and is funded under the NOAA Cooperative Institute for Climate Science agreement NA14OAR4320106.