Percentile climatology with a window

We'd like to calculate a 90th percentile climatology of a dataset, with the added complexity that rather than just grouping the Jan 1sts of each year together we want to add a window of 15 days either side, so that the Jan 1st result is the 90th percentile of every Dec 16th to Jan 16th across all years.

In [1]:
# Imports
import xarray
In [2]:
# Start a dask client
import climtas.nci
climtas.nci.Client()
Out[2]:

Client

Client-982468aa-0f71-11ec-89ad-fa163eaaaec0

Connection method: Cluster object Cluster type: distributed.LocalCluster
Dashboard: /node/ood-vn8/17103/proxy/8787/status

Cluster Info

Sample data

For the demo I'll use ACCESS CMIP6 surface temperature. I've chosen a chunking in time as that's usually how the data is laid out in the file - you have the whole grid for Jan 1st, then the whole grid for Jan 2nd and so on. It's faster to read data if it's close together in the file, so we don't want to hop all over the file gathering every time step for a single grid point.

The chunk size aims to get the chunk bytes shown in the array table to very roughly 100 MB - 30 MB is fine for this (I chose a time chunk of 300 to start with because that's roughly a year of daily data).
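As a sanity check on that estimate, the chunk byte count is just the product of the chunk shape and the item size (float32 is 4 bytes):

```python
# One chunk: 300 time steps x 145 lats x 192 lons of float32 (4 bytes each)
chunk_bytes = 300 * 145 * 192 * 4
print(f"{chunk_bytes / 1e6:.1f} MB")  # 33.4 MB
```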

In [3]:
ds = xarray.open_mfdataset('/g/data/fs38/publications/CMIP6/CMIP/CSIRO/ACCESS-ESM1-5/amip/r1i1p1f1/day/tas/gn/latest/*.nc', chunks={'time': 300})
ds.tas
Out[3]:
<xarray.DataArray 'tas' (time: 13149, lat: 145, lon: 192)>
dask.array<open_dataset-de3bca054c2ed43774883de8240854a6tas, shape=(13149, 145, 192), dtype=float32, chunksize=(300, 145, 192), chunktype=numpy.ndarray>
Coordinates:
  * time     (time) datetime64[ns] 1979-01-01T12:00:00 ... 2014-12-31T12:00:00
  * lat      (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0
  * lon      (lon) float64 0.0 1.875 3.75 5.625 7.5 ... 352.5 354.4 356.2 358.1
    height   float64 ...
Attributes:
    standard_name:  air_temperature
    long_name:      Near-Surface Air Temperature
    comment:        near-surface (usually, 2 meter) air temperature
    units:          K
    cell_methods:   area: time: mean
    cell_measures:  area: areacella
    history:        2019-11-14T05:08:45Z altered by CMOR: Treated scalar dime...

31 day window

We'll start with the windowing, gathering together the 15 days either side of each day. We can do this with '.rolling()', giving it the total size of the window: 31, for the day itself plus 15 days either side.

In [4]:
ds.tas.rolling(time=31, center=True)
Out[4]:
DataArrayRolling [time->31(center)]

If we liked we could get statistics for each window just by calling '.mean()' etc. here. We want to do something a bit more complex with a climatology, however, so we'll use the '.construct()' function. This rearranges the array to add a new axis containing all of the samples for the window.

In [5]:
ds.tas.rolling(time=31, center=True).construct(time='window')
Out[5]:
<xarray.DataArray 'tas' (time: 13149, lat: 145, lon: 192, window: 31)>
dask.array<sliding_window_view, shape=(13149, 145, 192, 31), dtype=float32, chunksize=(315, 145, 192, 31), chunktype=numpy.ndarray>
Coordinates:
  * time     (time) datetime64[ns] 1979-01-01T12:00:00 ... 2014-12-31T12:00:00
  * lat      (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0
  * lon      (lon) float64 0.0 1.875 3.75 5.625 7.5 ... 352.5 354.4 356.2 358.1
    height   float64 ...
Dimensions without coordinates: window
Attributes:
    standard_name:  air_temperature
    long_name:      Near-Surface Air Temperature
    comment:        near-surface (usually, 2 meter) air temperature
    units:          K
    cell_methods:   area: time: mean
    cell_measures:  area: areacella
    history:        2019-11-14T05:08:45Z altered by CMOR: Treated scalar dime...

See how at the top there is now a 'window' dimension of length 31? That's all of the samples in the window for that (time, lat, lon) location. Calling ds.tas.rolling(time=31, center=True).mean() is equivalent to ds.tas.rolling(time=31, center=True).construct(time='window').mean('window')
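Under the hood this windowing uses numpy's sliding_window_view (you can see the name in the dask array repr above). A minimal numpy sketch of the same idea - note that unlike xarray's construct, which pads the edges with NaN to keep the full time length, sliding_window_view trims the edges:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

x = np.arange(10.0)

# Each row of 'windows' holds one centred window of 3 values
windows = sliding_window_view(x, 3)
print(windows.shape)         # (8, 3)

# Reducing over the window axis gives the rolling statistic
print(windows.mean(axis=1))  # [1. 2. 3. 4. 5. 6. 7. 8.]
```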

Let's save this as a variable

In [6]:
tas_window = ds.tas.rolling(time=31, center=True).construct(time='window')

Percentile climatology

To calculate the climatology in xarray we use .groupby('time.dayofyear'). This groups the days by their day-of-year number, so Feb 29th of leap years will be grouped together with Mar 1st of non-leap years.
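You can see this numbering with the standard library - tm_yday follows the same day-of-year convention as 'time.dayofyear':

```python
from datetime import date

print(date(2000, 2, 29).timetuple().tm_yday)  # 60 - leap year
print(date(2001, 3, 1).timetuple().tm_yday)   # 60 - grouped with Feb 29th above
print(date(2000, 3, 1).timetuple().tm_yday)   # 61
```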

In [7]:
tas_window.groupby('time.dayofyear')
Out[7]:
DataArrayGroupBy, grouped over 'dayofyear'
366 groups with labels 1, 2, 3, 4, 5, ..., 363, 364, 365, 366.

If we look at a single day of year here, you can see that we've got two axes we need to aggregate over: 'time' is now the 36 years in the dataset, and 'window' is the nearby days.

In [8]:
for doy, data in tas_window.groupby('time.dayofyear'):
    display(data)
    break
<xarray.DataArray 'tas' (time: 36, lat: 145, lon: 192, window: 31)>
dask.array<getitem, shape=(36, 145, 192, 31), dtype=float32, chunksize=(1, 145, 192, 31), chunktype=numpy.ndarray>
Coordinates:
  * time     (time) datetime64[ns] 1979-01-01T12:00:00 ... 2014-01-01T12:00:00
  * lat      (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0
  * lon      (lon) float64 0.0 1.875 3.75 5.625 7.5 ... 352.5 354.4 356.2 358.1
    height   float64 ...
Dimensions without coordinates: window
Attributes:
    standard_name:  air_temperature
    long_name:      Near-Surface Air Temperature
    comment:        near-surface (usually, 2 meter) air temperature
    units:          K
    cell_methods:   area: time: mean
    cell_measures:  area: areacella
    history:        2019-11-14T05:08:45Z altered by CMOR: Treated scalar dime...

If we just wanted the mean we could do

In [9]:
tas_window.groupby('time.dayofyear').mean(['time', 'window'])
Out[9]:
<xarray.DataArray 'tas' (dayofyear: 366, lat: 145, lon: 192)>
dask.array<stack, shape=(366, 145, 192), dtype=float32, chunksize=(1, 145, 192), chunktype=numpy.ndarray>
Coordinates:
  * lat        (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0
  * lon        (lon) float64 0.0 1.875 3.75 5.625 ... 352.5 354.4 356.2 358.1
    height     float64 ...
  * dayofyear  (dayofyear) int64 1 2 3 4 5 6 7 8 ... 360 361 362 363 364 365 366

If we try the same with quantile (xarray's version of percentile), however, we get an error:

In [10]:
try:
    tas_window.groupby('time.dayofyear').quantile(0.9, ['time', 'window'])
except Exception as e:
    print('ERROR:', e)
ERROR: dimension time on 0th function argument to apply_ufunc with dask='parallelized' consists of multiple chunks, but is also a core dimension. To fix, either rechunk into a single dask array chunk along this dimension, i.e., ``.chunk(dict(time=-1))``, or pass ``allow_rechunk=True`` in ``dask_gufunc_kwargs`` but beware that this may significantly increase memory usage.
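The error itself suggests one way out: rechunk so 'time' is a single chunk. A self-contained sketch of that approach with synthetic single-point data (the dataset name and shape here are made up for illustration); on the full grid this holds the whole time axis in memory per spatial chunk, which may not scale:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in: four years of daily data at a single point
time = pd.date_range('2000-01-01', '2003-12-31', freq='D')
tas = xr.DataArray(np.random.rand(time.size), coords={'time': time}, dims='time')

window = tas.rolling(time=31, center=True).construct(time='window')

# Rechunking 'time' into a single chunk lets quantile run without the error
clim = (window.chunk({'time': -1})
              .groupby('time.dayofyear')
              .quantile(0.9, ['time', 'window']))
print(clim.sizes)  # one value per day of year, 366 in total
```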

Percentiles are hard to calculate on large distributed datasets - you need all of the values being reduced in memory at once, so you can sort them and then find the value the required fraction of the way along. We need to help Dask out by manually loading the data needed for each calculation, so we'll go back to a loop and call '.load()' for each day of year, storing the result in a list.

In [11]:
%%time

result = []

for doy, data in tas_window.groupby('time.dayofyear'):
    percentile = data.load().quantile(0.9, ['time', 'window'])
    
    result.append(percentile)
CPU times: user 52min 1s, sys: 5min 31s, total: 57min 32s
Wall time: 1h 18min 1s

Once that's finished, the values in the list can be combined with xarray.concat

In [12]:
percentile_climatology = xarray.concat(result, dim='dayofyear')
percentile_climatology
Out[12]:
<xarray.DataArray 'tas' (dayofyear: 366, lat: 145, lon: 192)>
array([[[250.67286682, 250.67286682, 250.67286682, ..., 250.67286682,
         250.67286682, 250.67286682],
        [251.57595825, 251.54759216, 251.49716187, ..., 251.66184998,
         251.64952087, 251.61529541],
        [252.19511414, 252.02455139, 251.96104431, ..., 252.57232666,
         252.41696167, 252.32958984],
        ...,
        [255.79222107, 255.88395691, 256.19177246, ..., 255.33990479,
         255.59410095, 255.73428345],
        [254.41465759, 254.30558777, 254.33512878, ..., 254.36961365,
         254.33735657, 254.32498169],
        [252.45547485, 252.45547485, 252.45547485, ..., 252.45547485,
         252.45547485, 252.45547485]],

       [[250.6728241 , 250.6728241 , 250.6728241 , ..., 250.6728241 ,
         250.6728241 , 250.6728241 ],
        [251.59952393, 251.54962463, 251.53194733, ..., 251.68025208,
         251.64960327, 251.61642151],
        [252.20888824, 252.06634064, 251.96010895, ..., 252.58323059,
         252.44898682, 252.3343277 ],
...
        [255.82539368, 255.98866272, 256.29229736, ..., 255.67448425,
         255.68865967, 255.80763245],
        [254.47499084, 254.69308472, 254.74490356, ..., 254.39610291,
         254.50894165, 254.38670349],
        [252.46173096, 252.46173096, 252.46173096, ..., 252.46173096,
         252.46173096, 252.46173096]],

       [[251.06691589, 251.06691589, 251.06691589, ..., 251.06691589,
         251.06691589, 251.06691589],
        [251.23405151, 251.20076904, 251.19150391, ..., 251.38855896,
         251.34030457, 251.29734802],
        [251.88744507, 251.86525269, 251.8250824 , ..., 252.6361908 ,
         252.57929688, 252.17391968],
        ...,
        [254.95679626, 255.5537323 , 255.38959045, ..., 254.92901306,
         254.91663208, 254.88747559],
        [254.23662109, 254.21036072, 254.17580261, ..., 253.96891174,
         254.10254822, 254.2250061 ],
        [252.87672119, 252.87672119, 252.87672119, ..., 252.87672119,
         252.87672119, 252.87672119]]])
Coordinates:
  * lat       (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0
  * lon       (lon) float64 0.0 1.875 3.75 5.625 7.5 ... 352.5 354.4 356.2 358.1
    quantile  float64 0.9
Dimensions without coordinates: dayofyear