This notebook shows how to use probabilistic metrics in a typical data science task where the data is a pandas.DataFrame.

The metric Continuous Ranked Probability Score (CRPS) is used to verify multiple forecasts for the same target.

In [1]:
import xarray as xr
import pandas as pd
import numpy as np
import xskillscore as xs

Use the same data as in 01_Deterministic.ipynb:

In [2]:
stores = np.arange(4)
skus = np.arange(3)
dates = pd.date_range("1/1/2020", "1/5/2020", freq="D")

rows = []
for date in dates:
    for store in stores:
        for sku in skus:
            rows.append(
                dict(
                    {
                        "DATE": date,
                        "STORE": store,
                        "SKU": sku,
                        "QUANTITY_SOLD": np.random.randint(9) + 1,
                    }
                )
            )
df = pd.DataFrame(rows)
df.rename(columns={"QUANTITY_SOLD": "y"}, inplace=True)
df.set_index(['DATE', 'STORE', 'SKU'], inplace=True)
df.head()
Out[2]:
y
DATE STORE SKU
2020-01-01 0 0 9
1 2
2 4
1 0 1
1 5

Instead of making a single prediction as in 01_Deterministic.ipynb, we will make multiple forecasts (an ensemble forecast). This is akin to more complex methods such as bagging, boosting and stacking.

Do 6 forecasts and append them to the pandas.DataFrame using an extra field called member. This will be saved in a new pandas.DataFrame called df_yhat:

In [3]:
tmp = df.copy()
forecasts = []
for i in range(1, 7):
    tmp['member'] = i
    noise = np.random.uniform(-1, 1, size=len(df['y']))
    tmp['yhat'] = (df['y'] + (df['y'] * noise)).astype(int)
    forecasts.append(tmp.copy())
# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
df_yhat = pd.concat(forecasts)
df_yhat
Out[3]:
y member yhat
DATE STORE SKU
2020-01-01 0 0 9 1 10
1 2 1 0
2 4 1 2
1 0 1 1 0
1 5 1 5
... ... ... ... ... ...
2020-01-05 2 1 8 6 7
2 2 6 0
3 0 8 6 8
1 3 6 1
2 5 6 9

360 rows × 3 columns

Drop the y column from df_yhat and add member to the MultiIndex:

In [4]:
df_yhat.drop('y', axis=1, inplace=True)
df_yhat.set_index(['member'], append=True, inplace=True)
df_yhat
Out[4]:
yhat
DATE STORE SKU member
2020-01-01 0 0 1 10
1 1 0
2 1 2
1 0 1 0
1 1 5
... ... ... ... ...
2020-01-05 2 1 6 7
2 6 0
3 0 6 8
1 6 1
2 6 9

360 rows × 1 columns

Convert the target pandas.DataFrame (df) to an xarray.Dataset:

In [5]:
ds = df.to_xarray()
ds
Out[5]:
xarray.Dataset
    • DATE: 5
    • SKU: 3
    • STORE: 4
    • DATE
      (DATE)
      datetime64[ns]
      2020-01-01 ... 2020-01-05
      array(['2020-01-01T00:00:00.000000000', '2020-01-02T00:00:00.000000000',
             '2020-01-03T00:00:00.000000000', '2020-01-04T00:00:00.000000000',
             '2020-01-05T00:00:00.000000000'], dtype='datetime64[ns]')
    • STORE
      (STORE)
      int64
      0 1 2 3
      array([0, 1, 2, 3])
    • SKU
      (SKU)
      int64
      0 1 2
      array([0, 1, 2])
    • y
      (DATE, STORE, SKU)
      int64
      9 2 4 1 5 8 6 3 ... 3 4 8 8 2 8 3 5
      array([[[9, 2, 4],
              [1, 5, 8],
              [6, 3, 1],
              [6, 4, 5]],
      
             [[3, 4, 5],
              [5, 6, 8],
              [5, 4, 1],
              [8, 2, 7]],
      
             [[8, 3, 2],
              [4, 4, 2],
              [2, 3, 5],
              [1, 4, 6]],
      
             [[2, 3, 9],
              [2, 7, 1],
              [4, 1, 9],
              [4, 3, 9]],
      
             [[2, 8, 7],
              [7, 3, 4],
              [8, 8, 2],
              [8, 3, 5]]])

Now add the predicted pandas.DataFrame (df_yhat) as an xarray.DataArray called yhat to the xarray.Dataset:

In [6]:
ds['yhat'] = df_yhat.to_xarray()['yhat']
ds
Out[6]:
xarray.Dataset
    • DATE: 5
    • SKU: 3
    • STORE: 4
    • member: 6
    • DATE
      (DATE)
      datetime64[ns]
      2020-01-01 ... 2020-01-05
      array(['2020-01-01T00:00:00.000000000', '2020-01-02T00:00:00.000000000',
             '2020-01-03T00:00:00.000000000', '2020-01-04T00:00:00.000000000',
             '2020-01-05T00:00:00.000000000'], dtype='datetime64[ns]')
    • STORE
      (STORE)
      int64
      0 1 2 3
      array([0, 1, 2, 3])
    • SKU
      (SKU)
      int64
      0 1 2
      array([0, 1, 2])
    • member
      (member)
      int64
      1 2 3 4 5 6
      array([1, 2, 3, 4, 5, 6])
    • y
      (DATE, STORE, SKU)
      int64
      9 2 4 1 5 8 6 3 ... 3 4 8 8 2 8 3 5
      array([[[9, 2, 4],
              [1, 5, 8],
              [6, 3, 1],
              [6, 4, 5]],
      
             [[3, 4, 5],
              [5, 6, 8],
              [5, 4, 1],
              [8, 2, 7]],
      
             [[8, 3, 2],
              [4, 4, 2],
              [2, 3, 5],
              [1, 4, 6]],
      
             [[2, 3, 9],
              [2, 7, 1],
              [4, 1, 9],
              [4, 3, 9]],
      
             [[2, 8, 7],
              [7, 3, 4],
              [8, 8, 2],
              [8, 3, 5]]])
    • yhat
      (DATE, STORE, SKU, member)
      int64
      10 10 1 10 16 11 0 ... 5 4 5 8 1 9
      array([[[[10, 10,  1, 10, 16, 11],
               [ 0,  3,  1,  0,  2,  0],
               [ 2,  5,  7,  2,  0,  1]],
      
              [[ 0,  1,  0,  0,  0,  0],
               [ 5,  0,  8,  8,  9,  1],
               [ 1,  4, 15,  2,  1, 10]],
      
              [[ 0,  3,  5,  4,  5,  5],
               [ 3,  2,  0,  1,  4,  1],
               [ 1,  1,  0,  0,  0,  1]],
      
              [[10,  3, 11, 10, 10,  0],
               [ 3,  5,  3,  3,  1,  2],
               [ 4,  3,  6,  4,  0,  2]]],
      
      
             [[[ 4,  3,  0,  1,  0,  0],
               [ 5,  3,  7,  6,  1,  1],
               [ 0,  7,  8,  6,  5,  1]],
      
              [[ 5,  1,  8,  6,  3,  6],
               [ 6,  9, 11,  4, 10,  6],
               [ 0,  5,  5,  6,  3, 15]],
      
              [[ 2,  5,  9,  2,  4,  5],
               [ 2,  3,  1,  3,  6,  6],
               [ 0,  0,  1,  1,  0,  0]],
      
              [[15,  2, 13, 10,  4,  4],
               [ 0,  2,  0,  0,  3,  2],
               [ 6,  7,  8,  1, 10,  7]]],
      
      
             [[[ 6,  1, 12,  7,  5, 10],
               [ 3,  0,  0,  1,  1,  3],
               [ 1,  2,  0,  2,  2,  3]],
      
              [[ 6,  5,  5,  7,  1,  1],
               [ 6,  3,  2,  3,  1,  2],
               [ 0,  0,  3,  0,  1,  3]],
      
              [[ 1,  3,  0,  1,  0,  3],
               [ 0,  5,  0,  2,  2,  0],
               [ 7,  9,  6,  5,  2,  1]],
      
              [[ 0,  1,  0,  1,  0,  0],
               [ 4,  5,  6,  1,  7,  4],
               [ 0,  7,  4,  6,  5,  0]]],
      
      
             [[[ 0,  3,  0,  1,  1,  2],
               [ 3,  4,  3,  5,  5,  0],
               [ 2, 15, 11, 16, 11, 14]],
      
              [[ 3,  2,  2,  3,  1,  1],
               [ 8,  9,  2,  6,  0, 12],
               [ 0,  0,  0,  1,  0,  1]],
      
              [[ 5,  0,  2,  1,  3,  2],
               [ 1,  0,  1,  0,  0,  0],
               [ 7,  8,  4,  5,  6,  4]],
      
              [[ 7,  6,  5,  5,  3,  7],
               [ 0,  0,  4,  2,  0,  3],
               [ 5, 10,  1,  8,  0,  0]]],
      
      
             [[[ 3,  1,  2,  3,  1,  3],
               [ 8, 11, 15,  6, 11, 13],
               [12,  7, 13,  5, 12,  8]],
      
              [[ 6,  6,  4,  1,  9, 13],
               [ 3,  1,  4,  0,  4,  3],
               [ 6,  7,  0,  7,  4,  3]],
      
              [[ 1,  7, 11,  0, 11,  2],
               [ 7,  0,  3, 15,  6,  7],
               [ 3,  1,  2,  2,  1,  0]],
      
              [[ 6,  2,  1,  2,  9,  8],
               [ 5,  0,  4,  1,  2,  1],
               [ 5,  4,  5,  8,  1,  9]]]])

Notice how an xarray.Dataset can hold data variables of different shapes (y has no member dimension while yhat does) as long as they share some dimensions.
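This shared-dimension layout is also what makes arithmetic between the two variables work: operations align the common dimensions and broadcast across the extra ones. A minimal sketch with a toy Dataset (the values here are illustrative, not taken from the data above):

```python
import numpy as np
import xarray as xr

# Toy Dataset: "y" lacks the "member" dimension that "yhat" has.
toy = xr.Dataset(
    {
        "y": ("DATE", np.array([3.0, 5.0])),
        "yhat": (("DATE", "member"), np.array([[2.0, 4.0], [5.0, 6.0]])),
    }
)

# Subtraction broadcasts "y" across the extra "member" dimension.
err = toy["yhat"] - toy["y"]
print(err.dims)    # ('DATE', 'member')
print(err.values)  # [[-1.  1.] [ 0.  1.]]
```

This is why metrics below can consume y and yhat side by side without any manual reshaping.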

Using xskillscore - CRPS

Continuous Ranked Probability Score (CRPS) can be thought of as the probabilistic analogue of the Mean Absolute Error. It compares the empirical distribution of an ensemble forecast to a scalar observation and is given as:

\begin{align} CRPS = \int_{-\infty}^{\infty} (F(f) - H(f - o))^{2} df \end{align}

where F(f) is the cumulative distribution function (CDF) of the forecast and H() is the Heaviside step function, whose value is 1 if the argument is positive (the prediction overestimates the target) and 0 otherwise (the prediction is equal to or lower than the target).

See https://climpred.readthedocs.io/en/stable/metrics.html#continuous-ranked-probability-score-crps for further documentation.
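For a finite m-member ensemble the integral above has a well-known closed form: the mean absolute difference between the members and the observation, minus half the mean absolute difference between all member pairs. A minimal NumPy sketch of that identity (crps_empirical is a helper name made up for this example, not part of xskillscore):

```python
import numpy as np

def crps_empirical(ensemble, obs):
    """CRPS of an ensemble forecast against a scalar observation,
    via the identity CRPS = E|X - o| - 0.5 * E|X - X'|."""
    x = np.asarray(ensemble, dtype=float)
    mae = np.abs(x - obs).mean()                    # mean distance to the observation
    spread = np.abs(x[:, None] - x[None, :]).mean() # mean pairwise distance between members
    return mae - 0.5 * spread

print(crps_empirical([1, 2, 3], 2))  # 0.2222...
```

For a perfect deterministic forecast (all members equal to the observation) both terms vanish and the score is 0, which is consistent with the MAE analogy.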

CRPS is not a common verification metric in data science; in most cases the ensemble predictions are averaged and then verified using deterministic metrics.

For example, you can see averaging on the member dimension gives a better prediction than any individual prediction:

Note: for this we will use the function itself instead of the Accessor method:

In [7]:
avg_member_rmse = xs.rmse(
    ds["y"], ds["yhat"].mean(dim="member"), ["DATE", "STORE", "SKU"]
)
print("avg_member_rmse: ", avg_member_rmse.values)
for i in range(len(ds.coords["member"])):
    print(f"member {i + 1}:")
    ind_member_rmse = xs.rmse(
        ds["y"], ds["yhat"].sel(member=i + 1), ["DATE", "STORE", "SKU"]
    )
    print("ind_member_rmse: ", ind_member_rmse.values)
    
    assert avg_member_rmse < ind_member_rmse
avg_member_rmse:  1.5302505241517372
member 1:
ind_member_rmse:  2.958039891549808
member 2:
ind_member_rmse:  2.717228980659034
member 3:
ind_member_rmse:  3.361547262794322
member 4:
ind_member_rmse:  2.869378562220979
member 5:
ind_member_rmse:  3.082207001484488
member 6:
ind_member_rmse:  3.119829055146024

However, you will see it appear in some Kaggle competitions such as the NFL Big Data Bowl and the Second Annual Data Science Bowl so it's good to have in your arsenal.

The CRPS reduces only over the member dimension, so unlike the deterministic metrics crps_ensemble takes just 2 arguments:

In [8]:
ds.xs.crps_ensemble('y', 'yhat')
Out[8]:
xarray.DataArray
  • DATE: 5
  • STORE: 4
  • SKU: 3
  • 1.167 0.7222 1.194 0.6944 1.25 ... 1.556 0.3056 1.944 0.8611 0.5556
    array([[[1.16666667, 0.72222222, 1.19444444],
            [0.69444444, 1.25      , 2.75      ],
            [1.44444444, 0.75      , 0.25      ],
            [2.22222222, 0.86111111, 1.13888889]],
    
           [[1.16666667, 0.86111111, 0.86111111],
            [0.58333333, 0.94444444, 2.33333333],
            [0.58333333, 0.80555556, 0.44444444],
            [1.94444444, 0.52777778, 0.41666667]],
    
           [[1.19444444, 1.        , 0.16666667],
            [0.91666667, 1.02777778, 0.80555556],
            [0.66666667, 1.25      , 0.77777778],
            [0.44444444, 0.47222222, 1.16666667]],
    
           [[0.58333333, 0.44444444, 2.47222222],
            [0.22222222, 1.19444444, 0.44444444],
            [1.30555556, 0.44444444, 2.5       ],
            [1.08333333, 0.97222222, 3.16666667]],
    
           [[0.36111111, 1.66666667, 1.52777778],
            [1.08333333, 0.36111111, 0.80555556],
            [2.16666667, 1.55555556, 0.30555556],
            [1.94444444, 0.86111111, 0.55555556]]])
    • DATE
      (DATE)
      datetime64[ns]
      2020-01-01 ... 2020-01-05
      array(['2020-01-01T00:00:00.000000000', '2020-01-02T00:00:00.000000000',
             '2020-01-03T00:00:00.000000000', '2020-01-04T00:00:00.000000000',
             '2020-01-05T00:00:00.000000000'], dtype='datetime64[ns]')
    • STORE
      (STORE)
      int64
      0 1 2 3
      array([0, 1, 2, 3])
    • SKU
      (SKU)
      int64
      0 1 2
      array([0, 1, 2])

To return an overall CRPS it is recommended to average over all dimensions before applying crps_ensemble:

In [9]:
y = ds['y'].mean(dim=['DATE', 'STORE', 'SKU'])
yhat = ds['yhat'].mean(dim=['DATE', 'STORE', 'SKU'])
xs.crps_ensemble(y, yhat)
Out[9]:
xarray.DataArray
  • 0.5648
    array(0.56481481)