## 02_Probabilistic.ipynb¶

This notebook shows how to use probabilistic metrics in a typical data science task where the data is a pandas.DataFrame.

The metric Continuous Ranked Probability Score (CRPS) is used to verify multiple forecasts for the same target.

In [1]:
import xarray as xr
import pandas as pd
import numpy as np
import xskillscore as xs


Use the same data as in 01_Deterministic.ipynb:

In [2]:
stores = np.arange(4)
skus = np.arange(3)
dates = pd.date_range("1/1/2020", "1/5/2020", freq="D")

rows = []
for date in dates:
    for store in stores:
        for sku in skus:
            rows.append(
                {
                    "DATE": date,
                    "STORE": store,
                    "SKU": sku,
                    "QUANTITY_SOLD": np.random.randint(9) + 1,
                }
            )
df = pd.DataFrame(rows)
df.rename(columns={"QUANTITY_SOLD": "y"}, inplace=True)
df.set_index(['DATE', 'STORE', 'SKU'], inplace=True)

Out[2]:
                      y
DATE       STORE SKU
2020-01-01 0     0    9
                 1    2
                 2    4
           1     0    1
                 1    5

Instead of making a single prediction as in 01_Deterministic.ipynb we will make multiple forecasts (an ensemble forecast). This is akin to more complex methods such as bagging, boosting and stacking.

Generate 6 forecasts and append them to the pandas.DataFrame, using an extra column called member to label each forecast. Save the result in a new pandas.DataFrame called df_yhat:

In [3]:
tmp = df.copy()
for i in range(1, 7):
    tmp['member'] = i
    noise = np.random.uniform(-1, 1, size=len(df['y']))
    tmp['yhat'] = (df['y'] + (df['y'] * noise)).astype(int)
    if i == 1:
        df_yhat = tmp.copy()
    else:
        df_yhat = pd.concat([df_yhat, tmp])
df_yhat

Out[3]:
                      y  member  yhat
DATE       STORE SKU
2020-01-01 0     0    9       1    10
                 1    2       1     0
                 2    4       1     2
           1     0    1       1     0
                 1    5       1     5
...                 ...     ...   ...
2020-01-05 2     1    8       6     7
                 2    2       6     0
           3     0    8       6     8
                 1    3       6     1
                 2    5       6     9

360 rows × 3 columns

Drop the y column from df_yhat and add member to the MultiIndex:

In [4]:
df_yhat.drop('y', axis=1, inplace=True)
df_yhat.set_index(['member'], append=True, inplace=True)
df_yhat

Out[4]:
                             yhat
DATE       STORE SKU member
2020-01-01 0     0   1         10
                 1   1          0
                 2   1          2
           1     0   1          0
                 1   1          5
...                           ...
2020-01-05 2     1   6          7
                 2   6          0
           3     0   6          8
                 1   6          1
                 2   6          9

360 rows × 1 columns

Convert the target pandas.DataFrame (df) to an xarray.Dataset:

In [5]:
ds = df.to_xarray()
ds

Out[5]:
xarray.Dataset
• DATE: 5
• SKU: 3
• STORE: 4
• DATE
(DATE)
datetime64[ns]
2020-01-01 ... 2020-01-05
array(['2020-01-01T00:00:00.000000000', '2020-01-02T00:00:00.000000000',
'2020-01-03T00:00:00.000000000', '2020-01-04T00:00:00.000000000',
'2020-01-05T00:00:00.000000000'], dtype='datetime64[ns]')
• STORE
(STORE)
int64
0 1 2 3
array([0, 1, 2, 3])
• SKU
(SKU)
int64
0 1 2
array([0, 1, 2])
• y
(DATE, STORE, SKU)
int64
9 2 4 1 5 8 6 3 ... 3 4 8 8 2 8 3 5
array([[[9, 2, 4],
[1, 5, 8],
[6, 3, 1],
[6, 4, 5]],

[[3, 4, 5],
[5, 6, 8],
[5, 4, 1],
[8, 2, 7]],

[[8, 3, 2],
[4, 4, 2],
[2, 3, 5],
[1, 4, 6]],

[[2, 3, 9],
[2, 7, 1],
[4, 1, 9],
[4, 3, 9]],

[[2, 8, 7],
[7, 3, 4],
[8, 8, 2],
[8, 3, 5]]])

Now add the predicted pandas.DataFrame (df_yhat) as an xarray.DataArray called yhat to the xarray.Dataset:

In [6]:
ds['yhat'] = df_yhat.to_xarray()['yhat']
ds

Out[6]:
xarray.Dataset
• DATE: 5
• SKU: 3
• STORE: 4
• member: 6
• DATE
(DATE)
datetime64[ns]
2020-01-01 ... 2020-01-05
array(['2020-01-01T00:00:00.000000000', '2020-01-02T00:00:00.000000000',
'2020-01-03T00:00:00.000000000', '2020-01-04T00:00:00.000000000',
'2020-01-05T00:00:00.000000000'], dtype='datetime64[ns]')
• STORE
(STORE)
int64
0 1 2 3
array([0, 1, 2, 3])
• SKU
(SKU)
int64
0 1 2
array([0, 1, 2])
• member
(member)
int64
1 2 3 4 5 6
array([1, 2, 3, 4, 5, 6])
• y
(DATE, STORE, SKU)
int64
9 2 4 1 5 8 6 3 ... 3 4 8 8 2 8 3 5
array([[[9, 2, 4],
[1, 5, 8],
[6, 3, 1],
[6, 4, 5]],

[[3, 4, 5],
[5, 6, 8],
[5, 4, 1],
[8, 2, 7]],

[[8, 3, 2],
[4, 4, 2],
[2, 3, 5],
[1, 4, 6]],

[[2, 3, 9],
[2, 7, 1],
[4, 1, 9],
[4, 3, 9]],

[[2, 8, 7],
[7, 3, 4],
[8, 8, 2],
[8, 3, 5]]])
• yhat
(DATE, STORE, SKU, member)
int64
10 10 1 10 16 11 0 ... 5 4 5 8 1 9
array([[[[10, 10,  1, 10, 16, 11],
[ 0,  3,  1,  0,  2,  0],
[ 2,  5,  7,  2,  0,  1]],

[[ 0,  1,  0,  0,  0,  0],
[ 5,  0,  8,  8,  9,  1],
[ 1,  4, 15,  2,  1, 10]],

[[ 0,  3,  5,  4,  5,  5],
[ 3,  2,  0,  1,  4,  1],
[ 1,  1,  0,  0,  0,  1]],

[[10,  3, 11, 10, 10,  0],
[ 3,  5,  3,  3,  1,  2],
[ 4,  3,  6,  4,  0,  2]]],

[[[ 4,  3,  0,  1,  0,  0],
[ 5,  3,  7,  6,  1,  1],
[ 0,  7,  8,  6,  5,  1]],

[[ 5,  1,  8,  6,  3,  6],
[ 6,  9, 11,  4, 10,  6],
[ 0,  5,  5,  6,  3, 15]],

[[ 2,  5,  9,  2,  4,  5],
[ 2,  3,  1,  3,  6,  6],
[ 0,  0,  1,  1,  0,  0]],

[[15,  2, 13, 10,  4,  4],
[ 0,  2,  0,  0,  3,  2],
[ 6,  7,  8,  1, 10,  7]]],

[[[ 6,  1, 12,  7,  5, 10],
[ 3,  0,  0,  1,  1,  3],
[ 1,  2,  0,  2,  2,  3]],

[[ 6,  5,  5,  7,  1,  1],
[ 6,  3,  2,  3,  1,  2],
[ 0,  0,  3,  0,  1,  3]],

[[ 1,  3,  0,  1,  0,  3],
[ 0,  5,  0,  2,  2,  0],
[ 7,  9,  6,  5,  2,  1]],

[[ 0,  1,  0,  1,  0,  0],
[ 4,  5,  6,  1,  7,  4],
[ 0,  7,  4,  6,  5,  0]]],

[[[ 0,  3,  0,  1,  1,  2],
[ 3,  4,  3,  5,  5,  0],
[ 2, 15, 11, 16, 11, 14]],

[[ 3,  2,  2,  3,  1,  1],
[ 8,  9,  2,  6,  0, 12],
[ 0,  0,  0,  1,  0,  1]],

[[ 5,  0,  2,  1,  3,  2],
[ 1,  0,  1,  0,  0,  0],
[ 7,  8,  4,  5,  6,  4]],

[[ 7,  6,  5,  5,  3,  7],
[ 0,  0,  4,  2,  0,  3],
[ 5, 10,  1,  8,  0,  0]]],

[[[ 3,  1,  2,  3,  1,  3],
[ 8, 11, 15,  6, 11, 13],
[12,  7, 13,  5, 12,  8]],

[[ 6,  6,  4,  1,  9, 13],
[ 3,  1,  4,  0,  4,  3],
[ 6,  7,  0,  7,  4,  3]],

[[ 1,  7, 11,  0, 11,  2],
[ 7,  0,  3, 15,  6,  7],
[ 3,  1,  2,  2,  1,  0]],

[[ 6,  2,  1,  2,  9,  8],
[ 5,  0,  4,  1,  2,  1],
[ 5,  4,  5,  8,  1,  9]]]])

Notice how an xarray.Dataset can hold data variables with different shapes as long as they share some dimensions.
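To illustrate this with a minimal sketch (toy data, dimension names made up for the example): one variable has a single dimension while the other adds an extra member dimension, yet both live in the same xarray.Dataset because they share the x dimension.

```python
import numpy as np
import xarray as xr

# "y" is 1-D over x; "yhat" adds a "member" dimension on top of the shared x.
ds_demo = xr.Dataset(
    {
        "y": ("x", np.arange(3.0)),
        "yhat": (("x", "member"), np.arange(6.0).reshape(3, 2)),
    }
)
print(ds_demo["y"].dims)     # ('x',)
print(ds_demo["yhat"].dims)  # ('x', 'member')
```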

# Using xskillscore - CRPS¶

Continuous Ranked Probability Score (CRPS) can also be considered as the probabilistic Mean Absolute Error. It compares the empirical distribution of an ensemble forecast to a scalar observation it is given as:

\begin{align} CRPS = \int_{-\infty}^{\infty} (F(f) - H(f - o))^{2} df \end{align}

where F(f) is the cumulative distribution function (CDF) of the forecast and H(f - o) is the Heaviside step function, which takes the value 1 if the argument is positive (the prediction overestimates the target) and 0 otherwise (the prediction is equal to or lower than the target).
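To make the definition concrete, here is a minimal NumPy sketch (toy numbers, not part of the notebook's data) that evaluates the integral numerically for a 3-member ensemble, and checks it against the equivalent pairwise form CRPS = mean|x_i - o| - ½ mean|x_i - x_j|:

```python
import numpy as np

ensemble = np.array([1.0, 2.0, 3.0])  # toy forecast members
obs = 2.0                             # toy observation

# Numerical integral of (F(f) - H(f - o))^2 df on a fine grid
f = np.linspace(-5.0, 10.0, 150001)
dx = f[1] - f[0]
F = (ensemble[:, None] <= f).mean(axis=0)  # empirical CDF of the ensemble
H = (f >= obs).astype(float)               # Heaviside step at the observation
crps_integral = (((F - H) ** 2)[:-1]).sum() * dx

# Equivalent pairwise form: mean|x_i - o| - 0.5 * mean|x_i - x_j|
crps_pairwise = np.abs(ensemble - obs).mean() - 0.5 * np.abs(
    ensemble[:, None] - ensemble[None, :]
).mean()

print(crps_integral, crps_pairwise)  # both ≈ 0.2222 (= 2/9)
```

Both routes give 2/9 here: the integrand is (1/3)² on [1, 2) and again on [2, 3), and zero elsewhere.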

CRPS is not a common verification metric; in most cases the ensemble members are averaged and the result is verified using deterministic metrics.

For example, you can see that averaging over the member dimension gives a better prediction than any individual member:

Note: for this we use the function itself instead of the accessor method:

In [7]:
avg_member_rmse = xs.rmse(
    ds["y"], ds["yhat"].mean(dim="member"), ["DATE", "STORE", "SKU"]
)
print("avg_member_rmse: ", avg_member_rmse.values)
for i in range(len(ds.coords["member"])):
    print(f"member {i + 1}:")
    ind_member_rmse = xs.rmse(
        ds["y"], ds["yhat"].sel(member=i + 1), ["DATE", "STORE", "SKU"]
    )
    print("ind_member_rmse: ", ind_member_rmse.values)
    assert avg_member_rmse < ind_member_rmse

avg_member_rmse:  1.5302505241517372
member 1:
ind_member_rmse:  2.958039891549808
member 2:
ind_member_rmse:  2.717228980659034
member 3:
ind_member_rmse:  3.361547262794322
member 4:
ind_member_rmse:  2.869378562220979
member 5:
ind_member_rmse:  3.082207001484488
member 6:
ind_member_rmse:  3.119829055146024


However, you will see CRPS appear in some Kaggle competitions, such as the NFL Big Data Bowl and the Second Annual Data Science Bowl, so it is good to have in your arsenal.

CRPS reduces over the member dimension only, and therefore takes just two arguments, the observations and the ensemble forecasts:

In [8]:
ds.xs.crps_ensemble('y', 'yhat')

Out[8]:
<xarray.DataArray (DATE: 5, STORE: 4, SKU: 3)>
array([[[1.16666667, 0.72222222, 1.19444444],
        [0.69444444, 1.25      , 2.75      ],
        [1.44444444, 0.75      , 0.25      ],
        [2.22222222, 0.86111111, 1.13888889]],

       [[1.16666667, 0.86111111, 0.86111111],
        [0.58333333, 0.94444444, 2.33333333],
        [0.58333333, 0.80555556, 0.44444444],
        [1.94444444, 0.52777778, 0.41666667]],

       [[1.19444444, 1.        , 0.16666667],
        [0.91666667, 1.02777778, 0.80555556],
        [0.66666667, 1.25      , 0.77777778],
        [0.44444444, 0.47222222, 1.16666667]],

       [[0.58333333, 0.44444444, 2.47222222],
        [0.22222222, 1.19444444, 0.44444444],
        [1.30555556, 0.44444444, 2.5       ],
        [1.08333333, 0.97222222, 3.16666667]],

       [[0.36111111, 1.66666667, 1.52777778],
        [1.08333333, 0.36111111, 0.80555556],
        [2.16666667, 1.55555556, 0.30555556],
        [1.94444444, 0.86111111, 0.55555556]]])
Coordinates:
  * DATE     (DATE) datetime64[ns] 2020-01-01 ... 2020-01-05
  * STORE    (STORE) int64 0 1 2 3
  * SKU      (SKU) int64 0 1 2

To return an overall CRPS, average over the DATE, STORE and SKU dimensions before calling crps_ensemble:

In [9]:
y = ds['y'].mean(dim=['DATE', 'STORE', 'SKU'])
yhat = ds['yhat'].mean(dim=['DATE', 'STORE', 'SKU'])
xs.crps_ensemble(y, yhat)

Out[9]:
<xarray.DataArray ()>
array(0.56481481)