Welcome to the xskillscore tutorial.

This tutorial was created for a talk at the Data Science Study Group: South Florida on April 1st, 2020. The slides for the talk can be found here.

The repository for this tutorial is hosted on GitHub here: xskillscore-tutorial.

`xskillscore` was developed by Ray Bell while at the University of Miami during the SubX project in 2018.

In 2019, Aaron Spring, Andrew Huang and Riley Brady greatly improved `xskillscore`. They provided upstream fixes and enhancements to `xskillscore`, as it is used extensively in climpred.

The verification metrics in `xskillscore` are split into two types: **deterministic** and **probabilistic**.

**Deterministic** metrics consist of correlation metrics (e.g. Pearson's r) and distance metrics (e.g. root-mean-square error). These metrics adapt the implementations in `scikit-learn` and `scipy.stats`.
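To make the two families of deterministic metrics concrete, here is a minimal NumPy sketch of one correlation metric (Pearson's r) and one distance metric (RMSE). The function names and sample values are hypothetical and chosen for illustration only; this is not the `xskillscore` implementation, just the textbook formulas the metrics are based on.

```python
import numpy as np


def pearson_r(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Work with anomalies (deviations from the mean).
    a_anom = a - a.mean()
    b_anom = b - b.mean()
    covariance = (a_anom * b_anom).sum()
    scale = np.sqrt((a_anom ** 2).sum() * (b_anom ** 2).sum())
    return float(covariance / scale)


def rmse(a, b):
    """Root-mean-square error between two 1-D arrays."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sqrt(np.mean((a - b) ** 2)))


# Hypothetical observations and forecasts.
observed = np.array([10.0, 12.0, 9.0, 15.0])
forecast = np.array([11.0, 11.0, 10.0, 13.0])

r = pearson_r(observed, forecast)   # correlation: closer to 1 is better
error = rmse(observed, forecast)    # distance: closer to 0 is better
print(f"pearson r = {r:.3f}, rmse = {error:.3f}")
```

A correlation metric rewards forecasts that move with the observations, while a distance metric penalises the size of the errors, so the two can disagree for a forecast that is well correlated but biased.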

**Probabilistic** metrics can be calculated when the forecast consists of multiple forecasts for the same target. Examples include the Continuous Ranked Probability Score and the Brier Score.
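As a rough illustration of what these two scores measure, the sketch below computes them with plain NumPy. It assumes one common ensemble estimator of the CRPS (mean absolute error of the members minus half the mean absolute difference between members) and the usual mean-squared-difference form of the Brier Score. The helper names and numbers are hypothetical, and the library's own functions may differ in details.

```python
import numpy as np


def ensemble_crps(observation, members):
    """CRPS of one ensemble forecast for a single observed value.

    Uses the estimator mean|x_i - y| - 0.5 * mean|x_i - x_j|.
    """
    members = np.asarray(members, dtype=float)
    # Average distance between each member and the observation.
    error_term = np.abs(members - observation).mean()
    # Average distance between all pairs of members (ensemble spread).
    spread_term = np.abs(members[:, None] - members[None, :]).mean()
    return float(error_term - 0.5 * spread_term)


def brier(outcomes, probabilities):
    """Brier Score: mean squared difference between the forecast
    probability and the binary outcome (1 = event happened)."""
    outcomes = np.asarray(outcomes, dtype=float)
    probabilities = np.asarray(probabilities, dtype=float)
    return float(np.mean((probabilities - outcomes) ** 2))


# Three hypothetical forecasts (members) for one target value of 10.
crps = ensemble_crps(10.0, [8.0, 10.0, 12.0])

# Hypothetical event probabilities against what actually happened.
bs = brier([1, 0, 1], [0.9, 0.2, 0.6])
```

With a single member the spread term vanishes and the CRPS reduces to the absolute error, which is why it is often described as a generalisation of the mean absolute error to multiple forecasts.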

`xskillscore` works on `xarray` objects, which requires the data to be castable to an `ndarray`. It works with `numpy.array`, `pandas.DataFrame` and `dask.array`.

You can see the metrics available in `xskillscore` by running `dir(xs)`:

In [1]:

```
import xskillscore as xs
dir(xs)
```


In this notebook I show how `xskillscore` can be dropped into a typical data science task where the data is a `pandas.DataFrame`.

I use the metric root-mean-squared error (RMSE) to verify forecasts of items sold.

I also show how you can apply weights to the verification and handle missing values.
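The effect of weights and missing values on RMSE can be sketched with plain NumPy. The function below is a hypothetical helper, not the `xskillscore` API: it assumes the weighted RMSE is the square root of the weighted mean of the squared errors, and that skipping missing values means dropping every pair in which either value is NaN. The item counts and weights are made up.

```python
import numpy as np


def weighted_rmse(actual, predicted, weights=None, skipna=False):
    """RMSE with optional weights and optional NaN skipping."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    if weights is None:
        weights = np.ones_like(actual)  # equal weights
    else:
        weights = np.asarray(weights, dtype=float)

    if skipna:
        # Keep only the pairs where both values are present.
        valid = ~(np.isnan(actual) | np.isnan(predicted))
        actual = actual[valid]
        predicted = predicted[valid]
        weights = weights[valid]

    squared_error = (actual - predicted) ** 2
    return float(np.sqrt((weights * squared_error).sum() / weights.sum()))


# Hypothetical items sold (one value missing) and their forecasts.
items_sold = np.array([10.0, 20.0, np.nan, 40.0])
forecast = np.array([12.0, 18.0, 30.0, 44.0])
# Hypothetical weights: the last item counts twice as much.
weights = np.array([1.0, 1.0, 1.0, 2.0])

plain = weighted_rmse(items_sold, forecast, skipna=True)
weighted = weighted_rmse(items_sold, forecast, weights=weights, skipna=True)
with_nan = weighted_rmse(items_sold, forecast)  # NaN propagates
```

Without skipping, a single missing value turns the whole score into NaN, which is why handling missing values explicitly matters in practice.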

This notebook shows how to use probabilistic metrics in a typical data science task where the data is a `pandas.DataFrame`.

The metric Continuous Ranked Probability Score (CRPS) is used to verify multiple forecasts for the same target.

`xarray` can handle big data, and therefore so can `xskillscore`.

In this notebook I verify 12 million forecasts in a couple of seconds using the RMSE metric on a `dask.array`.
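The reason a metric like RMSE scales to large arrays is that it can be computed chunk by chunk and the partial results combined at the end. The sketch below mimics that idea by hand with plain NumPy rather than dask, on a smaller synthetic array. The function name, chunk size and data are assumptions made for illustration.

```python
import numpy as np


def chunked_rmse(actual, predicted, chunk_size):
    """RMSE accumulated over fixed-size chunks of two 1-D arrays."""
    total_squared_error = 0.0
    count = 0
    for start in range(0, actual.shape[0], chunk_size):
        stop = start + chunk_size
        # Only one chunk of differences is held in memory at a time.
        diff = actual[start:stop] - predicted[start:stop]
        total_squared_error += float((diff ** 2).sum())
        count += diff.size
    return (total_squared_error / count) ** 0.5


# Synthetic data: forecasts are the truth plus noise with std 0.5.
rng = np.random.default_rng(0)
actual = rng.normal(size=1_200_000)
predicted = actual + rng.normal(scale=0.5, size=actual.size)

in_chunks = chunked_rmse(actual, predicted, chunk_size=100_000)
all_at_once = float(np.sqrt(np.mean((actual - predicted) ** 2)))
```

Both approaches give the same score up to floating-point rounding. A `dask.array` automates this kind of chunking and can evaluate the chunks in parallel.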

This tutorial was adapted from the dask-tutorial.

The interactive session is hosted by Binder and runs on Google Kubernetes Engine (GKE).
