In forecasting, past data is used to make predictions about future values of a time series. This is distinct from learning tasks such classification, regression and clustering, where it may not be future values being predicted.
This notebook gives a tutorial on forecasting. More specific notebooks in aeon on forecasting are:
aeon
provides a common, scikit-learn
-like interface to a variety of classical and ML-style forecasting algorithms, together with tools for building pipelines and composite machine learning models, including temporal tuning schemes, or reductions such as walk-forward application of scikit-learn
regressors.
Section 1 provides an overview of common forecasting workflows supported by aeon
.
Section 2 discusses the families of forecasters available in aeon
.
Section 3 discusses advanced composition patterns, including pipeline building, reduction, tuning, ensembling, and autoML.
Section 4 gives an introduction to how to write custom estimators compliant with the aeon
interface.
import warnings
import numpy as np
import pandas as pd
# hide warnings
warnings.filterwarnings("ignore")
This section explains the basic forecasting workflows, and key interface points for it.
We cover the following four workflows:
observations;
split-wise and aggregate errors, including common back-testing schemes.
All forecasting workflows make common assumptions on the input data format.
aeon
uses pandas
for representing time series for forecasting:
pd.DataFrame
for time series and sequences. Rows represent time indices, columnsrepresent variables;
pd.Series
can be used for univariate time series;numpy
arrays (1D and 2D) can also used.The Series.index
and DataFrame.index
are used for representing the time series index.
aeon
supports pandas integer, period and timestamp indices for simple time series.
aeon
supports additional data structures for panel and hierarchical
time series, see the data structure notebooks for more
details.
Example: as the running example in this tutorial, we use a textbook data set, the Box-Jenkins airline data set, which consists of the number of monthly totals of international airline passengers, from 1949 - 1960. Values are in thousands. See "Makridakis, Wheelwright and Hyndman (1998) Forecasting: methods and applications", exercises sections 2 and 3.
from aeon.datasets import load_airline
from aeon.visualisation import plot_series
y = load_airline()
# plotting for visualization
plot_series(y)
(<Figure size 1600x400 with 1 Axes>, <Axes: ylabel='Number of airline passengers'>)
y.index
PeriodIndex(['1949-01', '1949-02', '1949-03', '1949-04', '1949-05', '1949-06', '1949-07', '1949-08', '1949-09', '1949-10', ... '1960-03', '1960-04', '1960-05', '1960-06', '1960-07', '1960-08', '1960-09', '1960-10', '1960-11', '1960-12'], dtype='period[M]', name='Period', length=144)
The simplest use case is batch fitting and forecasting, i.e., fitting a forecasting model to one batch of past data, then asking for forecasts at time point in the future.
The steps in this workflow are as follows:
numpy.array
or the ForecastingHorizon
object.scikit-learn
-like syntax; forecaster objects follow the familiar scikit-learn
BaseEstimator
interface.fit
methodpredict
methodThe below first outlines the vanilla variant of the basic deployment workflow, step-by-step.
At the end, one-cell workflows are provided, with common deviations from the pattern (Sections 1.2.1 and following).
We assume here the data is assumed to be in pd.Series
or pd.DataFrame
format.
from aeon.datasets import load_airline
from aeon.visualisation import plot_series
# in the example, we use the airline data set.
y = load_airline()
plot_series(y)
(<Figure size 1600x400 with 1 Axes>, <Axes: ylabel='Number of airline passengers'>)
Now we need to specify the forecasting horizon and pass that to our forecasting algorithm.
There are two main ways of doing this:
numpy.array
of integers. This assumes either integer index or periodic index (PeriodIndex
) in the time series; the integer indicates the number of time points or periods ahead we want to make a forecast for. E.g., 1
means forecast the next period, 2
the second next period, and so on.ForecastingHorizon
object. This can be used to define forecast horizons, using any supported index type as an argument. No periodic index is assumed.Forecasting horizons can be absolute, i.e., referencing specific time points in the future, or relative, i.e., referencing time differences to the present. As a default, the present is that latest time point seen in any y
passed to the forecaster.
numpy.array
based forecasting horizons are always relative; ForecastingHorizon
objects can be both relative and absolute. In particular, absolute forecasting horizons can only be specified using ForecastingHorizon
.
numpy
forecasting horizon¶fh = np.arange(1, 37)
fh
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36])
This will ask for monthly predictions for the next three years, since the original series period is 1 month. To predict only the second and fifth month ahead, one would pass a list as follows:
import numpy as np
fh = np.array([2, 5]) # 2nd and 5th step ahead
ForecastingHorizon
based forecasting horizon¶The ForecastingHorizon
object considers the input absolute or relative depending on the is_relative
flag.
ForecastingHorizon
will automatically assume a relative horizon if temporal difference types from pandas
are passed; if value types from pandas
are passed, it will assume an absolute horizon.
To define an absolute ForecastingHorizon
in our example:
from aeon.forecasting.base import ForecastingHorizon
fh = ForecastingHorizon(
pd.PeriodIndex(pd.date_range("1961-01", periods=36, freq="M")), is_relative=False
)
fh
ForecastingHorizon(['1961-01', '1961-02', '1961-03', '1961-04', '1961-05', '1961-06', '1961-07', '1961-08', '1961-09', '1961-10', '1961-11', '1961-12', '1962-01', '1962-02', '1962-03', '1962-04', '1962-05', '1962-06', '1962-07', '1962-08', '1962-09', '1962-10', '1962-11', '1962-12', '1963-01', '1963-02', '1963-03', '1963-04', '1963-05', '1963-06', '1963-07', '1963-08', '1963-09', '1963-10', '1963-11', '1963-12'], dtype='period[M]', is_relative=False)
A ForecastingHorizon
can be converted from relative to absolute and back via the
to_relative
and to_absolute
methods. Both of these conversions require a compatible cutoff
to be passed:
cutoff = pd.Period("1960-12", freq="M")
fh.to_relative(cutoff)
ForecastingHorizon([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36], dtype='int32', is_relative=True)
fh.to_absolute(cutoff)
ForecastingHorizon(['1961-01', '1961-02', '1961-03', '1961-04', '1961-05', '1961-06', '1961-07', '1961-08', '1961-09', '1961-10', '1961-11', '1961-12', '1962-01', '1962-02', '1962-03', '1962-04', '1962-05', '1962-06', '1962-07', '1962-08', '1962-09', '1962-10', '1962-11', '1962-12', '1963-01', '1963-02', '1963-03', '1963-04', '1963-05', '1963-06', '1963-07', '1963-08', '1963-09', '1963-10', '1963-11', '1963-12'], dtype='period[M]', is_relative=False)
To make forecasts, a forecasting algorithm needs to be specified. All aeon
forecasters follow the same interface, so the steps are the same, no matter which forecaster is chosen.
For this example, we choose the naive forecasting method of predicting the last seen value. More complex specifications are possible, using pipeline and reduction construction syntax; this will be covered later in Section 2.
from aeon.forecasting.naive import NaiveForecaster
forecaster = NaiveForecaster(strategy="last")
Now the forecaster needs to be fitted to the seen data:
forecaster.fit(y)
NaiveForecaster()In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
NaiveForecaster()
Finally, we request forecasts for the specified forecasting horizon. This needs to be done after fitting the forecaster:
y_pred = forecaster.predict(fh)
# plotting predictions and past data
plot_series(y, y_pred, labels=["y", "y_pred"])
(<Figure size 1600x400 with 1 Axes>, <Axes: ylabel='Number of airline passengers'>)
For convenience, we present the basic deployment workflow in one cell.
This uses the same data, but different version of the NiaveForecaster that predicts the
latest value observed in the same month (the parameter sp
is the seasonal
periodicity).
from aeon.datasets import load_airline
from aeon.forecasting.base import ForecastingHorizon
from aeon.forecasting.naive import NaiveForecaster
# step 1: data specification
y = load_airline()
# step 2: specifying forecasting horizon
fh = np.arange(1, 37)
# step 3: specifying the forecasting algorithm
forecaster = NaiveForecaster(strategy="last", sp=12)
# step 4: fitting the forecaster
forecaster.fit(y)
# step 5: querying predictions
y_pred = forecaster.predict(fh)
# optional: plotting predictions and past data
plot_series(y, y_pred, labels=["y", "y_pred"])
(<Figure size 1600x400 with 1 Axes>, <Axes: ylabel='Number of airline passengers'>)
fit
¶Some forecasters need the forecasting horizon to be provided in fit
. Such
forecasters will produce informative error messages when it is not passed in fit
.
All forecaster will remember the horizon when passed in fit
for prediction. An
example:
# step 1: data specification
y = load_airline()
# step 2: specifying forecasting horizon
fh = np.arange(1, 37)
# step 3: specifying the forecasting algorithm
forecaster = NaiveForecaster(strategy="last", sp=12)
# step 4: fitting the forecaster
forecaster.fit(y, fh=fh)
# step 5: querying predictions
y_pred = forecaster.predict()
Many forecasters can make use of exogeneous time series, i.e., other time series that
are not to be forecast, but are useful for forecasting y
. Exogeneous time series are
always passed as an X
argument, in fit
, predict
, and other methods (see below). Exogeneous time series should always be passed as pandas.DataFrames
. Most forecasters that can deal with exogeneous time series will assume that the time indices of X
passed to fit
are a super-set of the time indices in y
passed to fit
; and that the time indices of X
passed to predict
are a super-set of time indices in fh
, although this is not a general interface restriction. Forecasters that do not make use of exogeneous time series still accept the argument (and do not use it internally).
The general workflow for passing exogeneous data is as follows:
# step 1: data specification
y = load_airline()
# we create some dummy exogeneous data
X = pd.DataFrame(index=y.index)
# step 2: specifying forecasting horizon
fh = np.arange(1, 37)
# step 3: specifying the forecasting algorithm
forecaster = NaiveForecaster(strategy="last", sp=12)
# step 4: fitting the forecaster
forecaster.fit(y, X=X, fh=fh)
# step 5: querying predictions
y_pred = forecaster.predict(X=X)
NOTE: as in workflows 1.2.1 and 1.2.2, some forecasters that use exogeneous variables may also require the forecasting horizon only in predict
. Such forecasters may also be called with steps 4 and 5 being
forecaster.fit(y, X=X)
y_pred = forecaster.predict(fh=fh, X=X)
All forecasters in aeon
support multivariate forecasts - some forecasters are "genuine" multivariate, all others "apply by column".
Below is an example of the general multivariate forecasting workflow, using the VAR
(vector auto-regression) forecaster on the Longley dataset from aeon.datasets
. The workflow is the same as in the univariate forecasters, but the input has more than one variables (columns).
from aeon.datasets import load_longley
from aeon.forecasting.var import VAR
_, y = load_longley()
y = y.drop(columns=["UNEMP", "ARMED", "POP"])
forecaster = VAR()
forecaster.fit(y, fh=[1, 2, 3])
y_pred = forecaster.predict()
The input to the multivariate forecaster y
is a pandas.DataFrame
where each column is a variable.
y
GNPDEFL | GNP | |
---|---|---|
Period | ||
1947 | 83.0 | 234289.0 |
1948 | 88.5 | 259426.0 |
1949 | 88.2 | 258054.0 |
1950 | 89.5 | 284599.0 |
1951 | 96.2 | 328975.0 |
1952 | 98.1 | 346999.0 |
1953 | 99.0 | 365385.0 |
1954 | 100.0 | 363112.0 |
1955 | 101.2 | 397469.0 |
1956 | 104.6 | 419180.0 |
1957 | 108.4 | 442769.0 |
1958 | 110.8 | 444546.0 |
1959 | 112.6 | 482704.0 |
1960 | 114.2 | 502601.0 |
1961 | 115.7 | 518173.0 |
1962 | 116.9 | 554894.0 |
The result of the multivariate forecaster y_pred
is a pandas.DataFrame
where columns are the predicted values for each variable. The variables in y_pred
are the same as in y
, the input to the multivariate forecaster.
y_pred
GNPDEFL | GNP | |
---|---|---|
Period | ||
1963 | 121.688295 | 578514.398653 |
1964 | 124.353664 | 601873.015890 |
1965 | 126.847886 | 625411.588754 |
There are two categories of multivariate forecasters:
VAR
. Forecasts for oneendogeneous (y
) variable will depend on values of other variables (X
).
ARIMA
. Forecasts will be made byendogeneous (y
) variable, and not be affected by other variables (X
).
To display complete list of multivariate forecasters, search for forecasters with 'multivariate'
or 'both'
tag value for the tag 'y_input_type'
, as follows:
import warnings
from aeon.registry import all_estimators
warnings.filterwarnings("ignore")
for forecaster in all_estimators(
filter_tags={"y_input_type": ["multivariate", "both"]}
):
print(forecaster[0])
ColumnEnsembleForecaster DynamicFactor EnsembleForecaster ForecastByLevel ForecastingGridSearchCV ForecastingPipeline ForecastingRandomizedSearchCV HierarchyEnsembleForecaster MockForecaster MultiplexForecaster Permute TransformedTargetForecaster VAR VARMAX VECM YtoX
Univariate forecasters have tag value 'univariate'
, and will fit one model per column.
To access the column-wise models, access the forecasters_
parameter, which stores the fitted forecasters in a pandas.DataFrame
, fitted forecasters being in the column with the variable for which the forecast is being made:
from aeon.datasets import load_longley
from aeon.forecasting.arima import ARIMA
_, y = load_longley()
y = y.drop(columns=["UNEMP", "ARMED", "POP"])
forecaster = ARIMA()
forecaster.fit(y, fh=[1, 2, 3])
forecaster.forecasters_
GNPDEFL | GNP | |
---|---|---|
forecasters | ARIMA() | ARIMA() |
aeon
provides a unified interface to make probabilistic forecasts.
The following methods are possibly available for probabilistic forecasts:
predict_interval
produces interval forecasts. Additionally to any predict
arguments, an argument coverage
(nominal interval coverage) must be provided.predict_quantiles
produces quantile forecasts. Additionally to any predict
arguments, an argument alpha
(quantile values) must be provided.predict_var
produces variance forecasts. This has same arguments as predict
.predict_proba
produces full distributional forecasts. This has same arguments as predict
.Not all forecasters are capable of returning probabilistic forecast, but if a forecasters provides one kind of probabilistic forecast, it is also capable of returning the others. The list of forecasters with such capability can be queried by registry.all_estimators
, searching for those where the capability:pred_int
tag has valueTrue
.
The basic worfklow for probabilistic forecasts is similar to the basic forecasting workflow, with the difference that instead of predict
, one of the probabilistic forecasting methods is used:
import numpy as np
from aeon.datasets import load_airline
from aeon.forecasting.naive import NaiveForecaster
# until fit, identical with the simple workflow
y = load_airline()
fh = np.arange(1, 13)
forecaster = NaiveForecaster(sp=12)
forecaster.fit(y, fh=fh)
NaiveForecaster(sp=12)In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
NaiveForecaster(sp=12)
Now we present the different probabilistic forecasting methods.
predict_interval
- interval predictions¶predict_interval
takes an argument coverage
, which is a float (or list of floats), the nominal coverage of the prediction interval(s) queried. predict_interval
produces symmetric prediction intervals, for example, a coverage of 0.9
returns a "lower" forecast at quantile 0.5 - coverage/2 = 0.05
, and an "upper" forecast at quantile 0.5 + coverage/2 = 0.95
.
coverage = 0.9
y_pred_ints = forecaster.predict_interval(coverage=coverage)
y_pred_ints
Coverage | ||
---|---|---|
0.9 | ||
lower | upper | |
1961-01 | 357.265915 | 476.734085 |
1961-02 | 331.265915 | 450.734085 |
1961-03 | 359.265915 | 478.734085 |
1961-04 | 401.265915 | 520.734085 |
1961-05 | 412.265915 | 531.734085 |
1961-06 | 475.265915 | 594.734085 |
1961-07 | 562.265915 | 681.734085 |
1961-08 | 546.265915 | 665.734085 |
1961-09 | 448.265915 | 567.734085 |
1961-10 | 401.265915 | 520.734085 |
1961-11 | 330.265915 | 449.734085 |
1961-12 | 372.265915 | 491.734085 |
The return y_pred_ints
is a pandas.DataFrame
with a column multi-index: The first level is variable name from y
in fit (or Coverage
if no variable names were present), second level coverage fractions for which intervals were computed, in the same order as in input coverage
; third level columns lower
and upper
. Rows are the indices for which forecasts were made (same as in y_pred
or fh
). Entries are lower/upper (as column name) bound of the nominal coverage predictive interval for the index in the same row.
Pretty-plotting the predictive interval forecasts:
from aeon.visualisation import plot_series
# also requires predictions
y_pred = forecaster.predict()
fig, ax = plot_series(y, y_pred, labels=["y", "y_pred"], pred_interval=y_pred_ints)
predict_quantiles
- quantile forecasts¶aeon offers predict_quantiles
as a unified interface to return quantile values of predictions. Similar to predict_interval
.
predict_quantiles
has an argument alpha
, containing the quantile values being queried. Similar to the case of the predict_interval
, alpha
can be a float
, or a list of floats
.
y_pred_quantiles = forecaster.predict_quantiles(alpha=[0.275, 0.975])
y_pred_quantiles
Quantiles | ||
---|---|---|
0.275 | 0.975 | |
1961-01 | 395.291896 | 488.177552 |
1961-02 | 369.291896 | 462.177552 |
1961-03 | 397.291896 | 490.177552 |
1961-04 | 439.291896 | 532.177552 |
1961-05 | 450.291896 | 543.177552 |
1961-06 | 513.291896 | 606.177552 |
1961-07 | 600.291896 | 693.177552 |
1961-08 | 584.291896 | 677.177552 |
1961-09 | 486.291896 | 579.177552 |
1961-10 | 439.291896 | 532.177552 |
1961-11 | 368.291896 | 461.177552 |
1961-12 | 410.291896 | 503.177552 |
y_pred_quantiles
, the output of predict_quantiles, is a pandas.DataFrame
with a two-level column multiindex. The first level is variable name from y
in fit (or Quantiles
if no variable names were present), second level are the quantile values (from alpha
) for which quantile predictions were queried. Rows are the indices for which forecasts were made (same as in y_pred
or fh
). Entries are the quantile predictions for that variable, that quantile value, for the time index in the same row.
Remark: for clarity: quantile and (symmetric) interval forecasts can be translated into each other as follows.
alpha < 0.5: The alpha-quantile prediction is equal to the lower bound of a predictive interval with coverage = (0.5 - alpha) * 2
alpha > 0.5: The alpha-quantile prediction is equal to the upper bound of a predictive interval with coverage = (alpha - 0.5) * 2
predict_var
- variance predictions¶predict_var
produces variance predictions:
y_pred_var = forecaster.predict_var()
y_pred_var
0 | |
---|---|
1961-01 | 1318.833333 |
1961-02 | 1318.833333 |
1961-03 | 1318.833333 |
1961-04 | 1318.833333 |
1961-05 | 1318.833333 |
1961-06 | 1318.833333 |
1961-07 | 1318.833333 |
1961-08 | 1318.833333 |
1961-09 | 1318.833333 |
1961-10 | 1318.833333 |
1961-11 | 1318.833333 |
1961-12 | 1318.833333 |
The format of the output y_pred_var
is the same as for predict
, except that this is always coerced to a pandas.DataFrame
, and entries are not point predictions but variance predictions.
predict_proba
- distribution predictions¶To predict full predictive distributions, predict_proba
can be used.
As this returns tensorflow
Distribution
objects, the deep learning dependency set dl
of aeon
(which includes tensorflow
and tensorflow-probability
dependencies) must be installed.
y_pred_proba = forecaster.predict_proba()
y_pred_proba
<tfp.distributions.Normal 'Normal' batch_shape=[12, 1] event_shape=[] dtype=float32>
Distributions returned by predict_proba
are by default marginal at time points, not joint over time points.
More precisely, the returned Distribution
object is formatted and to be interpreted as follows:
fit
/update
To return joint forecast distributions, the marginal
parameter can be set to False
(currently work in progress). In this case, a Distribution
with 2D event shape (len(fh), len(y))
is returned.
aeon
provides a unified interface to make panel and hierarchical forecasts.
All aeon
forecasters can be applied to panel and hierarchical data, which needs to be presented in specific input formats.
Forecasters that are not genuinely panel or hierarchical forecasters will be applied by instance.
The recommended (not the only) format to pass panel and hierarchical data is a pandas.DataFrame
with MultiIndex
row. In this MultiIndex
, the last level must be in an aeon
compatible time index format, the remaining levels are panel or hierarchy nodes.
Example data:
from aeon.testing.utils.data_gen import _bottom_hier_datagen
y = _bottom_hier_datagen(no_levels=2)
y
passengers | |||
---|---|---|---|
l2_agg | l1_agg | timepoints | |
l2_node01 | l1_node02 | 1949-01 | 52.943320 |
1949-02 | 54.874511 | ||
1949-03 | 59.379565 | ||
1949-04 | 58.414313 | ||
1949-05 | 55.840000 | ||
... | ... | ... | ... |
l2_node03 | l1_node05 | 1960-08 | 2221.886565 |
1960-09 | 1801.134311 | ||
1960-10 | 1606.045386 | ||
1960-11 | 1320.202973 | ||
1960-12 | 1487.967083 |
864 rows × 1 columns
As stated, all forecasters, genuinely hierarchical or not, can be applied, with all workflows described in this section, to produce hierarchical forecasts.
The syntax is exactly the same as for plain time series, except for the hierarchy levels in input and output data:
from aeon.forecasting.arima import ARIMA
fh = [1, 2, 3]
forecaster = ARIMA()
forecaster.fit(y, fh=fh)
forecaster.predict()
passengers | |||
---|---|---|---|
l2_agg | l1_agg | timepoints | |
l2_node01 | l1_node02 | 1961-01 | 154.000381 |
1961-02 | 152.311358 | ||
1961-03 | 150.682121 | ||
l1_node03 | 1961-01 | 2805.041069 | |
1961-02 | 2770.577861 | ||
1961-03 | 2737.335536 | ||
l2_node02 | l1_node01 | 1961-01 | 426.544850 |
1961-02 | 421.282983 | ||
1961-03 | 416.207550 | ||
l1_node04 | 1961-01 | 892.307468 | |
1961-02 | 879.432170 | ||
1961-03 | 867.062017 | ||
l1_node06 | 1961-01 | 3237.664605 | |
1961-02 | 3192.489057 | ||
1961-03 | 3149.040796 | ||
l2_node03 | l1_node05 | 1961-01 | 1465.123534 |
1961-02 | 1443.194107 | ||
1961-03 | 1422.142219 |
Similar to multivariate forecasting, forecasters that are not genuinely hierarchical fit by instance.
The forecasters fitted by instance can be accessed in the forecasters_
parameter, which is a pandas.DataFrame
where forecasters for a given instance are placed in the row with the index of the instance for which they make forecasts:
forecaster.forecasters_
forecasters | ||
---|---|---|
l2_agg | l1_agg | |
l2_node01 | l1_node02 | ARIMA() |
l1_node03 | ARIMA() | |
l2_node02 | l1_node01 | ARIMA() |
l1_node04 | ARIMA() | |
l1_node06 | ARIMA() | |
l2_node03 | l1_node05 | ARIMA() |
If the data is both hierarchical and multivariate, and the forecaster cannot genuinely deal with either, the forecasters_
attribute will have both column indices, for variables, and row indices, for instances, with forecasters fitted per instance and variable:
from aeon.forecasting.arima import ARIMA
from aeon.testing.utils.data_gen import _make_hierarchical
y = _make_hierarchical(n_columns=2)
fh = [1, 2, 3]
forecaster = ARIMA()
forecaster.fit(y, fh=fh)
forecaster.forecasters_
Further details on hierarchical forecasting, including reduction, aggregation, reconciliation, are presented in the "hierarchical forecasting" tutorial.
It is good practice to evaluate statistical performance of a forecaster before deploying it, and regularly re-evaluate performance if in continuous deployment. The evaluation workflow for the basic batch forecasting task, as solved by the workflow in Section 1.2, consists of comparing batch forecasts with actuals. This is sometimes called (batch-wise) backtesting.
The basic evaluation workflow is as follows:
NOTE: Step 5 (testing) is currently not supported in aeon
, but is on the development roadmap. For the time being, it is advised to use custom implementations of appropriate methods (e.g., Diebold-Mariano test; stationary confidence intervals).
NOTE: Note that this evaluation set-up determines how well a given algorithm would have performed on past data. Results are only insofar representative as future performance can be assumed to mirror past performance. This can be argued under certain assumptions (e.g., stationarity), but will in general be false. Monitoring of forecasting performance is hence advised in case an algorithm is applied multiple times.
Example: In the example, we will us the same airline data as in Section 1.2. But, instead of predicting the next 3 years, we hold out the last 3 years of the airline data (below: y_test
), and see how the forecaster would have performed three years ago, when asked to forecast the most recent 3 years (below: y_pred
), from the years before (below: y_train
). "how" is measured by a quantitative performance metric (below: mean_absolute_percentage_error
). This is then considered as an indication of how well the forecaster would perform in the coming 3 years (what was done in Section 1.2). This may or may not be a stretch depending on statistical assumptions and data properties (caution: it often is a stretch - past performance is in general not indicative of future performance).
from aeon.forecasting.model_selection import temporal_train_test_split
y = load_airline()
y_train, y_test = temporal_train_test_split(y, test_size=36)
# we will try to forecast y_test from y_train
# plotting for illustration
plot_series(y_train, y_test, labels=["y_train", "y_test"])
print(y_train.shape[0], y_test.shape[0])
108 36
This is almost verbatim the workflow in Section 1.2, using y_train
to predict the indices of y_test
.
# we can simply take the indices from `y_test` where they already are stored
fh = ForecastingHorizon(y_test.index, is_relative=False)
forecaster = NaiveForecaster(strategy="last", sp=12)
forecaster.fit(y_train)
# y_pred will contain the predictions
y_pred = forecaster.predict(fh)
# plotting for illustration
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
(<Figure size 1600x400 with 1 Axes>, <Axes: ylabel='Number of airline passengers'>)
The next step is to specify a forecasting metric. These are functions that return a
number when input with prediction and actual series. They are different from
sklearn
metrics in that they can accept series with indices rather than np .ndarray
. Forecasting metrics can be invoked using the functions in aeon.benchmarking .forecasting
e.g.,mean_absolute_percentage_error
or mean_squared_error
.
from aeon.performance_metrics.forecasting import mean_absolute_percentage_error
# option 1: using the lean function interface
mean_absolute_percentage_error(y_test, y_pred, symmetric=False)
# note: the FIRST argument is the ground truth, the SECOND argument are the forecasts
# the order matters for most metrics in general
0.13189432350948402
In general, forecast performances should be quantitatively tested against benchmark performances.
For convenience, we present the basic batch forecast evaluation workflow in one cell. This cell is using the lean function metric interface.
from aeon.datasets import load_airline
from aeon.forecasting.base import ForecastingHorizon
from aeon.forecasting.model_selection import temporal_train_test_split
from aeon.forecasting.naive import NaiveForecaster
from aeon.performance_metrics.forecasting import mean_absolute_percentage_error
# step 1: splitting historical data
y = load_airline()
y_train, y_test = temporal_train_test_split(y, test_size=36)
# step 2: running the basic forecasting workflow
fh = ForecastingHorizon(y_test.index, is_relative=False)
forecaster = NaiveForecaster(strategy="last", sp=12)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
# step 3: specifying the evaluation metric and
# step 4: computing the forecast performance
mean_absolute_percentage_error(y_test, y_pred, symmetric=False)
# step 5: testing forecast performance against baseline
# under development
0.13189432350948402
For convenience, we present the basic batch forecast evaluation workflow in one cell. This cell is using the advanced class specification interface for metrics.
from aeon.datasets import load_airline
from aeon.forecasting.base import ForecastingHorizon
from aeon.forecasting.model_selection import temporal_train_test_split
from aeon.forecasting.naive import NaiveForecaster
from aeon.performance_metrics.forecasting import mean_absolute_percentage_error
# step 1: splitting historical data
y = load_airline()
y_train, y_test = temporal_train_test_split(y, test_size=36)
# step 2: running the basic forecasting workflow
fh = ForecastingHorizon(y_test.index, is_relative=False)
forecaster = NaiveForecaster(strategy="last", sp=12)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
# step 3: computing the forecast performance for a given metric
mean_absolute_percentage_error(y_test, y_pred, symmetric=False)
# step 5: testing forecast performance against baseline
# under development
--------------------------------------------------------------------------- NameError Traceback (most recent call last) Cell In[1], line 2 1 # step 1: splitting historical data ----> 2 y = load_airline() 3 y_train, y_test = temporal_train_test_split(y, test_size=36) 5 # step 2: running the basic forecasting workflow NameError: name 'load_airline' is not defined
A common use case requires the forecaster to regularly update with new data and make forecasts on a rolling basis. This is especially useful if the same kind of forecast has to be made at regular time points, e.g., daily or weekly. aeon
forecasters support this type of deployment workflow via the update
and update_predict
methods.
update
method¶The update
method can be called when a forecaster is already fitted, to ingest new data and make updated forecasts - this is referred to as an "update step".
After the update, the forecaster's internal "now" state (the cutoff
) is set to the latest time stamp seen in the update batch (assumed to be later than previously seen data).
The general pattern is as follows:
fit
predict
update
to ingest new datapredict
for the updated dataExample: suppose that, in the airline example, we want to make forecasts a year ahead, but every month, starting December 1957. The first few months, forecasts would be made as follows:
from aeon.datasets import load_airline
from aeon.forecasting.ets import AutoETS
from aeon.visualisation import plot_series
# we prepare the full data set for convenience
# note that in the scenario we will "know" only part of this at certain time points
y = load_airline()
# December 1957
# this is the data known in December 1957
y_1957Dec = y[:-36]
# step 1: specifying the forecasting strategy
forecaster = AutoETS(auto=True, sp=12, n_jobs=-1)
# step 2: specifying the forecasting horizon: one year ahead, all months
fh = np.arange(1, 13)
# step 3: this is the first time we use the model, so we fit it
forecaster.fit(y_1957Dec)
# step 4: obtaining the first batch of forecasts for Jan 1958 - Dec 1958
y_pred_1957Dec = forecaster.predict(fh)
# plotting predictions and past data
plot_series(y_1957Dec, y_pred_1957Dec, labels=["y_1957Dec", "y_pred_1957Dec"])
(<Figure size 1600x400 with 1 Axes>, <Axes: ylabel='Number of airline passengers'>)
# January 1958
# new data is observed:
y_1958Jan = y[[-36]]
# step 5: we update the forecaster with the new data
forecaster.update(y_1958Jan)
# step 6: making forecasts with the updated data
y_pred_1958Jan = forecaster.predict(fh)
# note that the fh is relative, so forecasts are automatically for 1 month later
# i.e., from Feb 1958 to Jan 1959
y_pred_1958Jan
1958-02 341.515812 1958-03 392.849891 1958-04 378.521584 1958-05 375.662009 1958-06 426.011081 1958-07 470.577777 1958-08 467.110706 1958-09 414.461454 1958-10 360.966279 1958-11 315.212373 1958-12 357.909015 1959-01 363.046383 Freq: M, dtype: float64
# plotting predictions and past data
plot_series(
y[:-35],
y_pred_1957Dec,
y_pred_1958Jan,
labels=["y_1957Dec", "y_pred_1957Dec", "y_pred_1958Jan"],
)
(<Figure size 1600x400 with 1 Axes>, <Axes: ylabel='Number of airline passengers'>)
# February 1958
# new data is observed:
y_1958Feb = y[[-35]]
# step 5: we update the forecaster with the new data
forecaster.update(y_1958Feb)
# step 6: making forecasts with the updated data
y_pred_1958Feb = forecaster.predict(fh)
# plotting predictions and past data
plot_series(
y[:-35],
y_pred_1957Dec,
y_pred_1958Jan,
y_pred_1958Feb,
labels=["y_1957Dec", "y_pred_1957Dec", "y_pred_1958Jan", "y_pred_1958Feb"],
)
(<Figure size 1600x400 with 1 Axes>, <Axes: ylabel='Number of airline passengers'>)
... and so on.
A shorthand for running first update
and then predict
is update_predict_single
- for some algorithms, this may be more efficient than the separate calls to update
and predict
:
# March 1958
# new data is observed:
y_1958Mar = y[[-34]]
# step 5&6: update/predict in one step
forecaster.update_predict_single(y_1958Mar, fh=fh)
1958-04 349.166352 1958-05 346.925002 1958-06 394.064152 1958-07 435.855204 1958-08 433.331082 1958-09 384.856436 1958-10 335.547256 1958-11 293.184412 1958-12 333.288639 1959-01 338.608873 1959-02 336.996809 1959-03 388.137360 Freq: M, dtype: float64
In the rolling deployment mode, may be useful to move the estimator's "now" state (the cutoff
) to later, for example if no new data was observed, but time has progressed; or, if computations take too long, and forecasts have to be queried.
The update
interface provides an option for this, via the update_params
argument of update
and other update funtions.
If update_params
is set to False
, no model update computations are performed; only data is stored, and the internal "now" state (the cutoff
) is set to the most recent date.
# April 1958
# new data is observed:
y_1958Apr = y[[-33]]
# step 5: perform an update without re-computing the model parameters
forecaster.update(y_1958Apr, update_params=False)
AutoETS(auto=True, n_jobs=-1, sp=12)In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
AutoETS(auto=True, n_jobs=-1, sp=12)
aeon
can also simulate the update/predict deployment mode with a full batch of data.
This is not useful in deployment, as it requires all data to be available in advance; however, it is useful in playback, such as for simulations or model evaluation.
The update/predict playback mode can be called using update_predict
and a re-sampling constructor which encodes the precise walk-forward scheme.
# from aeon.datasets import load_airline
# from aeon.forecasting.ets import AutoETS
# from aeon.forecasting.model_selection import ExpandingWindowSplitter
# from aeon.visualisation import plot_series
NOTE: commented out - this part of the interface is currently undergoing a re-work. Contributions and PR are appreciated.
# for playback, the full data needs to be loaded in advance
# y = load_airline()
# step 1: specifying the forecasting strategy
# forecaster = AutoETS(auto=True, sp=12, n_jobs=-1)
# step 2: specifying the forecasting horizon
# fh - np.arange(1, 13)
# step 3: specifying the cross-validation scheme
# cv = ExpandingWindowSplitter()
# step 4: fitting the forecaster - fh should be passed here
# forecaster.fit(y[:-36], fh=fh)
# step 5: rollback
# y_preds = forecaster.update_predict(y, cv)
To evaluate forecasters with respect to their performance in rolling forecasting, the forecaster needs to be tested in a set-up mimicking rolling forecasting, usually on past data. Note that the batch back-testing as in Section 1.3 would not be an appropriate evaluation set-up for rolling deployment, as that tests only a single forecast batch.
The advanced evaluation workflow can be carried out using the evaluate
benchmarking function.
evalute
takes as arguments:
forecaster
to be evaluatedscikit-learn
re-sampling strategy for temporal splitting (cv
below), e.g., ExpandingWindowSplitter
or SlidingWindowSplitter
strategy
(string): whether the forecaster should be always be refitted or just fitted once and then updatedfrom aeon.forecasting.model_evaluation import evaluate
from aeon.forecasting.model_selection import ExpandingWindowSplitter
from aeon.forecasting.theta import ThetaForecaster
forecaster = ThetaForecaster(sp=12)
cv = ExpandingWindowSplitter(
step_length=12, fh=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], initial_window=72
)
df = evaluate(forecaster=forecaster, y=y, cv=cv, strategy="refit", return_data=True)
df.iloc[:, :5]
test_mean_absolute_percentage_error | fit_time | pred_time | len_train_window | cutoff | |
---|---|---|---|---|---|
0 | 0.083537 | 0.014647 | 0.005874 | 72 | 1954-12 |
1 | 0.047574 | 0.008137 | 0.005779 | 84 | 1955-12 |
2 | 0.059412 | 0.008143 | 0.005170 | 96 | 1956-12 |
3 | 0.040142 | 0.007612 | 0.005146 | 108 | 1957-12 |
4 | 0.101327 | 0.007556 | 0.005259 | 120 | 1958-12 |
5 | 0.051581 | 0.007802 | 0.005211 | 132 | 1959-12 |
# visualization of a forecaster evaluation
fig, ax = plot_series(
y,
df["y_pred"].iloc[0],
df["y_pred"].iloc[1],
df["y_pred"].iloc[2],
df["y_pred"].iloc[3],
df["y_pred"].iloc[4],
df["y_pred"].iloc[5],
markers=["o", "", "", "", "", "", ""],
labels=["y_true"] + ["y_pred (Backtest " + str(x) + ")" for x in range(6)],
)
ax.legend()
<matplotlib.legend.Legend at 0x29e2d1f9550>
Todo: performance metrics, averages, and testing - contributions to aeon
and the tutorial are welcome.
This section summarizes how to:
aeon
aeon
¶Generally, all forecasters available in aeon
can be listed with the all_estimators
command.
This will list all forecasters in aeon
, even those whose soft dependencies are not installed.
from aeon.registry import all_estimators
all_estimators("forecaster", as_dataframe=True)
name | estimator | |
---|---|---|
0 | ARDL | <class 'aeon.forecasting.ardl.ARDL'> |
1 | ARIMA | <class 'aeon.forecasting.arima.ARIMA'> |
2 | AutoARIMA | <class 'aeon.forecasting.arima.AutoARIMA'> |
3 | AutoETS | <class 'aeon.forecasting.ets.AutoETS'> |
4 | AutoEnsembleForecaster | <class 'aeon.forecasting.compose._ensemble.Aut... |
5 | BATS | <class 'aeon.forecasting.bats.BATS'> |
6 | BaggingForecaster | <class 'aeon.forecasting.compose._bagging.Bagg... |
7 | ColumnEnsembleForecaster | <class 'aeon.forecasting.compose._column_ensem... |
8 | ConformalIntervals | <class 'aeon.forecasting.conformal.ConformalIn... |
9 | Croston | <class 'aeon.forecasting.croston.Croston'> |
10 | DirRecTabularRegressionForecaster | <class 'aeon.forecasting.compose._reduce.DirRe... |
11 | DirRecTimeSeriesRegressionForecaster | <class 'aeon.forecasting.compose._reduce.DirRe... |
12 | DirectTabularRegressionForecaster | <class 'aeon.forecasting.compose._reduce.Direc... |
13 | DirectTimeSeriesRegressionForecaster | <class 'aeon.forecasting.compose._reduce.Direc... |
14 | DontUpdate | <class 'aeon.forecasting.stream._update.DontUp... |
15 | DynamicFactor | <class 'aeon.forecasting.dynamic_factor.Dynami... |
16 | EnsembleForecaster | <class 'aeon.forecasting.compose._ensemble.Ens... |
17 | ExponentialSmoothing | <class 'aeon.forecasting.exp_smoothing.Exponen... |
18 | ForecastByLevel | <class 'aeon.forecasting.compose._grouped.Fore... |
19 | ForecastX | <class 'aeon.forecasting.compose._pipeline.For... |
20 | ForecastingGridSearchCV | <class 'aeon.forecasting.model_selection._tune... |
21 | ForecastingPipeline | <class 'aeon.forecasting.compose._pipeline.For... |
22 | ForecastingRandomizedSearchCV | <class 'aeon.forecasting.model_selection._tune... |
23 | HierarchyEnsembleForecaster | <class 'aeon.forecasting.compose._hierarchy_en... |
24 | MockForecaster | <class 'aeon.testing.mock_estimators._mock_for... |
25 | MockUnivariateForecasterLogger | <class 'aeon.testing.mock_estimators._mock_for... |
26 | MultioutputTabularRegressionForecaster | <class 'aeon.forecasting.compose._reduce.Multi... |
27 | MultioutputTimeSeriesRegressionForecaster | <class 'aeon.forecasting.compose._reduce.Multi... |
28 | MultiplexForecaster | <class 'aeon.forecasting.compose._multiplexer.... |
29 | NaiveForecaster | <class 'aeon.forecasting.naive.NaiveForecaster'> |
30 | NaiveVariance | <class 'aeon.forecasting.naive.NaiveVariance'> |
31 | OnlineEnsembleForecaster | <class 'aeon.forecasting.online_learning._onli... |
32 | Permute | <class 'aeon.forecasting.compose._pipeline.Per... |
33 | PolynomialTrendForecaster | <class 'aeon.forecasting.trend.PolynomialTrend... |
34 | Prophet | <class 'aeon.forecasting.fbprophet.Prophet'> |
35 | ReconcilerForecaster | <class 'aeon.forecasting.reconcile.ReconcilerF... |
36 | RecursiveTabularRegressionForecaster | <class 'aeon.forecasting.compose._reduce.Recur... |
37 | RecursiveTimeSeriesRegressionForecaster | <class 'aeon.forecasting.compose._reduce.Recur... |
38 | SARIMAX | <class 'aeon.forecasting.sarimax.SARIMAX'> |
39 | STLForecaster | <class 'aeon.forecasting.trend.STLForecaster'> |
40 | SquaringResiduals | <class 'aeon.forecasting.squaring_residuals.Sq... |
41 | StackingForecaster | <class 'aeon.forecasting.compose._stack.Stacki... |
42 | StatsForecastAutoARIMA | <class 'aeon.forecasting.statsforecast.StatsFo... |
43 | TBATS | <class 'aeon.forecasting.tbats.TBATS'> |
44 | ThetaForecaster | <class 'aeon.forecasting.theta.ThetaForecaster'> |
45 | ThetaModularForecaster | <class 'aeon.forecasting.theta.ThetaModularFor... |
46 | TransformedTargetForecaster | <class 'aeon.forecasting.compose._pipeline.Tra... |
47 | TrendForecaster | <class 'aeon.forecasting.trend.TrendForecaster'> |
48 | UnobservedComponents | <class 'aeon.forecasting.structural.Unobserved... |
49 | UpdateEvery | <class 'aeon.forecasting.stream._update.Update... |
50 | UpdateRefitsEvery | <class 'aeon.forecasting.stream._update.Update... |
51 | VAR | <class 'aeon.forecasting.var.VAR'> |
52 | VARMAX | <class 'aeon.forecasting.varmax.VARMAX'> |
53 | VECM | <class 'aeon.forecasting.vecm.VECM'> |
The entries of the last column of the resulting dataframe are classes which could be directly used for construction, or simply inspected for the correct import path.
All forecasters aeon
have so-called tags which describe properties of the estimator, e.g., whether it is multivariate, probabilistic, or not. Use of tags, inspection, and retrieval will be described in this section.
Every forecaster has tags, which are key-value pairs that can describe capabilities or internal implementation details.
The most important "capability" style tags are the following:
requires-fh-in-fit
- a boolean. Whether the forecaster requires the forecasting horizon fh
already in fit
(True
), or whether it can be passed late in predict
(False
).
y_input_type
- a string. Whether the forecaster is univariate ("univariate"
), strictly multivariate ("multivariate"
), or can deal with any number of variables ("both"
).
capability:pred_int
- a boolean. Whether the forecaster can return probabilistic predictions via predict_interval
etc, see Section 1.5.
ignores-exogeneous-X
- a boolean. Whether the forecaster makes use of exogeneous variables X
(False
) or not (True
). If the forecaster does not use X
, it can still be passed for interface uniformity, and will be ignored.
capability:missing_values
- a boolean. Whether the forecaster can deal with missing data in the inputs X
or y
.
Tags of a forecaster instance can be inspected via the get_tags
(lists all tags) and get_tag
(gets value for one tag) methods.
Tag values may depend on hyper-parameter choices.
from aeon.forecasting.arima import ARIMA
ARIMA().get_tags()
{'y_input_type': 'univariate', 'ignores-exogeneous-X': False, 'capability:pred_int': True, 'capability:missing_values': True, 'y_inner_type': 'pd.Series', 'X_inner_type': 'pd.DataFrame', 'requires-fh-in-fit': False, 'X-y-must-have-same-index': True, 'enforce_index_type': None, 'fit_is_empty': False, 'python_version': None, 'python_dependencies': 'pmdarima'}
The y_inner_type
and X_inner_type
indicate whether the forecaster can deal with panel or hierarchical data natively - if an panel or hierarchical mtype occurs here, it does (see data types tutorial).
An explanation for all tags can be obtained using the all_tags
utility, see Section 2.2.3.
To list forecasters with their tags, the all_estimators
utility can be used with its return_tags
argument.
The resulting data frame can then be used for table queries or sub-setting.
from aeon.registry import all_estimators
all_estimators(
"forecaster", as_dataframe=True, return_tags=["y_input_type", "requires-fh-in-fit"]
)
name | estimator | y_input_type | requires-fh-in-fit | |
---|---|---|---|---|
0 | ARDL | <class 'aeon.forecasting.ardl.ARDL'> | univariate | False |
1 | ARIMA | <class 'aeon.forecasting.arima.ARIMA'> | univariate | False |
2 | AutoARIMA | <class 'aeon.forecasting.arima.AutoARIMA'> | univariate | False |
3 | AutoETS | <class 'aeon.forecasting.ets.AutoETS'> | univariate | False |
4 | AutoEnsembleForecaster | <class 'aeon.forecasting.compose._ensemble.Aut... | univariate | False |
5 | BATS | <class 'aeon.forecasting.bats.BATS'> | univariate | False |
6 | BaggingForecaster | <class 'aeon.forecasting.compose._bagging.Bagg... | univariate | False |
7 | ColumnEnsembleForecaster | <class 'aeon.forecasting.compose._column_ensem... | both | False |
8 | ConformalIntervals | <class 'aeon.forecasting.conformal.ConformalIn... | univariate | False |
9 | Croston | <class 'aeon.forecasting.croston.Croston'> | univariate | False |
10 | DirRecTabularRegressionForecaster | <class 'aeon.forecasting.compose._reduce.DirRe... | univariate | True |
11 | DirRecTimeSeriesRegressionForecaster | <class 'aeon.forecasting.compose._reduce.DirRe... | univariate | True |
12 | DirectTabularRegressionForecaster | <class 'aeon.forecasting.compose._reduce.Direc... | univariate | True |
13 | DirectTimeSeriesRegressionForecaster | <class 'aeon.forecasting.compose._reduce.Direc... | univariate | True |
14 | DontUpdate | <class 'aeon.forecasting.stream._update.DontUp... | univariate | False |
15 | DynamicFactor | <class 'aeon.forecasting.dynamic_factor.Dynami... | multivariate | False |
16 | EnsembleForecaster | <class 'aeon.forecasting.compose._ensemble.Ens... | both | False |
17 | ExponentialSmoothing | <class 'aeon.forecasting.exp_smoothing.Exponen... | univariate | False |
18 | ForecastByLevel | <class 'aeon.forecasting.compose._grouped.Fore... | both | False |
19 | ForecastX | <class 'aeon.forecasting.compose._pipeline.For... | univariate | True |
20 | ForecastingGridSearchCV | <class 'aeon.forecasting.model_selection._tune... | both | False |
21 | ForecastingPipeline | <class 'aeon.forecasting.compose._pipeline.For... | both | False |
22 | ForecastingRandomizedSearchCV | <class 'aeon.forecasting.model_selection._tune... | both | False |
23 | HierarchyEnsembleForecaster | <class 'aeon.forecasting.compose._hierarchy_en... | both | False |
24 | MockForecaster | <class 'aeon.testing.mock_estimators._mock_for... | both | False |
25 | MockUnivariateForecasterLogger | <class 'aeon.testing.mock_estimators._mock_for... | univariate | False |
26 | MultioutputTabularRegressionForecaster | <class 'aeon.forecasting.compose._reduce.Multi... | univariate | True |
27 | MultioutputTimeSeriesRegressionForecaster | <class 'aeon.forecasting.compose._reduce.Multi... | univariate | True |
28 | MultiplexForecaster | <class 'aeon.forecasting.compose._multiplexer.... | both | False |
29 | NaiveForecaster | <class 'aeon.forecasting.naive.NaiveForecaster'> | univariate | False |
30 | NaiveVariance | <class 'aeon.forecasting.naive.NaiveVariance'> | univariate | False |
31 | OnlineEnsembleForecaster | <class 'aeon.forecasting.online_learning._onli... | univariate | False |
32 | Permute | <class 'aeon.forecasting.compose._pipeline.Per... | both | False |
33 | PolynomialTrendForecaster | <class 'aeon.forecasting.trend.PolynomialTrend... | univariate | False |
34 | Prophet | <class 'aeon.forecasting.fbprophet.Prophet'> | univariate | False |
35 | ReconcilerForecaster | <class 'aeon.forecasting.reconcile.ReconcilerF... | univariate | False |
36 | RecursiveTabularRegressionForecaster | <class 'aeon.forecasting.compose._reduce.Recur... | univariate | False |
37 | RecursiveTimeSeriesRegressionForecaster | <class 'aeon.forecasting.compose._reduce.Recur... | univariate | False |
38 | SARIMAX | <class 'aeon.forecasting.sarimax.SARIMAX'> | univariate | False |
39 | STLForecaster | <class 'aeon.forecasting.trend.STLForecaster'> | univariate | False |
40 | SquaringResiduals | <class 'aeon.forecasting.squaring_residuals.Sq... | univariate | True |
41 | StackingForecaster | <class 'aeon.forecasting.compose._stack.Stacki... | univariate | True |
42 | StatsForecastAutoARIMA | <class 'aeon.forecasting.statsforecast.StatsFo... | univariate | False |
43 | TBATS | <class 'aeon.forecasting.tbats.TBATS'> | univariate | False |
44 | ThetaForecaster | <class 'aeon.forecasting.theta.ThetaForecaster'> | univariate | False |
45 | ThetaModularForecaster | <class 'aeon.forecasting.theta.ThetaModularFor... | univariate | False |
46 | TransformedTargetForecaster | <class 'aeon.forecasting.compose._pipeline.Tra... | both | False |
47 | TrendForecaster | <class 'aeon.forecasting.trend.TrendForecaster'> | univariate | False |
48 | UnobservedComponents | <class 'aeon.forecasting.structural.Unobserved... | univariate | False |
49 | UpdateEvery | <class 'aeon.forecasting.stream._update.Update... | univariate | False |
50 | UpdateRefitsEvery | <class 'aeon.forecasting.stream._update.Update... | univariate | False |
51 | VAR | <class 'aeon.forecasting.var.VAR'> | multivariate | False |
52 | VARMAX | <class 'aeon.forecasting.varmax.VARMAX'> | multivariate | False |
53 | VECM | <class 'aeon.forecasting.vecm.VECM'> | multivariate | False |
To filter beforehand on certain tags and tag values, the filter_tags
argument can be used:
# this lists all forecasters that can deal with multivariate data
all_estimators(
"forecaster",
as_dataframe=True,
filter_tags={"y_input_type": ["multivariate", "both"]},
)
name | estimator | |
---|---|---|
0 | ColumnEnsembleForecaster | <class 'aeon.forecasting.compose._column_ensem... |
1 | DynamicFactor | <class 'aeon.forecasting.dynamic_factor.Dynami... |
2 | EnsembleForecaster | <class 'aeon.forecasting.compose._ensemble.Ens... |
3 | ForecastByLevel | <class 'aeon.forecasting.compose._grouped.Fore... |
4 | ForecastingGridSearchCV | <class 'aeon.forecasting.model_selection._tune... |
5 | ForecastingPipeline | <class 'aeon.forecasting.compose._pipeline.For... |
6 | ForecastingRandomizedSearchCV | <class 'aeon.forecasting.model_selection._tune... |
7 | HierarchyEnsembleForecaster | <class 'aeon.forecasting.compose._hierarchy_en... |
8 | MockForecaster | <class 'aeon.testing.mock_estimators._mock_for... |
9 | MultiplexForecaster | <class 'aeon.forecasting.compose._multiplexer.... |
10 | Permute | <class 'aeon.forecasting.compose._pipeline.Per... |
11 | TransformedTargetForecaster | <class 'aeon.forecasting.compose._pipeline.Tra... |
12 | VAR | <class 'aeon.forecasting.var.VAR'> |
13 | VARMAX | <class 'aeon.forecasting.varmax.VARMAX'> |
14 | VECM | <class 'aeon.forecasting.vecm.VECM'> |
Important note: as said above, tag values can depend on hyper-parameter settings, e.g., a ForecastingPipeline
can handle multivariate data only if the forecaster in it can handle multivariate data.
In retrieval as above, the tags for a class are usually set to indicate the most general potential value, e.g., if for some parameter choice the estimator can handle multivariate, it will appear on the list.
To list all forecaster tags with an explanation of the tag, the all_tags
utility can be used:
import pandas as pd
from aeon.registry import all_tags
# wrapping this in a pandas DataFrame for pretty display
pd.DataFrame(all_tags(estimator_types="forecaster"))[[0, 3]]
0 | 3 | |
---|---|---|
0 | X-y-must-have-same-index | do X/y in fit/update and X/fh in predict have ... |
1 | X_inner_type | which data structure is the internal _fit/_pre... |
2 | capability:pred_int | does the forecaster implement predict_interval... |
3 | capability:pred_var | does the forecaster implement predict_variance? |
4 | enforce_index_type | passed to input checks, input conversion index... |
5 | ignores-exogeneous-X | does forecaster ignore exogeneous data (X)? |
6 | remember_data | whether estimator remembers all data seen as s... |
7 | requires-fh-in-fit | does forecaster require fh passed already in f... |
8 | y_inner_type | which data structure is the internal _fit/_pre... |
9 | y_input_type | which series type does the forecaster support?... |
aeon
¶aeon
supports a number of commonly used forecasters, many of them interfaced from state-of-art forecasting packages. All forecasters are available under the unified aeon
interface.
Some classes that are currently stably supported are:
ExponentialSmoothing
, ThetaForecaster
, and autoETS
from statsmodels
ARIMA
and AutoARIMA
from pmdarima
AutoARIMA
from statsforecast
BATS
and TBATS
from tbats
PolynomialTrend
for forecasting polynomial trendsProphet
which interfaces Facebook prophet
This is not the full list, use all_estimators
as demonstrated in Sections 2.1 and 2.2 for that.
For illustration, all estimators below will be presented on the basic forecasting workflow - though they also support the advanced forecasting and evaluation workflows under the unified aeon
interface (see Section 1).
For use in the other workflows, simply replace the "forecaster specification block" ("forecaster=
") by the forecaster specification block in the examples presented below.
# imports necessary for this chapter
from aeon.datasets import load_airline
from aeon.forecasting.base import ForecastingHorizon
from aeon.forecasting.model_selection import temporal_train_test_split
from aeon.performance_metrics.forecasting import mean_absolute_percentage_error
from aeon.visualisation import plot_series
# data loading for illustration (see section 1 for explanation)
y = load_airline()
y_train, y_test = temporal_train_test_split(y, test_size=36)
fh = ForecastingHorizon(y_test.index, is_relative=False)
statsmodels
¶aeon
interfaces a number of statistical forecasting algorithms from statsmodels
: exponential smoothing, theta, and auto-ETS.
For example, to use exponential smoothing with an additive trend component and multiplicative seasonality on the airline data set, we can write the following. Note that since this is monthly data, a good choic for seasonal periodicity (sp) is 12 (= hypothesized periodicity of a year).
from aeon.forecasting.exp_smoothing import ExponentialSmoothing
forecaster = ExponentialSmoothing(trend="add", seasonal="additive", sp=12)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
mean_absolute_percentage_error(y_test, y_pred, symmetric=False)
0.0511419028267833
The exponential smoothing of state space model can also be automated similar
to the ets function in R. This is implemented in the AutoETS
forecaster.
from aeon.forecasting.ets import AutoETS
forecaster = AutoETS(auto=True, sp=12, n_jobs=-1)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
mean_absolute_percentage_error(y_test, y_pred, symmetric=False)
0.06186354612548242
aeon
interfaces pmdarima
for its ARIMA class models.
For a classical ARIMA model with set parameters, use the ARIMA
forecaster:
from aeon.forecasting.arima import ARIMA
forecaster = ARIMA(
order=(1, 1, 0), seasonal_order=(0, 1, 0, 12), suppress_warnings=True
)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
mean_absolute_percentage_error(y_test, y_pred, symmetric=False)
0.04356744949188808
AutoARIMA
is an automatically tuned ARIMA
variant that obtains the optimal pdq parameters automatically:
from aeon.forecasting.arima import AutoARIMA
forecaster = AutoARIMA(sp=12, suppress_warnings=True)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
mean_absolute_percentage_error(y_test, y_pred, symmetric=False)
0.04148971434138202
forecaster = AutoARIMA(sp=12, suppress_warnings=True)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
mean_absolute_percentage_error(y_pred, y_test)
0.04093675930591672
# to obtain the fitted parameters, run
forecaster.get_fitted_params()
# should these not include pdq?
{'ar.L1': -0.2411177593160947, 'sigma2': 92.74985957133522, 'order': (1, 1, 0), 'seasonal_order': (0, 1, 0, 12), 'aic': 704.0011679025909, 'aicc': 704.1316026851996, 'bic': 709.108921685792, 'hqic': 706.0650836395923}
from aeon.forecasting.bats import BATS
forecaster = BATS(sp=12, use_trend=True, use_box_cox=False)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
mean_absolute_percentage_error(y_test, y_pred, symmetric=False)
0.08185559025534275
from aeon.forecasting.tbats import TBATS
forecaster = TBATS(sp=12, use_trend=True, use_box_cox=False)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
mean_absolute_percentage_error(y_test, y_pred, symmetric=False)
0.08024090844021753
from aeon.forecasting.fbprophet import Prophet
The current interface does not support period indices, only pd.DatetimeIndex. Consider improving this by contributing the aeon
.
# Convert index to pd.DatetimeIndex
z = y.copy()
z = z.to_timestamp(freq="M")
z_train, z_test = temporal_train_test_split(z, test_size=36)
forecaster = Prophet(
seasonality_mode="multiplicative",
n_changepoints=int(len(y_train) / 12),
add_country_holidays={"country_name": "Germany"},
yearly_seasonality=True,
weekly_seasonality=False,
daily_seasonality=False,
)
forecaster.fit(z_train)
y_pred = forecaster.predict(fh.to_relative(cutoff=y_train.index[-1]))
y_pred.index = y_test.index
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
mean_absolute_percentage_error(y_test, y_pred, symmetric=False)
16:52:01 - cmdstanpy - INFO - Chain [1] start processing 16:52:01 - cmdstanpy - INFO - Chain [1] done processing
0.0714009805178252
We can also use the UnobservedComponents class from statsmodels to generate predictions using a state space model.
from aeon.forecasting.structural import UnobservedComponents
# We can model seasonality using Fourier modes as in the Prophet model.
forecaster = UnobservedComponents(
level="local linear trend", freq_seasonal=[{"period": 12, "harmonics": 10}]
)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
mean_absolute_percentage_error(y_test, y_pred, symmetric=False)
0.049737049410244434
aeon
supports a number of advanced composition patterns to create forecasters out of simpler components:
scikit-learn
regressors. A common example is feature/label tabulation by rolling window, aka the "direct reduction strategy".For illustration, all estimators below will be presented on the basic forecasting workflow - though they also support the advanced forecasting and evaluation workflows under the unified aeon
interface (see Section 1).
For use in the other workflows, simply replace the "forecaster specification block" ("forecaster=
") by the forecaster specification block in the examples presented below.
# imports necessary for this chapter
from aeon.datasets import load_airline
from aeon.forecasting.base import ForecastingHorizon
from aeon.forecasting.model_selection import temporal_train_test_split
from aeon.performance_metrics.forecasting import mean_absolute_percentage_error
from aeon.visualisation import plot_series
# data loading for illustration (see section 1 for explanation)
y = load_airline()
y_train, y_test = temporal_train_test_split(y, test_size=36)
fh = ForecastingHorizon(y_test.index, is_relative=False)
aeon
provides a meta-estimator that allows the use of any scikit-learn
estimator for forecasting.
Example: we will define a tabulation reduction strategy to convert a k-nearest neighbors regressor (sklearn
KNeighborsRegressor
) into a forecaster. The composite algorithm is an object compliant with the aeon
forecaster interface (picture: big robot), and contains the regressor as a parameter accessible component (picture: little robot). In fit
, the composite algorithm uses a sliding window strategy to tabulate the data, and fit the regressor to the tabulated data (picture: left half). In predict
, the composite algorithm presents the regressor with the last observed window to obtain predictions (picture: right half).
Below, the composite is constructed using the shorthand function make_reduction
which produces a aeon
estimator of forecaster scitype. It is called with a constructed scikit-learn
regressor, regressor
, and additional parameter which can be later tuned as hyper-parameters
from sklearn.neighbors import KNeighborsRegressor
from aeon.forecasting.compose import make_reduction
regressor = KNeighborsRegressor(n_neighbors=1)
forecaster = make_reduction(regressor, window_length=15, strategy="recursive")
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
mean_absolute_percentage_error(y_test, y_pred, symmetric=False)
0.12887507224382988
In the above example we use the "recursive" reduction strategy. Other implemented strategies are:
Parameters can be inspected using scikit-learn
compatible get_params
functionality (and set using set_params
). This provides tunable and nested access to parameters of the KNeighborsRegressor
(as estimator_etc
), and the window_length
of the reduction strategy. Note that the strategy
is not accessible, as underneath the utility function this is mapped on separate algorithm classes. For tuning over algorithms, see the "autoML" section below.
forecaster.get_params()
{'estimator__algorithm': 'auto', 'estimator__leaf_size': 30, 'estimator__metric': 'minkowski', 'estimator__metric_params': None, 'estimator__n_jobs': None, 'estimator__n_neighbors': 1, 'estimator__p': 2, 'estimator__weights': 'uniform', 'estimator': KNeighborsRegressor(n_neighbors=1), 'pooling': 'local', 'transformers': None, 'window_length': 15}
A common composition motif is pipelining: for example, first deseasonalizing or detrending the data, then forecasting the detrended/deseasonalized series. When forecasting, one needs to add the trend and seasonal component back to the data.
aeon
provides a generic pipeline object for this kind of composite modelling, the TransforemdTargetForecaster
. It chains an arbitrary number of transformations with a forecaster. The transformations can either be pre-processing transformations or a post-processing transformations. An example of a forecaster with pre-processing
transformations can be seen below.
from aeon.forecasting.arima import ARIMA
from aeon.forecasting.compose import TransformedTargetForecaster
from aeon.transformations.detrend import Deseasonalizer
forecaster = TransformedTargetForecaster(
[
("deseasonalize", Deseasonalizer(model="multiplicative", sp=12)),
("forecast", ARIMA()),
]
)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
mean_absolute_percentage_error(y_test, y_pred, symmetric=False)
0.1396997352629755
In the above example, the TransformedTargetForecaster
is constructed with a list of steps, each a pair of name and estimator, where the last estimator is a forecaster scitype. The pre-processing transformers should be series-to-series transformers which possess both a transform
and an inverse_transform
method. The resulting estimator is of forecaster scitype and has all interface defining methods. In fit
, all transformers apply fit_transforms
to the data, then the forecaster's fit
; in predict
, first the forecaster's predict
is applied, then the transformers' inverse_transform
in reverse order.
The same pipeline, as above, can also be constructed with the multiplication dunder method *
.
This creates a TransformedTargetForecaster
as above, with components given default names.
forecaster = Deseasonalizer(model="multiplicative", sp=12) * ARIMA()
forecaster
TransformedTargetForecaster(steps=[Deseasonalizer(model='multiplicative', sp=12), ARIMA()])In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
TransformedTargetForecaster(steps=[Deseasonalizer(model='multiplicative', sp=12), ARIMA()])
The names in a dunder constructed pipeline are made unique in case, e.g., two deseasonalizers are used.
Example of a multiple seasonality model:
forecaster = (
Deseasonalizer(model="multiplicative", sp=12)
* Deseasonalizer(model="multiplicative", sp=3)
* ARIMA()
)
forecaster.get_params()
{'steps': [Deseasonalizer(model='multiplicative', sp=12), Deseasonalizer(model='multiplicative', sp=3), ARIMA()], 'Deseasonalizer_1': Deseasonalizer(model='multiplicative', sp=12), 'Deseasonalizer_2': Deseasonalizer(model='multiplicative', sp=3), 'ARIMA': ARIMA(), 'Deseasonalizer_1__model': 'multiplicative', 'Deseasonalizer_1__sp': 12, 'Deseasonalizer_2__model': 'multiplicative', 'Deseasonalizer_2__sp': 3, 'ARIMA__concentrate_scale': False, 'ARIMA__enforce_invertibility': True, 'ARIMA__enforce_stationarity': True, 'ARIMA__hamilton_representation': False, 'ARIMA__maxiter': 50, 'ARIMA__measurement_error': False, 'ARIMA__method': 'lbfgs', 'ARIMA__mle_regression': True, 'ARIMA__order': (1, 0, 0), 'ARIMA__out_of_sample_size': 0, 'ARIMA__scoring': 'mse', 'ARIMA__scoring_args': None, 'ARIMA__seasonal_order': (0, 0, 0, 0), 'ARIMA__simple_differencing': False, 'ARIMA__start_params': None, 'ARIMA__suppress_warnings': False, 'ARIMA__time_varying_regression': False, 'ARIMA__trend': None, 'ARIMA__with_intercept': True}
We can also create a pipeline with post-processing transformations, these are transformations after the forecaster, in a dunder pipeline or a TransformedTargetForecaster
.
Below is an example of a multiple seasonality model, with integer rounding post-processing of the predictions:
from aeon.transformations.func_transform import FunctionTransformer
forecaster = ARIMA() * FunctionTransformer(lambda y: y.round())
forecaster.fit_predict(y, fh=fh).head(3)
1958-01 334.0 1958-02 338.0 1958-03 317.0 Freq: M, dtype: float64
Both pre- and post-processing transformers can be present, in this case the post-processing transformations will be applied after the inverse-transform
of the pre-processing ones.
forecaster = (
Deseasonalizer(model="multiplicative", sp=12)
* Deseasonalizer(model="multiplicative", sp=3)
* ARIMA()
* FunctionTransformer(lambda y: y.round())
)
forecaster.fit_predict(y_train, fh=fh).head(3)
Period 1958-01 339.0 1958-02 334.0 1958-03 381.0 Freq: M, dtype: float64
Detrender
as pipeline component¶For detrending, we can use the Detrender
. This is an estimator of series-to-transformer scitype that wraps an arbitrary forecaster. For example, for linear detrending, we can use PolynomialTrendForecaster
to fit a linear trend, and then subtract/add it using the Detrender
transformer inside TransformedTargetForecaster
.
To understand better what happens, we first examine the detrender separately:
from aeon.forecasting.trend import PolynomialTrendForecaster
from aeon.transformations.detrend import Detrender
# linear detrending
forecaster = PolynomialTrendForecaster(degree=1)
transformer = Detrender(forecaster=forecaster)
yt = transformer.fit_transform(y_train)
# internally, the Detrender uses the in-sample predictions
# of the PolynomialTrendForecaster
forecaster = PolynomialTrendForecaster(degree=1)
fh_ins = -np.arange(len(y_train)) # in-sample forecasting horizon
y_pred = forecaster.fit(y_train).predict(fh=fh_ins)
plot_series(y_train, y_pred, yt, labels=["y_train", "fitted linear trend", "residuals"]);
Since the Detrender
is of scitype series-to-series-transformer, it can be used in the TransformedTargetForecaster
for detrending any forecaster:
forecaster = TransformedTargetForecaster(
[
("deseasonalize", Deseasonalizer(model="multiplicative", sp=12)),
("detrend", Detrender(forecaster=PolynomialTrendForecaster(degree=1))),
("forecast", ARIMA()),
]
)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
mean_absolute_percentage_error(y_test, y_pred, symmetric=False)
0.0561016822006403
aeon
follows the scikit-learn
philosophy of composability and nested parameter inspection. As long as an estimator has the right scitype, it can be used as part of any composition principle requiring that scitype. Above, we have already seen the example of a forecaster inside a Detrender
, which is an estimator of scitype series-to-series-transformer, with one component of forecaster scitype. Similarly, in a TransformedTargetForecaster
, we can use the reduction composite from Section 3.1 as the last forecaster element in the pipeline, which inside has an estimator of tabular regressor scitype, the KNeighborsRegressor
:
from sklearn.neighbors import KNeighborsRegressor
from aeon.forecasting.compose import make_reduction
forecaster = TransformedTargetForecaster(
[
("deseasonalize", Deseasonalizer(model="multiplicative", sp=12)),
("detrend", Detrender(forecaster=PolynomialTrendForecaster(degree=1))),
(
"forecast",
make_reduction(
KNeighborsRegressor(),
window_length=15,
strategy="recursive",
),
),
]
)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
mean_absolute_percentage_error(y_test, y_pred, symmetric=False)
0.05870838788931646
As with scikit-learn
models, we can inspect and access parameters of any component via get_params
and set_params
:
forecaster.get_params()
{'steps': [('deseasonalize', Deseasonalizer(model='multiplicative', sp=12)), ('detrend', Detrender(forecaster=PolynomialTrendForecaster())), ('forecast', RecursiveTabularRegressionForecaster(estimator=KNeighborsRegressor(), window_length=15))], 'deseasonalize': Deseasonalizer(model='multiplicative', sp=12), 'detrend': Detrender(forecaster=PolynomialTrendForecaster()), 'forecast': RecursiveTabularRegressionForecaster(estimator=KNeighborsRegressor(), window_length=15), 'deseasonalize__model': 'multiplicative', 'deseasonalize__sp': 12, 'detrend__forecaster__degree': 1, 'detrend__forecaster__regressor': None, 'detrend__forecaster__with_intercept': True, 'detrend__forecaster': PolynomialTrendForecaster(), 'detrend__model': 'additive', 'forecast__estimator__algorithm': 'auto', 'forecast__estimator__leaf_size': 30, 'forecast__estimator__metric': 'minkowski', 'forecast__estimator__metric_params': None, 'forecast__estimator__n_jobs': None, 'forecast__estimator__n_neighbors': 5, 'forecast__estimator__p': 2, 'forecast__estimator__weights': 'uniform', 'forecast__estimator': KNeighborsRegressor(), 'forecast__pooling': 'local', 'forecast__transformers': None, 'forecast__window_length': 15}
aeon
provides parameter tuning strategies as compositors of forecaster scitype, similar to scikit-learn
's GridSearchCV
.
ForecastingGridSearchCV
¶The compositor ForecastingGridSearchCV
(and other tuners) are constructed with a forecaster to tune, a cross-validation constructor, a scikit-learn
parameter grid, and parameters specific to the tuning strategy. Cross-validation constructors follow the scikit-learn
interface for re-samplers, and can be slotted in exchangeably.
As an example, we show tuning of the window length in the reduction compositor from Section 3.1, using temporal sliding window tuning:
from sklearn.neighbors import KNeighborsRegressor
from aeon.forecasting.compose import make_reduction
from aeon.forecasting.model_selection import (
ForecastingGridSearchCV,
SlidingWindowSplitter,
)
regressor = KNeighborsRegressor()
forecaster = make_reduction(regressor, window_length=15, strategy="recursive")
param_grid = {"window_length": [7, 12, 15]}
# We fit the forecaster on an initial window which is 80% of the historical data
# then use temporal sliding window cross-validation to find the optimal hyper-parameters
cv = SlidingWindowSplitter(initial_window=int(len(y_train) * 0.8), window_length=20)
gscv = ForecastingGridSearchCV(
forecaster, strategy="refit", cv=cv, param_grid=param_grid
)
As with other composites, the resulting forecaster provides the unified interface of aeon
forecasters - window splitting, tuning, etc requires no manual effort and is done behind the unified interface:
gscv.fit(y_train)
y_pred = gscv.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
mean_absolute_percentage_error(y_test, y_pred, symmetric=False)
0.16607972017556033
Tuned parameters can be accessed in the best_params_
attribute:
gscv.best_params_
{'window_length': 7}
An instance of the best forecaster, with hyper-parameters set, can be retrieved by accessing the best_forecaster_
attribute:
gscv.best_forecaster_
RecursiveTabularRegressionForecaster(estimator=KNeighborsRegressor(), window_length=7)In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
RecursiveTabularRegressionForecaster(estimator=KNeighborsRegressor(), window_length=7)
KNeighborsRegressor()
KNeighborsRegressor()
As in scikit-learn
, parameters of nested components can be tuned by accessing their get_params
key - by default this is [estimatorname]__[parametername]
if [estimatorname]
is the name of the component, and [parametername]
the name of a parameter within the estimator [estimatorname]
.
For example, below we tune the KNeighborsRegressor
component's n_neighbors
, in addition to tuning window_length
. The tuneable parameters can easily be queried using forecaster.get_params()
.
from sklearn.neighbors import KNeighborsRegressor
from aeon.forecasting.compose import make_reduction
from aeon.forecasting.model_selection import (
ForecastingGridSearchCV,
SlidingWindowSplitter,
)
param_grid = {"window_length": [7, 12, 15], "estimator__n_neighbors": np.arange(1, 10)}
regressor = KNeighborsRegressor()
forecaster = make_reduction(regressor, strategy="recursive")
cv = SlidingWindowSplitter(initial_window=int(len(y_train) * 0.8), window_length=30)
gscv = ForecastingGridSearchCV(forecaster, cv=cv, param_grid=param_grid)
gscv.fit(y_train)
y_pred = gscv.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
mean_absolute_percentage_error(y_test, y_pred, symmetric=False)
0.13988948769413537
gscv.best_params_
{'estimator__n_neighbors': 2, 'window_length': 12}
An alternative to the above is tuning the regressor separately, using scikit-learn
's GridSearchCV
and a separate parameter grid. As this does not use the "overall" performance metric to tune the inner regressor, performance of the composite forecaster may vary.
from sklearn.model_selection import GridSearchCV
# tuning the 'n_estimator' hyperparameter of RandomForestRegressor from scikit-learn
regressor_param_grid = {"n_neighbors": np.arange(1, 10)}
forecaster_param_grid = {"window_length": [7, 12, 15]}
# create a tunnable regressor with GridSearchCV
regressor = GridSearchCV(KNeighborsRegressor(), param_grid=regressor_param_grid)
forecaster = make_reduction(regressor, strategy="recursive")
cv = SlidingWindowSplitter(initial_window=int(len(y_train) * 0.8), window_length=30)
gscv = ForecastingGridSearchCV(forecaster, cv=cv, param_grid=forecaster_param_grid)
gscv.fit(y_train)
y_pred = gscv.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
mean_absolute_percentage_error(y_test, y_pred, symmetric=False)
0.14493362646957736
NOTE: a smart implementation of this would use caching to save partial results from the inner tuning and reduce runtime substantially - currently aeon
does not support this. Consider helping to improve aeon
.
All tuning algorithms in aeon
allow the user to set a score; for forecasting the default is mean absolute percentage error. The score can be set using the score
argument, to any scorer function or class, as in Section 1.3.
Re-sampling tuners retain performances on individual forecast re-sample folds, which can be retrieved from the cv_results_
argument after the forecaster has been fit via a call to fit
.
In the above example, using the mean squared error instead of the mean absolute percentage error for tuning would be done by defining the forecaster as follows:
from aeon.performance_metrics.forecasting import mean_squared_error
mse = mean_squared_error
param_grid = {"window_length": [7, 12, 15]}
regressor = KNeighborsRegressor()
cv = SlidingWindowSplitter(initial_window=int(len(y_train) * 0.8), window_length=30)
gscv = ForecastingGridSearchCV(forecaster, cv=cv, param_grid=param_grid, scoring=mse)
The performances on individual folds can be accessed as follows, after fitting:
gscv.fit(y_train)
gscv.cv_results_
mean_test_mean_squared_error | mean_fit_time | mean_pred_time | params | rank_test_mean_squared_error | |
---|---|---|---|---|---|
0 | 2600.750255 | 0.072074 | 0.002209 | {'window_length': 7} | 3.0 |
1 | 1134.999053 | 0.097644 | 0.002344 | {'window_length': 12} | 1.0 |
2 | 1285.133614 | 0.117036 | 0.003169 | {'window_length': 15} | 2.0 |
aeon
provides a number of compositors for ensembling and automated model selection. In contrast to tuning, which uses data-driven strategies to find optimal hyper-parameters for a fixed forecaster, the strategies in this section combine or select on the level of estimators, using a collection of forecasters to combine or select from.
The strategies discussed in this section are:
The most flexible way to perform model selection over forecasters is by using the MultiplexForecaster
, which exposes the choice of a forecaster from a list as a hyper-parameter that is tunable by generic hyper-parameter tuning strategies such as in Section 3.3.
In isolation, MultiplexForecaster
is constructed with a named list forecasters
, of forecasters. It has a single hyper-parameter, selected_forecaster
, which can be set to the name of any forecaster in forecasters
, and behaves exactly like the forecaster keyed in forecasters
by selected_forecaster
.
from aeon.forecasting.compose import MultiplexForecaster
from aeon.forecasting.exp_smoothing import ExponentialSmoothing
from aeon.forecasting.naive import NaiveForecaster
forecaster = MultiplexForecaster(
forecasters=[
("naive", NaiveForecaster(strategy="last")),
("ets", ExponentialSmoothing(trend="add", sp=12)),
],
)
forecaster.set_params(**{"selected_forecaster": "naive"})
# now forecaster behaves like NaiveForecaster(strategy="last")
MultiplexForecaster(forecasters=[('naive', NaiveForecaster()), ('ets', ExponentialSmoothing(sp=12, trend='add'))], selected_forecaster='naive')In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
MultiplexForecaster(forecasters=[('naive', NaiveForecaster()), ('ets', ExponentialSmoothing(sp=12, trend='add'))], selected_forecaster='naive')
forecaster.set_params(**{"selected_forecaster": "ets"})
# now forecaster behaves like ExponentialSmoothing(trend="add", sp=12))
MultiplexForecaster(forecasters=[('naive', NaiveForecaster()), ('ets', ExponentialSmoothing(sp=12, trend='add'))], selected_forecaster='ets')In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
MultiplexForecaster(forecasters=[('naive', NaiveForecaster()), ('ets', ExponentialSmoothing(sp=12, trend='add'))], selected_forecaster='ets')
The MultiplexForecaster
is not too useful in isolation, but allows for flexible autoML when combined with a tuning wrapper. The below defines a forecaster that selects one of NaiveForecaster
and ExponentialSmoothing
by sliding window tuning as in Section 3.3.
Combined with rolling use of the forecaster via the update
functionality (see Section 1.4), the tuned multiplexer can switch back and forth between NaiveForecaster
and ExponentialSmoothing
, depending on performance, as time progresses.
from aeon.forecasting.model_selection import (
ForecastingGridSearchCV,
SlidingWindowSplitter,
)
forecaster = MultiplexForecaster(
forecasters=[
("naive", NaiveForecaster(strategy="last")),
("ets", ExponentialSmoothing(trend="add", sp=12)),
]
)
cv = SlidingWindowSplitter(initial_window=int(len(y_train) * 0.5), window_length=30)
forecaster_param_grid = {"selected_forecaster": ["ets", "naive"]}
gscv = ForecastingGridSearchCV(forecaster, cv=cv, param_grid=forecaster_param_grid)
gscv.fit(y_train)
y_pred = gscv.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
mean_absolute_percentage_error(y_test, y_pred, symmetric=False)
0.19886711926999853
As with any tuned forecaster, best parameters and an instance of the tuned forecaster can be retrieved using best_params_
and best_forecaster_
:
gscv.best_params_
{'selected_forecaster': 'naive'}
gscv.best_forecaster_
MultiplexForecaster(forecasters=[('naive', NaiveForecaster()), ('ets', ExponentialSmoothing(sp=12, trend='add'))], selected_forecaster='naive')In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
MultiplexForecaster(forecasters=[('naive', NaiveForecaster()), ('ets', ExponentialSmoothing(sp=12, trend='add'))], selected_forecaster='naive')
OptimalPassthrough
¶aeon
also provides capabilities for automated selection of pipeline components inside a pipeline, i.e., pipeline structure. This is achieved with the OptionalPassthrough
transformer.
The OptionalPassthrough
transformer allows to tune whether a transformer inside a pipeline is applied to the data or not. For example, if we want to tune whether sklearn.StandardScaler
is bringing an advantage to the forecast or not, we wrap it in OptionalPassthrough
. Internally, OptionalPassthrough
has a hyperparameter passthrough: bool
that is tuneable; when False
the composite behaves like the wrapped transformer, when True
, it ignores the transformer within.
To make effective use of OptionalPasstrhough
, define a suitable parameter set using the __
(double underscore) notation familiar from scikit-learn
. This allows to access and tune attributes of nested objects like TabularToSeriesAdaptor(StandardScaler()). We can use __
multiple times if we have more than two levels of nesting.
In the following example, we take a deseasonalize/scale pipeline and tune over the four possible combinations of deseasonalizer and scaler being included in the pipeline yes/no (2 times 2 = 4); as well as over the forecaster's and the scaler's parameters.
Note: this could be arbitrarily combined with MultiplexForecaster
, as in Section 3.4.1, to select over pipeline architecture as well as over pipeline structure.
Note: scikit-learn
and aeon
do not support conditional parameter sets at current (unlike, e.g., the mlr3
package). This means that the grid search will optimize over the scaler
's parameters even when it is skipped. Designing/implementing this capability would be an interesting area for contributions or research.
from sklearn.preprocessing import StandardScaler
from aeon.forecasting.compose import TransformedTargetForecaster
from aeon.forecasting.model_selection import (
ForecastingGridSearchCV,
SlidingWindowSplitter,
)
from aeon.forecasting.naive import NaiveForecaster
from aeon.transformations.adapt import TabularToSeriesAdaptor
from aeon.transformations.compose import OptionalPassthrough
from aeon.transformations.detrend import Deseasonalizer
# create pipeline
pipe = TransformedTargetForecaster(
steps=[
("deseasonalizer", OptionalPassthrough(Deseasonalizer())),
("scaler", OptionalPassthrough(TabularToSeriesAdaptor(StandardScaler()))),
("forecaster", NaiveForecaster()),
]
)
# putting it all together in a grid search
cv = SlidingWindowSplitter(
initial_window=60, window_length=24, start_with_window=True, step_length=24
)
param_grid = {
"deseasonalizer__passthrough": [True, False],
"scaler__transformer__transformer__with_mean": [True, False],
"scaler__passthrough": [True, False],
"forecaster__strategy": ["drift", "mean", "last"],
}
gscv = ForecastingGridSearchCV(forecaster=pipe, param_grid=param_grid, cv=cv, n_jobs=-1)
gscv.fit(y_train)
y_pred = gscv.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
mean_absolute_percentage_error(y_test, y_pred, symmetric=False)
0.1299046419013891
TODO - contributions in this section are appreciated
from aeon.forecasting.compose import EnsembleForecaster
ses = ExponentialSmoothing(sp=12)
holt = ExponentialSmoothing(trend="add", damped_trend=False, sp=12)
damped = ExponentialSmoothing(trend="add", damped_trend=True, sp=12)
forecaster = EnsembleForecaster(
[
("ses", ses),
("holt", holt),
("damped", damped),
]
)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
mean_absolute_percentage_error(y_test, y_pred, symmetric=False)
0.1661796803565588
For model evaluation, we sometimes want to evaluate multiple forecasts, using temporal cross-validation with a sliding window over the test data. For this purpose, we can leverage the forecasters from the online_forecasting
module which use a composite forecaster, PredictionWeightedEnsemble
, to keep track of the loss accumulated by each forecaster and create a prediction weighted by the predictions of the most "accurate" forecasters.
Note that the forecasting task is changed: we make 35 predictions since we need the first prediction to help update the weights, we do not predict 36 steps ahead.
from aeon.forecasting.online_learning import (
NormalHedgeEnsemble,
OnlineEnsembleForecaster,
)
from aeon.performance_metrics.forecasting import mean_squared_error
First we need to initialize a PredictionWeightedEnsembler
that will keep track of the loss accumulated by each forecaster and define which loss function we would like to use.
hedge_expert = NormalHedgeEnsemble(n_estimators=3, loss_func=mean_squared_error)
We can then create the forecaster by defining the individual forecasters and specifying the PredictionWeightedEnsembler
we are using. Then by fitting our forecasters and performing updates and prediction with the update_predict
function, we get:
forecaster = OnlineEnsembleForecaster(
[
("ses", ses),
("holt", holt),
("damped", damped),
],
ensemble_algorithm=hedge_expert,
)
forecaster.fit(y=y_train, fh=fh)
y_pred = forecaster.update_predict_single(y_test)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
mean_absolute_percentage_error(y_test, y_pred, symmetric=False)
0.0978975689038194
aeon
is meant to be easily extensible, for direct contribution to aeon
as well as for local/private extension with custom methods.
To get started:
python
file with todo
blocks that mark the places in which changes need to be added.aeon
or affiliated repository (if contributed extension), inside aeon.forecasting
; rename the file and update the file docstring appropriately.__init__
, _fit
, _predict
, and optional methods such as _update
(for details see the extension template). You can add private methods as long as they do not override the default public interface. For more details, see the extension template.aeon.testing.estimator_checks.check_estimator
on your estimator. You can call this on a class or object instance. Ensure you have specified test parameters in the get_test_params
method, according to the extension template.In case of direct contribution to aeon
or one of its affiliated packages, additionally:
CODEOWNERS
for the new estimator file(s).aeon
comes with several forecasting algorithms (or forecasters), all of which share a common interface. The interface is fully interoperable with the scikit-learn
interface, and provides dedicated interface points for forecasting in batch and rolling mode.
aeon
comes with rich composition functionality that allows to build complex pipelines easily, and connect easily with other parts of the open source ecosystem, such as scikit-learn
and individual algorithm libraries.
aeon
is easy to extend, and comes with user friendly tools to facilitate implementing and testing your own forecasters and composition principles.