The class reservoir_computing.RC_forecaster
allows to quickly perform forecasting by fitting a linear model that maps the reservoir states into the predictions. The linear is implemented as the ridge regressor from sklearn sklearn.linear_model.Ridge
.
It is however possible to use other regression models from sklearn, including those that computes confidence intervals obtaining, in this way, a probabilistic forecasting.
In this example we will use sklearn.ensemble.HistGradientBoostingRegressor
, a Gradient Boost Regression Tree (GBRT) that allows to compute different quantiles.
Let's start by importing the necessary libraries.
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.decomposition import PCA
from reservoir_computing.reservoir import Reservoir
from reservoir_computing.utils import make_forecasting_dataset
from reservoir_computing.datasets import PredLoader
np.random.seed(0) # For reproducibility
We will use the dataloader PredLoader
to get a forecasting datatset.
To see what datatsets are available, we can call the function available_datasets
. By setting details=True
we can get additional information.
downloader = PredLoader()
downloader.available_datasets(details=True) # Describe available datasets
Available datasets: ElecRome ----------- Univariate time series forecasting. Length: 137376 Features: 1 CDR ----------- Multivariate time series forecasting. Length: 3336 Features: 8
ElecRome
that is the electricity consumption coming from a backbone of the energy supply network in the city of Rome.3000
time steps to be faster in fitting the model.# Download data
ts_full = downloader.get_data("ElecRome")
# Resample the time series to hourly frequency
ts_hourly = np.mean(ts_full.reshape(-1, 6), axis=1)[:, None]
print("Resampled: ", ts_hourly.shape)
# Use only the first 3000 time steps
ts_small = ts_hourly[0:3000,:]
print("Resampled small: ", ts_small.shape)
Loaded ElecRome dataset. Data shape: X: (137376, 1) Resampled: (22896, 1) Resampled small: (3000, 1)
To train our regression model we need input and target data Xtr
and Ytr
.
We also need test data Xte
and Yte
to test our model and validation data Xval
and Yval
if we need to do hyperparameters tuning.
We will use the function make_forecasting_dataset
that given a time series X
does the following computations:
train
, val
and test
. The size of the chunks is given by the values val_percent
and test_percent
. If we do not need validation data, set val_percent=0
(default) and the validation data will not be created.X
and target data Y
by shifting the data horizon
time steps, where horizon
is how far we want to predict. For example:Xtr = train[:-horizon,:]
Ytr = train[horizon:,:]
sklearn.preprocessing
. If no scalers are passed, a StandardScaler
is created. The scaler is fit on Xtr
and then used to transform Ytr
, Xval
, and Xte
. Note that Yval
and Yte
are not transformed.The code below exemplifies the use of the function.
X = np.arange(36)[:, None]
Xtr, Ytr, Xte, Yte, Xval, Yval, scaler = make_forecasting_dataset(X, horizon=5,
test_percent=0.2,
val_percent=0.3)
print("Xtr: ", scaler.inverse_transform(Xtr.T)[0])
print("Ytr: ", scaler.inverse_transform(Ytr.T))
print("Xval: ", scaler.inverse_transform(Xval.T)[0])
print("Yval: ", Yval.T)
print("Xte: ", scaler.inverse_transform(Xte.T)[0])
print("Yte: ", Yte.T)
Xtr: [ 0. 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11.] Ytr: [[ 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15. 16.]] Xval: [17. 18. 19. 20. 21. 22.] Yval: [[22 23 24 25 26 27]] Xte: [28. 29. 30.] Yte: [[33 34 35]]
val_percent=0
and the validation data will not be returned).# Generate training and test datasets
Xtr, Ytr, Xte, Yte, scaler = make_forecasting_dataset(ts_small,
horizon=24, # forecast horizon of 24h ahead
test_percent = 0.1)
states_tr
and states_te
associated with the training and test data, respecitvely.res = Reservoir(n_internal_units=1000,
spectral_radius=0.95,
leak=None,
connectivity=0.25,
input_scaling=0.1,
noise_level=0.0,
circle=False)
n_drop=10
states_tr = res.get_states(Xtr[None,:,:], n_drop=n_drop, bidir=False)
states_te = res.get_states(Xte[None,:,:], n_drop=n_drop, bidir=False)
n_internal_units=1000
, we end up with a sequence of length T
of vectors with size 1000
.1000
to 75
.pca = PCA(n_components=75)
states_tr = pca.fit_transform(states_tr[0])
states_te = pca.transform(states_te[0])
# Fit the ridge regression model
ridge = Ridge(alpha=1.0)
ridge.fit(states_tr, Ytr[n_drop:,:])
# Compute the predictions
Yhat = ridge.predict(states_te)
Finally, we plot the results.
fig = plt.figure(figsize=(14,4))
plt.plot(Yte[n_drop:,:], 'k--', label="True", linewidth=2)
plt.plot(scaler.inverse_transform(Yhat), label="Predicted")
plt.grid()
plt.legend()
plt.title("True vs predicted electricity load")
plt.show()
# Quantile 0.5
max_iter = 100
gbrt_median = HistGradientBoostingRegressor(
loss="quantile", quantile=0.5, max_iter=max_iter)
gbrt_median.fit(states_tr, Ytr[n_drop:,0])
median_predictions = gbrt_median.predict(states_te)
# Quantile 0.05
gbrt_percentile_5 = HistGradientBoostingRegressor(
loss="quantile", quantile=0.05, max_iter=max_iter)
gbrt_percentile_5.fit(states_tr, Ytr[n_drop:,0])
percentile_5_predictions = gbrt_percentile_5.predict(states_te)
# Quantile 0.95
gbrt_percentile_95 = HistGradientBoostingRegressor(
loss="quantile", quantile=0.95, max_iter=max_iter)
gbrt_percentile_95.fit(states_tr, Ytr[n_drop:,0])
percentile_95_predictions = gbrt_percentile_95.predict(states_te)
Plot the results with the confidence 90% confidence intervals.
# Plot the results
fig = plt.figure(figsize=(14,4))
plt.plot(Yte[n_drop:,:], 'k--', label="True", linewidth=2)
plt.plot(scaler.inverse_transform(median_predictions[:,None]), label="Median prediction", color="tab:blue")
plt.fill_between(np.arange(len(Yte[n_drop:,:])), scaler.inverse_transform(percentile_5_predictions[:,None]).ravel(), scaler.inverse_transform(percentile_95_predictions[:,None]).ravel(), alpha=0.3, label="90% CI", color="tab:blue")
plt.grid()
plt.legend()
plt.title("Predicted electricity load using Gradient Boosting Regression Trees")
plt.show()