In this notebook, we demonstrate how to:

- prepare time series data for training an RNN forecasting model
- implement a simple encoder-decoder network in Keras
- train the model with early stopping and evaluate its multi-step forecasts on a test set
The data in this example is taken from the GEFCom2014 forecasting competition [1]. It consists of three years of hourly electricity load and temperature values between 2012 and 2014. The task is to forecast future values of the electricity load.
[1] Tao Hong, Pierre Pinson, Shu Fan, Hamidreza Zareipour, Alberto Troccoli and Rob J. Hyndman, "Probabilistic energy forecasting: Global Energy Forecasting Competition 2014 and beyond", International Journal of Forecasting, vol. 32, no. 3, pp. 896-913, July-September 2016.
import os
import warnings
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import datetime as dt
from collections import UserDict
from IPython.display import Image
%matplotlib inline
from common.utils import load_data, mape, TimeSeriesTensor, create_evaluation_df
pd.options.display.float_format = '{:,.2f}'.format
np.set_printoptions(precision=2)
warnings.filterwarnings("ignore")
Load the data into a Pandas dataframe
energy = load_data('data/')
energy.head()
| | load | temp |
|---|---|---|
| 2012-01-01 00:00:00 | 2,698.00 | 32.00 |
| 2012-01-01 01:00:00 | 2,558.00 | 32.67 |
| 2012-01-01 02:00:00 | 2,444.00 | 30.00 |
| 2012-01-01 03:00:00 | 2,402.00 | 31.00 |
| 2012-01-01 04:00:00 | 2,403.00 | 32.00 |
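load_data is a helper from common.utils whose implementation is not shown in this notebook. Assuming the GEFCom2014 extract lives in a single csv file with a timestamp column (both of these are assumptions), it presumably does something along these lines:

# Hypothetical sketch of load_data; the file name 'energy.csv' and the
# 'timestamp' column name are assumptions, not confirmed by this notebook.
def load_data_sketch(data_dir):
    # Read the hourly load/temperature extract and index it by time
    energy = pd.read_csv(os.path.join(data_dir, 'energy.csv'),
                         parse_dates=['timestamp'])
    return energy.set_index('timestamp')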
# Split points: training data before 2014-09-01, validation until 2014-11-01, test afterwards
valid_start_dt = '2014-09-01 00:00:00'
test_start_dt = '2014-11-01 00:00:00'

T = 6        # number of lagged hours used as input features
HORIZON = 3  # number of steps (hours) ahead to forecast
Create a training set containing only the model features
train = energy.copy()[energy.index < valid_start_dt][['load', 'temp']]
Scale the data to be in the range (0, 1). This transformation should be calibrated on the training set only, to prevent information from the validation and test sets leaking into the training process.
from sklearn.preprocessing import MinMaxScaler

# Separate scaler for the target, so predictions can later be mapped back to load units
y_scaler = MinMaxScaler()
y_scaler.fit(train[['load']])

# Scaler for the input features, fit on the training set only
X_scaler = MinMaxScaler()
train[['load', 'temp']] = X_scaler.fit_transform(train)
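Keeping a separate y_scaler may look redundant, but it is what allows predictions, which the model produces in the scaled (0, 1) space, to be mapped back to load units at evaluation time. A minimal illustration, with made-up values:

# Made-up scaled outputs, just to illustrate the inverse mapping
example = np.array([[0.18], [0.23], [0.29]])
y_scaler.inverse_transform(example)  # returns the corresponding load values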
Use the TimeSeriesTensor convenience class (from common.utils) to:

- shift the series so that each row holds the lagged input features and the future target values for one training example
- discard any samples with incomplete lags or targets
- expose the result both as a Pandas dataframe and as numpy arrays shaped for the Keras API

As the call below shows, the class takes the input dataset, the name of the target column, the forecast horizon, and a tensor_structure dictionary mapping each input tensor name to a tuple of (relative time steps, feature columns):
tensor_structure = {'X': (range(-T+1, 1), ['load', 'temp'])}
train_inputs = TimeSeriesTensor(train, 'load', HORIZON, tensor_structure)
train_inputs.dataframe.head()
(columns are the flattened three-level index: tensor, feature, time step)

| | y t+1 | y t+2 | y t+3 | load t-5 | load t-4 | load t-3 | load t-2 | load t-1 | load t | temp t-5 | temp t-4 | temp t-3 | temp t-2 | temp t-1 | temp t |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2012-01-01 05:00:00 | 0.18 | 0.23 | 0.29 | 0.22 | 0.18 | 0.14 | 0.13 | 0.13 | 0.15 | 0.42 | 0.43 | 0.40 | 0.41 | 0.42 | 0.41 |
| 2012-01-01 06:00:00 | 0.23 | 0.29 | 0.35 | 0.18 | 0.14 | 0.13 | 0.13 | 0.15 | 0.18 | 0.43 | 0.40 | 0.41 | 0.42 | 0.41 | 0.40 |
| 2012-01-01 07:00:00 | 0.29 | 0.35 | 0.37 | 0.14 | 0.13 | 0.13 | 0.15 | 0.18 | 0.23 | 0.40 | 0.41 | 0.42 | 0.41 | 0.40 | 0.39 |
| 2012-01-01 08:00:00 | 0.35 | 0.37 | 0.37 | 0.13 | 0.13 | 0.15 | 0.18 | 0.23 | 0.29 | 0.41 | 0.42 | 0.41 | 0.40 | 0.39 | 0.39 |
| 2012-01-01 09:00:00 | 0.37 | 0.37 | 0.37 | 0.13 | 0.15 | 0.18 | 0.23 | 0.29 | 0.35 | 0.42 | 0.41 | 0.40 | 0.39 | 0.39 | 0.43 |
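Besides the dataframe view above, the class exposes the data as numpy arrays keyed by tensor name; that is how it is indexed in model.fit further down. A quick shape check (the sample count of 23368 comes from the training log below):

# TimeSeriesTensor behaves like a dict of numpy arrays shaped for Keras
print(train_inputs['X'].shape)       # (23368, 6, 2): samples, T lags, 2 features
print(train_inputs['target'].shape)  # (23368, 3):    samples, HORIZON targets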
Construct the validation set, keeping the last T-1 hours of the training period so that the first validation samples have a complete set of lagged features
# Keep T-1 hours of history before the validation split for the initial lags
look_back_dt = dt.datetime.strptime(valid_start_dt, '%Y-%m-%d %H:%M:%S') - dt.timedelta(hours=T-1)
valid = energy.copy()[(energy.index >= look_back_dt) & (energy.index < test_start_dt)][['load', 'temp']]
valid[['load', 'temp']] = X_scaler.transform(valid)  # reuse the scaler fit on the training set
valid_inputs = TimeSeriesTensor(valid, 'load', HORIZON, tensor_structure)
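A quick sanity check of the look-back arithmetic: with T = 6, the validation frame starts five hours before the split, so the sample at valid_start_dt itself is the first to have a complete set of lags.

print(look_back_dt)  # 2014-08-31 19:00:00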
We will implement an RNN forecasting model with the following structure:
Image('./images/simple_encoder_decoder.png')
from keras.models import Sequential
from keras.layers import GRU, Dense, RepeatVector, TimeDistributed, Flatten
from keras.callbacks import EarlyStopping
Using TensorFlow backend.
LATENT_DIM = 5   # dimensionality of the GRU hidden state
BATCH_SIZE = 32
EPOCHS = 50      # maximum number of epochs; early stopping usually ends training sooner
model = Sequential()
# Encoder: reads T hours of (load, temp) and summarizes them in a latent vector
model.add(GRU(LATENT_DIM, input_shape=(T, 2)))
# Repeat the latent vector once per forecast step to feed the decoder
model.add(RepeatVector(HORIZON))
# Decoder: unrolls the latent vector over the HORIZON steps
model.add(GRU(LATENT_DIM, return_sequences=True))
# One load prediction per decoder step, then flatten to (HORIZON,) per sample
model.add(TimeDistributed(Dense(1)))
model.add(Flatten())
model.compile(optimizer='RMSprop', loss='mse')
model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
gru_1 (GRU)                  (None, 5)                 120
_________________________________________________________________
repeat_vector_1 (RepeatVecto (None, 3, 5)              0
_________________________________________________________________
gru_2 (GRU)                  (None, 3, 5)              165
_________________________________________________________________
time_distributed_1 (TimeDist (None, 3, 1)              6
_________________________________________________________________
flatten_1 (Flatten)          (None, 3)                 0
=================================================================
Total params: 291
Trainable params: 291
Non-trainable params: 0
_________________________________________________________________
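The parameter counts can be reproduced by hand. A GRU layer in this Keras version has three gates, each with an input kernel, a recurrent kernel and one bias vector, giving 3 × (units × (units + inputs) + units) parameters:

units, n_inputs = LATENT_DIM, 2
gru_1 = 3 * (units * (units + n_inputs) + units)   # 3 * (5*7 + 5)  = 120
gru_2 = 3 * (units * (units + units) + units)      # 3 * (5*10 + 5) = 165
dense = units * 1 + 1                              # weights + bias =   6
print(gru_1, gru_2, dense, gru_1 + gru_2 + dense)  # 120 165 6 291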
# Stop training once the validation loss has not improved for 5 epochs
earlystop = EarlyStopping(monitor='val_loss', min_delta=0, patience=5)
model.fit(train_inputs['X'],
train_inputs['target'],
batch_size=BATCH_SIZE,
epochs=EPOCHS,
validation_data=(valid_inputs['X'], valid_inputs['target']),
callbacks=[earlystop],
verbose=1)
Train on 23368 samples, validate on 1461 samples
Epoch 1/50
23368/23368 [==============================] - 4s 185us/step - loss: 0.0217 - val_loss: 0.0053
Epoch 2/50
23368/23368 [==============================] - 3s 129us/step - loss: 0.0050 - val_loss: 0.0042
Epoch 3/50
23368/23368 [==============================] - 3s 147us/step - loss: 0.0044 - val_loss: 0.0039
Epoch 4/50
23368/23368 [==============================] - 3s 129us/step - loss: 0.0041 - val_loss: 0.0035
Epoch 5/50
23368/23368 [==============================] - 3s 134us/step - loss: 0.0039 - val_loss: 0.0033
Epoch 6/50
23368/23368 [==============================] - 3s 138us/step - loss: 0.0037 - val_loss: 0.0032
Epoch 7/50
23368/23368 [==============================] - 3s 127us/step - loss: 0.0035 - val_loss: 0.0029
Epoch 8/50
23368/23368 [==============================] - 3s 130us/step - loss: 0.0034 - val_loss: 0.0029
Epoch 9/50
23368/23368 [==============================] - 4s 156us/step - loss: 0.0033 - val_loss: 0.0037
Epoch 10/50
23368/23368 [==============================] - 3s 123us/step - loss: 0.0033 - val_loss: 0.0029
Epoch 11/50
23368/23368 [==============================] - 4s 177us/step - loss: 0.0032 - val_loss: 0.0026
Epoch 12/50
23368/23368 [==============================] - 5s 195us/step - loss: 0.0032 - val_loss: 0.0032
Epoch 13/50
23368/23368 [==============================] - 4s 167us/step - loss: 0.0032 - val_loss: 0.0027
Epoch 14/50
23368/23368 [==============================] - 3s 135us/step - loss: 0.0032 - val_loss: 0.0028
Epoch 15/50
23368/23368 [==============================] - 4s 174us/step - loss: 0.0031 - val_loss: 0.0041
Epoch 16/50
23368/23368 [==============================] - 4s 172us/step - loss: 0.0031 - val_loss: 0.0034
<keras.callbacks.History at 0x1b6efcdb710>
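Training stopped at epoch 16: the best validation loss (0.0026, at epoch 11) was not improved upon for five epochs. Note that the model keeps the weights from the final epoch, not the best one. Newer Keras versions (2.2.3 and later; whether your install qualifies is an assumption) can restore the best weights automatically:

# Assumption: Keras >= 2.2.3, which added the restore_best_weights option
earlystop = EarlyStopping(monitor='val_loss', min_delta=0, patience=5,
                          restore_best_weights=True)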
# As before, look_back_dt marks T-1 hours before the split. Note that the test
# frame below starts at test_start_dt itself, so the first T-1 hours of the
# test period are used only as lag features for the earliest samples.
look_back_dt = dt.datetime.strptime(test_start_dt, '%Y-%m-%d %H:%M:%S') - dt.timedelta(hours=T-1)
test = energy.copy()[test_start_dt:][['load', 'temp']]
test[['load', 'temp']] = X_scaler.transform(test)
test_inputs = TimeSeriesTensor(test, 'load', HORIZON, tensor_structure)
predictions = model.predict(test_inputs['X'])
predictions
array([[0.24, 0.31, 0.39],
       [0.31, 0.39, 0.46],
       [0.4 , 0.46, 0.51],
       ...,
       [0.62, 0.57, 0.51],
       [0.58, 0.54, 0.48],
       [0.54, 0.5 , 0.46]], dtype=float32)
eval_df = create_evaluation_df(predictions, test_inputs, HORIZON, y_scaler)
eval_df.head()
| | timestamp | h | prediction | actual |
|---|---|---|---|---|
| 0 | 2014-11-01 05:00:00 | t+1 | 2,748.30 | 2,714.00 |
| 1 | 2014-11-01 06:00:00 | t+1 | 2,994.21 | 2,970.00 |
| 2 | 2014-11-01 07:00:00 | t+1 | 3,267.59 | 3,189.00 |
| 3 | 2014-11-01 08:00:00 | t+1 | 3,405.73 | 3,356.00 |
| 4 | 2014-11-01 09:00:00 | t+1 | 3,537.20 | 3,436.00 |
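create_evaluation_df also comes from common.utils and its implementation is not shown in this notebook. A plausible sketch, assuming it melts the (samples, HORIZON) prediction array into long format, aligns it with the actuals, and applies the inverse scaling:

# Hypothetical sketch of create_evaluation_df; the real helper is in common.utils
def create_evaluation_df_sketch(predictions, test_inputs, H, scaler):
    # One column per forecast step, indexed by the forecast origin
    eval_df = pd.DataFrame(predictions,
                           columns=['t+' + str(t) for t in range(1, H + 1)])
    eval_df['timestamp'] = test_inputs.dataframe.index
    # Long format: one row per (origin, step) pair
    eval_df = pd.melt(eval_df, id_vars='timestamp',
                      var_name='h', value_name='prediction')
    # melt stacks all t+1 rows, then t+2, then t+3; transpose+ravel matches that order
    eval_df['actual'] = np.transpose(test_inputs['target']).ravel()
    # Map both columns back from the scaled space to load units
    eval_df[['prediction', 'actual']] = scaler.inverse_transform(
        eval_df[['prediction', 'actual']])
    return eval_df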
# Absolute percentage error of each forecast, then averaged per horizon
eval_df['APE'] = (eval_df['prediction'] - eval_df['actual']).abs() / eval_df['actual']
eval_df.groupby('h')['APE'].mean()
h
t+1   0.02
t+2   0.04
t+3   0.06
Name: APE, dtype: float64
mape(eval_df['prediction'], eval_df['actual'])
0.04260425015099984
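mape (also from common.utils) is presumably just the mean of these absolute percentage errors over all horizons; a one-line sketch consistent with the APE computed above:

# Hypothetical sketch of the mape helper from common.utils
def mape_sketch(predictions, actuals):
    # Mean absolute percentage error over all forecasts
    return ((predictions - actuals).abs() / actuals).mean()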
Plot the actuals against the predictions at each horizon for the first week of the test period. As expected, predictions one step ahead (t+1) are more accurate than those two or three steps ahead.
# Collect the actuals and the t+1..t+3 predictions for the first test week
plot_df = eval_df[(eval_df.timestamp < '2014-11-08') & (eval_df.h == 't+1')][['timestamp', 'actual']]
for t in range(1, HORIZON+1):
    plot_df['t+'+str(t)] = eval_df[(eval_df.timestamp < '2014-11-08') & (eval_df.h == 't+'+str(t))]['prediction'].values
fig = plt.figure(figsize=(15, 8))
ax = fig.add_subplot(111)
ax.plot(plot_df['timestamp'], plot_df['actual'], color='red', linewidth=4.0, label='actual')
ax.plot(plot_df['timestamp'], plot_df['t+1'], color='blue', linewidth=4.0, alpha=0.75, label='t+1')
ax.plot(plot_df['timestamp'], plot_df['t+2'], color='blue', linewidth=3.0, alpha=0.5, label='t+2')
ax.plot(plot_df['timestamp'], plot_df['t+3'], color='blue', linewidth=2.0, alpha=0.25, label='t+3')
plt.xlabel('timestamp', fontsize=12)
plt.ylabel('load', fontsize=12)
ax.legend(loc='best')
plt.show()