The objective of the case study behind the paper was to validate different methods and approaches related to sales time series data preparation, analysis and forecasting, with aim to facilitate recommending sales and marketing strategies based on trend/seasonality effects and forecasting sales of eight different groups of pharmaceutical products with diverse characteristics, such as stationarity, seasonality, amount of residuals and sales data variance. All these analyses and forecasts are made on a small scale, for a single distributor, pharmacy chain or even individual pharmacy.
On a larger scale, the sales forecasting in pharmaceutical industry is typically done by using Naïve model, where the forecasted values equal values in the previous period with added factor of growth, which is specifically defined for different regions, markets, categories of products, etc. Although this model fails when the market saturates, in general and on a larger scale, it has proven as successful. Still, analysis and forecasts on a smaller scale, such as single distributor, pharmacy chain or even individual pharmacy, smaller periods such as weeks, etc., guide very important decisions related to resource and procurement planning, what-if analyses, return-on-investment forecasting, business planning and others. The main problem in smaller scale time series analyses and forecasts are significant uncertainties and sales performance very close to random, making the forecasts with accuracies above thresholds as defined by Naïve methods difficult to achieve.
The main research question we tackle is related to exploring the feasibility of use of modern time-series forecasting methods in pharmaceutical products sales forecasting on a smaller scale. In specific, we benchmark the accuracies achieved with those methods against the performances of basic Naïve, Seasonal Naïve and Average methods.
Research work behind the paper considers 8 time series with different statistical features. Each of the time-series summarizes sales of a group of pharmaceutical products. Time-series data are collected from the Point-of-Sale system of a single pharmacy in period of 6 years.
This paper is structured into 4 main parts. First, short theoretical background for time series analysis and forecasting is provided to inform the reader on the credibility of decisions made in the implementation of this case study. Then, research methodology, actually a problem-neutral time series forecasting pipeline is presented. Next, the actual implementation is presented, by highlighting the steps made in following the proposed methodology in the case of pharmaceutical products sales data analysis and forecasting. Finally, the discussion brings the description of actual results and some suggestions to the sales department, driven by the result of the data analysis.
The methodology for implementing this case study follows the typical time series forecasting pipeline, consisting of three major phases:
Based on the problem and objective formal definition, the data acquired from the sales information system are cleaned, feature engineering approach was defined, and all data are transformed to hourly time series, consisting of aggregate sales among different classes of pharmaceutical products in hourly time periods, namely: anti-inflammatory and antirheumatic products (M01AB, M01AE), analgesics and antipyretics (N02BA, N02BE), psycholeptics drugs (N05B, N05C), drugs for obstructive airway diseases (R03) and antihistamines for systemic use (R06). This, intermediary series is used for the formal definition of anomalies and their identification. Also, outliers are detected in consultation with pharmacy staff and treated by first, imputing the missing data and then, by imputing representative data, by using several methods. Finally, data is then rescaled to weekly time-series and stored.
Time series analysis had two-fold objective. First, annual, weekly and daily data analysis were done with objective to make potentially useful conclusions and propositions for improving sales and marketing strategies. Then, stationarity, autocorrelation and predictability analysis of the time series in individual groups was analyzed to infer the initial set of parameters for implementing the forecasting methods.
Forecasting was carried out at the weekly scale. Two different approaches to forecasting problem scoping were adopted. First one implements so called rolling forecast, namely forecasting the sales in the next week, by using the model trained with all historical data. Therefore, during testing, prediction in a timestep t is based on the model which fits the training set consisting of observations in timesteps (0,t-1), or: f(t) = f(o[0:t-1]). Rolling forecast model can be used for short-term resource planning and planning the procurement of stock of pharmaceutical products. Another approach is related to long-term forecasting, for example forecasting the future period of one year, by using the model trained with historical data. This model can be used for business planning and making decisions of strategic nature.
Train-test split validation with one last year of data (52 rows) was used for testing. Key performance indicator for forecasting accuracies in both approaches was Mean Squared Error (MSE). Mean Absolute Percentage Error (MAPE) was provided only as an illustration because data on different groups of pharmaceutical products were on significantly diverse scales. Baseline accuracy was calculated by using Naïve and Seasonal Naïve, for rolling forecasts and Average method for long-term ones. Three different models were tested: ARIMA/SARIMA (for rolling and long-term forecast), Facebook’s Prophet (for rolling and long-term forecast) and Long-Short Term Memory (LSTM) artificial neural network architectures (for long-term forecast).
Hyper-parameters were optimized by using three approaches: manually – with ACF/PACF plot analysis, Python’s statsmodels function and grid search optimization. Grid search was used as adopted approach for hyper-parameters optimization, for ARIMA and Prophet model. LSTM was applied only for long-term forecasting. The data preparation process for LSTM included transforming to stationary time series, sequencing time series to supervised problem data shape [Xt-n_steps...Xt-2,Xt-1][yt] (after determining input vector dimension which gives best accuracies), and time series scaling (normalization or standardization). Three LSTM architectures were tested: Vanilla LSTM, Stacked LSTM and Bidirectional LSTM. No optimization of hyper-parameters was carried out. In order to get comparable results, pseudo-random generators of Python, Numpy and Tensorflow were set to fixed values.
Solution will be using the set of Python libraries, namely numpy for computational tasks, pandas for data structures, matplotlib for plotting and others (libraries used for forecasting with LSTM are imported later). All warnings will be supressed for the cleaner presentation of this notebook (this is a bad practice, note that all presented code was tested). Also info messages are supressed, for the same reason.
Finally, pyramid-arima package will be installed, in order to use Auto-ARIMA method.
import warnings
import numpy as np
from numpy import array
import pandas as pd
from pandas import concat
import math
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error
from sklearn.metrics import mean_absolute_error
from statsmodels.tsa.arima_model import ARIMA
from sklearn.model_selection import ParameterGrid
pd.plotting.register_matplotlib_converters()
warnings.filterwarnings("ignore")
import logging
logger = logging.getLogger()
logger.setLevel(logging.CRITICAL)
def mean_absolute_percentage_error(y_true, y_pred):
y_true, y_pred = np.array(y_true), np.array(y_pred)
return np.mean(np.abs((y_true - y_pred) / y_true)) * 100
!pip install pyramid-arima
Collecting pyramid-arima Downloading https://files.pythonhosted.org/packages/5c/84/57422f2a6ade3161c586545e38b518ba1b7ab30ee4a4acc29110c0aba2bc/pyramid_arima-0.9.0-cp36-cp36m-manylinux1_x86_64.whl (597kB) |████████████████████████████████| 604kB 2.8MB/s Requirement already satisfied: scipy>=0.9 in /opt/conda/lib/python3.6/site-packages (from pyramid-arima) (1.3.3) Requirement already satisfied: statsmodels>=0.9.0 in /opt/conda/lib/python3.6/site-packages (from pyramid-arima) (0.10.1) Requirement already satisfied: pandas>=0.19 in /opt/conda/lib/python3.6/site-packages (from pyramid-arima) (0.25.3) Requirement already satisfied: Cython>=0.23 in /opt/conda/lib/python3.6/site-packages (from pyramid-arima) (0.29.14) Requirement already satisfied: scikit-learn>=0.17 in /opt/conda/lib/python3.6/site-packages (from pyramid-arima) (0.21.3) Requirement already satisfied: numpy>=1.10 in /opt/conda/lib/python3.6/site-packages (from pyramid-arima) (1.17.4) Requirement already satisfied: patsy>=0.4.0 in /opt/conda/lib/python3.6/site-packages (from statsmodels>=0.9.0->pyramid-arima) (0.5.1) Requirement already satisfied: python-dateutil>=2.6.1 in /opt/conda/lib/python3.6/site-packages (from pandas>=0.19->pyramid-arima) (2.8.0) Requirement already satisfied: pytz>=2017.2 in /opt/conda/lib/python3.6/site-packages (from pandas>=0.19->pyramid-arima) (2019.3) Requirement already satisfied: joblib>=0.11 in /opt/conda/lib/python3.6/site-packages (from scikit-learn>=0.17->pyramid-arima) (0.14.0) Requirement already satisfied: six in /opt/conda/lib/python3.6/site-packages (from patsy>=0.4.0->statsmodels>=0.9.0->pyramid-arima) (1.13.0) Installing collected packages: pyramid-arima Successfully installed pyramid-arima-0.9.0
Time Series Analysis included seasonality, stationarity, autocorrelation, regularity and data distribution analysis.
The seasonality patterns can be explored in detail by using boxplots. Seasonality is clearly confirmed for the categories of R03, R06 and N02BE. Some additional conclusions: R03 and N05C has more outliers that the others, indicating that their sales is more difficult to predict.
import seaborn as sns
dfatc_daily = pd.read_csv('../input/salesdaily.csv')
fig, axes = plt.subplots(8, 1, figsize=(10, 30), sharex=True)
for name, ax in zip(['M01AB','M01AE','N02BA','N02BE', 'N05B','N05C','R03','R06'], axes):
sns.boxplot(data=dfatc_daily, x='Month', y=name, ax=ax)
Below, boxplots on a weekly scale are shown, for the purpose of exploring the weakly seasonality. Some weekly seasonality is visible.
fig, axes = plt.subplots(8, 1, figsize=(10, 30), sharex=True)
for name, ax in zip(['M01AB','M01AE','N02BA','N02BE', 'N05B','N05C','R03','R06'], axes):
sns.boxplot(data=dfatc_daily, x='Weekday Name', y=name, ax=ax)
Another visualization that can be useful for discovering seasonality patterns is related to rolling window means. Analysis below shows 30-day and 365-day rolling mean and 30-day rolling standard deviation of sales data.
dfatc_daily=pd.read_csv('/content/salesdaily.csv')
cols_plot = ['M01AB','M01AE','N02BA','N02BE', 'N05B','N05C','R03','R06']
dfatc_365d = dfatc_daily[cols_plot].rolling(window=365, center=True).mean()
dfatc_30d = dfatc_daily[cols_plot].rolling(30, center=True).mean()
dfatc_std = dfatc_daily[cols_plot].rolling(30, center=True).std()
subplotindex=0
numrows=4
numcols=2
fig, ax = plt.subplots(numrows, numcols, figsize=(18, 12))
plt.subplots_adjust(wspace=0.1, hspace=0.3)
for x in cols_plot:
rowindex=math.floor(subplotindex/numcols)
colindex=subplotindex-(rowindex*numcols)
ax[rowindex,colindex].plot(dfatc_daily.loc[:,x], linewidth=0.5, label='Daily sales')
ax[rowindex,colindex].plot(dfatc_30d.loc[:,x], label='30-d Rolling Mean')
ax[rowindex,colindex].plot(dfatc_365d.loc[:,x], color='0.2', linewidth=3, label='365-d Rolling Mean')
ax[rowindex,colindex].plot(dfatc_std.loc[:,x], color='0.5', linewidth=3, label='30-d Rolling Std')
ax[rowindex,colindex].set_ylabel('Sales')
ax[rowindex,colindex].legend()
ax[rowindex,colindex].set_title('Trends in '+x+' drugs sales');
subplotindex=subplotindex+1
plt.show()
Image below shows trends for each of the drug categories, represented by the 365-d rolling means for each of those categories.
fig, ax = plt.subplots(figsize=(16,9))
for nm in cols_plot:
ax.plot(dfatc_365d[nm], label=nm, marker='.', linestyle='-', linewidth=0.5)
ax.legend()
ax.set_ylabel('Drug sales')
ax.set_title('Trends in Drug Sales for different groups (365-d Rolling Means)');
Trends and seasonality can be explored in time series decomposition view, based on 30d rolling means.
from statsmodels.tsa.seasonal import seasonal_decompose
result = seasonal_decompose(dfatc_daily['M01AB'].rolling(30, center=True).mean().dropna(), freq=365, filt=None)
plt.rcParams["figure.figsize"] = (16,9)
result.plot()
plt.show()
Function seasonal_decompose can be used for analysis of the portions of each component of time series. This is especially useful when determining uptake of residuals in data, based on the decomposed data. The volume of this uptake implies the predictability of the time series - higher the residuals, lower the predictability. To some extent, the proportion of the residuals when comparing with trend and seasonality can be also illustrated by the rolling means and standard deviation plots above.
df = pd.read_csv('/content/salesweekly.csv')
for x in ['M01AB','M01AE','N02BA','N02BE','N05B','N05C','R03','R06']:
result = seasonal_decompose(df[x], freq=52, model='additive')
dfs = pd.concat([result.trend, result.seasonal, result.resid, result.observed], axis=1)
dfs.columns = ['trend', 'seasonal', 'residuals', 'observed']
dfs=dfs.dropna()
res=dfs['residuals'].values
obs=dfs['observed'].values
resmean=np.mean(np.abs(res))
obsmean=np.mean(np.abs(obs))
perc=resmean*100/obsmean
print(x+' RESMEAN:'+str(resmean)+', OBSMEAN:'+str(obsmean)+', PERC:'+str(perc)+'%')
M01AB RESMEAN:5.26715996284115, OBSMEAN:35.59490833332001, PERC:14.797509558159527% M01AE RESMEAN:4.319542609675869, OBSMEAN:28.00801458336, PERC:15.422523423856601% N02BA RESMEAN:3.922838959252166, OBSMEAN:27.083016, PERC:14.484498178682042% N02BE RESMEAN:29.53435723696367, OBSMEAN:217.6597028336, PERC:13.569051529737028% N05B RESMEAN:12.94840305932125, OBSMEAN:61.96614999972, PERC:20.895929566997072% N05C RESMEAN:2.0384606385595405, OBSMEAN:3.871833333332, PERC:52.648460382081936% R03 RESMEAN:11.722244335544508, OBSMEAN:40.06845833336, PERC:29.25554119906046% R06 RESMEAN:4.278758416393868, OBSMEAN:19.744589999960002, PERC:21.67053565762842%
Stationarity of time-series is the property of exhibiting constant statistical properties over time (for example, mean, variance, autocorrelation). It can be visually determined by plotting rolling statistics (rolling means and variances). In stationary time series, the mean of the series, variance of the series and covariance of the i th term and the (i + m) th term should not be a function of time.
We can use Augmented Dickey-Fuller (ADF) test to check stationarity of the data. Possible values of regression parameters of ADF are:
df = pd.read_csv('/content/salesweekly.csv')
from statsmodels.tsa.stattools import adfuller
for x in ['M01AB','M01AE','N02BA','N02BE', 'N05B','N05C','R03','R06']:
dftest = adfuller(df[x], regression='ct', autolag='AIC')
print("ADF test for "+x)
print("-----------------------------")
print("Test statistic = {:.3f}".format(dftest[0]))
print("P-value = {:.3f}".format(dftest[1]))
print("Critical values :")
for k, v in dftest[4].items():
print("\t{}: {} - The data is {} stationary with {}% confidence".format(k, v, "not" if v<dftest[0] else "", 100-int(k[:-1])))
ADF test for M01AB ----------------------------- Test statistic = -3.700 P-value = 0.022 Critical values : 1%: -3.9897903649837616 - The data is not stationary with 99% confidence 5%: -3.425478311521626 - The data is stationary with 95% confidence 10%: -3.1358607194990724 - The data is stationary with 90% confidence ADF test for M01AE ----------------------------- Test statistic = -8.717 P-value = 0.000 Critical values : 1%: -3.98926783537037 - The data is stationary with 99% confidence 5%: -3.425226747185185 - The data is stationary with 95% confidence 10%: -3.1357131066666666 - The data is stationary with 90% confidence ADF test for N02BA ----------------------------- Test statistic = -2.670 P-value = 0.249 Critical values : 1%: -3.990441532957606 - The data is not stationary with 99% confidence 5%: -3.425791763669738 - The data is not stationary with 95% confidence 10%: -3.1360446341572343 - The data is not stationary with 90% confidence ADF test for N02BE ----------------------------- Test statistic = -4.362 P-value = 0.003 Critical values : 1%: -3.98926783537037 - The data is stationary with 99% confidence 5%: -3.425226747185185 - The data is stationary with 95% confidence 10%: -3.1357131066666666 - The data is stationary with 90% confidence ADF test for N05B ----------------------------- Test statistic = -3.782 P-value = 0.018 Critical values : 1%: -3.9895792190177497 - The data is not stationary with 99% confidence 5%: -3.4253766620686186 - The data is stationary with 95% confidence 10%: -3.135801074760346 - The data is stationary with 90% confidence ADF test for N05C ----------------------------- Test statistic = -14.974 P-value = 0.000 Critical values : 1%: -3.9891654349033057 - The data is stationary with 99% confidence 5%: -3.4251774443348975 - The data is stationary with 95% confidence 10%: -3.1356841756951854 - The data is stationary with 90% confidence ADF test for R03 ----------------------------- Test statistic = -3.806 P-value = 0.016 Critical values : 1%: -3.9898970270434053 - The data is not stationary with 99% confidence 5%: -3.4255296586298916 - The data is stationary with 95% confidence 10%: -3.1358908478232332 - The data is stationary with 90% confidence ADF test for R06 ----------------------------- Test statistic = -5.258 P-value = 0.000 Critical values : 1%: -3.9906647072202452 - The data is stationary with 99% confidence 5%: -3.4258991818227846 - The data is stationary with 95% confidence 10%: -3.1361076573601 - The data is stationary with 90% confidence
Augmented Dickey-Fuller (ADF) test have shown that all data, but N02BA (P-value=0.249) in the series were stationary, with maximum confidence.
from statsmodels.tsa.stattools import kpss
warnings.filterwarnings("ignore")
df = pd.read_csv('/content/salesweekly.csv')
for x in ['M01AB','M01AE','N02BA','N02BE', 'N05B','N05C','R03','R06']:
print(" > Is "+x+" data stationary ?")
dftest = kpss(np.log(df[x]), 'ct')
print("Test statistic = {:.3f}".format(dftest[0]))
print("P-value = {:.3f}".format(dftest[1]))
print("Critical values :")
for k, v in dftest[3].items():
print("\t{}: {}".format(k, v))
> Is M01AB data stationary ? Test statistic = 0.285 P-value = 0.010 Critical values : 10%: 0.119 5%: 0.146 2.5%: 0.176 1%: 0.216 > Is M01AE data stationary ? Test statistic = 0.242 P-value = 0.010 Critical values : 10%: 0.119 5%: 0.146 2.5%: 0.176 1%: 0.216 > Is N02BA data stationary ? Test statistic = 0.147 P-value = 0.049 Critical values : 10%: 0.119 5%: 0.146 2.5%: 0.176 1%: 0.216 > Is N02BE data stationary ? Test statistic = 0.076 P-value = 0.100 Critical values : 10%: 0.119 5%: 0.146 2.5%: 0.176 1%: 0.216 > Is N05B data stationary ? Test statistic = 0.143 P-value = 0.056 Critical values : 10%: 0.119 5%: 0.146 2.5%: 0.176 1%: 0.216 > Is N05C data stationary ? Test statistic = nan P-value = nan Critical values : 10%: 0.119 5%: 0.146 2.5%: 0.176 1%: 0.216 > Is R03 data stationary ? Test statistic = 0.046 P-value = 0.100 Critical values : 10%: 0.119 5%: 0.146 2.5%: 0.176 1%: 0.216 > Is R06 data stationary ? Test statistic = 0.027 P-value = 0.100 Critical values : 10%: 0.119 5%: 0.146 2.5%: 0.176 1%: 0.216
Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test found the trend non-stationarity in N02BE, R03 and R06.
For calculating regularity and predictability of time series, Approximate Entropy test was used. For all series, entropy values were higher than 1 indicating low predictability, with highest values for M01AE, M01AB and N02BA.
df = pd.read_csv('/content/salesweekly.csv')
def ApEn(U, m, r):
def _maxdist(x_i, x_j):
return max([abs(ua - va) for ua, va in zip(x_i, x_j)])
def _phi(m):
x = [[U[j] for j in range(i, i + m - 1 + 1)] for i in range(N - m + 1)]
C = [len([1 for x_j in x if _maxdist(x_i, x_j) <= r]) / (N - m + 1.0) for x_i in x]
return (N - m + 1.0)**(-1) * sum(np.log(C))
N = len(U)
return abs(_phi(m+1) - _phi(m))
for x in ['M01AB','M01AE','N02BA','N02BE','N05B','N05C','R03','R06']:
print(x + ': ' + str(ApEn(df[x].values, m=2, r=0.2*np.std(df[x].values))))
M01AB: 1.141130089570642 M01AE: 1.166363924596575 N02BA: 1.1370638730125302 N02BE: 1.058024809082593 N05B: 1.074437415034502 N05C: 1.0361887401424648 R03: 1.1847216239035152 R06: 1.031759595747876
Autocorrelation analysis illustrates the potential for time series data prediction. Autocorrelation plots graphically summarize the strength of a relationship with an observation in a time series with observations at prior time steps. Pearson coefficient is used to measure autocorrelation. Thus, the following analysis is relevant only for data with normal Gaussian distribution.
A plot of the autocorrelation of a time series by lag is called the AutoCorrelation Function (ACF). This plot is sometimes called a correlogram or an autocorrelation plot. Plot shows the lag value along the x-axis and the correlation on the y-axis between -1 and 1. Confidence intervals are drawn as a cone. By default, this is set to a 95% confidence interval, suggesting that correlation values outside of this code are very likely a correlation.
from statsmodels.graphics.tsaplots import plot_acf
df = pd.read_csv('/content/salesweekly.csv')
subplotindex=0
numrows=4
numcols=2
fig, ax = plt.subplots(numrows, numcols, figsize=(18,12))
plt.subplots_adjust(wspace=0.1, hspace=0.3)
with plt.rc_context():
plt.rc("figure", figsize=(18,12))
for x in ['M01AB','M01AE','N02BA','N02BE', 'N05B','N05C','R03','R06']:
rowindex=math.floor(subplotindex/numcols)
colindex=subplotindex-(rowindex*numcols)
plot_acf(df[x], lags=300, title=x, ax=ax[rowindex,colindex])
subplotindex=subplotindex+1
In general, the "partial" correlation between two variables is the amount of correlation between them which is not explained by their mutual correlations with a specified set of other variables. For example, if we are regressing a variable Y on other variables X1, X2, and X3, the partial correlation between Y and X3 is the amount of correlation between Y and X3 that is not explained by their common correlations with X1 and X2.
from statsmodels.graphics.tsaplots import plot_pacf
from statsmodels.graphics.tsaplots import plot_acf
df = pd.read_csv('/content/salesweekly.csv')
subplotindex=0
numrows=4
numcols=2
fig, ax = plt.subplots(numrows, numcols, figsize=(18,12))
plt.subplots_adjust(wspace=0.1, hspace=0.3)
with plt.rc_context():
plt.rc("figure", figsize=(14,6))
for x in ['M01AB','M01AE','N02BA','N02BE', 'N05B','N05C','R03','R06']:
rowindex=math.floor(subplotindex/numcols)
colindex=subplotindex-(rowindex*numcols)
plot_pacf(df[x], lags=100, title=x, ax=ax[rowindex,colindex])
subplotindex=subplotindex+1
Minor autocorrelation is observed at ACF (Auto-Correlation Function) and PACF (Partial Auto-Correlation Function) plots for all series, with exception of N05C sales. N02BE, R03 and R06 series were found to exhibit annual seasonality.
Chart with daily sales for different categories of interest is shown below. N02BE and N05B charts, though showing the similar trends, are suppresed because of the larger scale which makes the other illustrations less readable.
df = pd.read_csv('/content/saleshourly.csv')
dfatch['datum']= pd.to_datetime(dfatch['datum'])
grp1=dfatch.groupby(dfatch.datum.dt.hour)['M01AB'].mean()
grp2=dfatch.groupby(dfatch.datum.dt.hour)['M01AE'].mean()
grp3=dfatch.groupby(dfatch.datum.dt.hour)['N02BA'].mean()
grp6=dfatch.groupby(dfatch.datum.dt.hour)['N05C'].mean()
grp7=dfatch.groupby(dfatch.datum.dt.hour)['R03'].mean()
grp8=dfatch.groupby(dfatch.datum.dt.hour)['R06'].mean()
plt.title('Daily average sales')
plt.xlabel('Time of day')
plt.ylabel('Quantity of sale')
grp1.plot(figsize=(8,6))
grp2.plot(figsize=(8,6))
grp3.plot(figsize=(8,6))
grp6.plot(figsize=(8,6))
grp7.plot(figsize=(8,6))
grp8.plot(figsize=(8,6))
plt.legend(['M01AB', 'M01AE', 'N02BA', 'N05C', 'R03', 'R06'], loc='upper left')
plt.show()
Forecasting models were fitted with weekly time-series data with dataset of 302 rows. Three forecasting methods were tested: ARIMA, Prophet and LSTM. Train-test split method was used (52 weeks of test data). MSE was used as metrics for comparing the performance and also as a loss function for LSTM. Mean Absolute Percentage Error (MAPE) is also measured for each of the scenarios. In order to define the baseline forecasting accuracy to improve from, three tests were performed. Average method was used as a baseline for long-term forecasting, while Naïve and Seasonal Naïve were used for rolling forecasts.
Forecasting results are validated by using two approaches: short-term and long-term forecasts. For short-term forecasts (or so-called rolling-forecast) validation or walk-forward model validation, train-test split validation is performed in iterative fashion, where observations are added to the training dataset after every individual weekly sales prediction, while the model is fitted in each iteration. Therefore, during testing, prediction in a timestep t is based on the model which fits the training set consisting of observations in timesteps (0,t-1), or: f(t) = f(o[0:t-1]). Long-term forecasts validation is based on MSE of one-year forecast, compared to actual observations.
All MSE metrics will be stored in a dataframe, which will be later use for presentation of the overall results and comparison. Rolling forecasting will be carried out by using 5 methods (Naive, Seasonal Naive, ARIMA, AutoARIMA, Prophet). Longterm forecasting will be carried out by using 7 methods (Average, ARIMA, AutoARIMA, Prophet, Vanilla LSTM, Stacked LSTM, Bidirectional LSTM).
resultsRolling={'M01AB':[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0],
'M01AE':[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0],
'N02BA':[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0],
'N02BE':[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0],
'N05B':[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0],
'N05C':[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0],
'R03':[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0],
'R06':[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]}
resultsRollingdf = pd.DataFrame(resultsRolling)
resultsRollingdf.index = ['Naive MSE', 'Naive MAPE', 'Seasonal Naive MSE', 'Seasonal Naive MAPE',
'ARIMA MSE', 'ARIMA MAPE', 'AutoARIMA MSE', 'AutoARIMA MAPE',
'Prophet MSE', 'Prophet MAPE']
resultsLongterm={'M01AB':[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0],
'M01AE':[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0],
'N02BA':[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0],
'N02BE':[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0],
'N05B':[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0],
'N05C':[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0],
'R03':[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0],
'R06':[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]}
resultsLongtermdf = pd.DataFrame(resultsLongterm)
resultsLongtermdf.index = ['Average MSE', 'Average MAPE', 'ARIMA MSE', 'ARIMA MAPE',
'AutoARIMA MSE', 'AutoARIMA MAPE', 'Prophet MSE', 'Prophet MAPE',
'Vanilla LSTM MSE', 'Vanilla LSTM MAPE', 'Stacked LSTM MSE', 'Stacked LSTM MAPE',
'Bidirectional LSTM MSE', 'Bidirectional LSTM MAPE']
In this subsection, three summaries of the reference benchmarks, namely Naïve methods are provided. First, Naïve forecasting was done and results presented. Second, average method was used to forecast. Finally, seasonal Naïve forecast was carried out for the series that has been found as seasonal: N02BE, R03 and R06. See Theoretical background section for explanation of those methods.
df = pd.read_csv('/content/salesweekly.csv')
subplotindex=0
numrows=4
numcols=2
fig, ax = plt.subplots(numrows, numcols, figsize=(18,15))
plt.subplots_adjust(wspace=0.1, hspace=0.3)
for x in ['M01AB','M01AE','N02BA','N02BE','N05B','N05C','R03','R06']:
rowindex=math.floor(subplotindex/numcols)
colindex=subplotindex-(rowindex*numcols)
ds=df[x]
dataframe = concat([ds.shift(1), ds], axis=1)
dataframe.columns = ['t+1', 't-1']
size = len(dataframe)-50
X=dataframe['t-1']
Y=dataframe['t+1']
test, predictions = X[size:len(X)], Y[size:len(Y)]
error = mean_squared_error(test, predictions)
perror = mean_absolute_percentage_error(test, predictions)
resultsRollingdf.loc['Naive MSE',x]=error
resultsRollingdf.loc['Naive MAPE',x]=perror
ax[rowindex,colindex].set_title(x+' (MSE=' + str(round(error,2))+', MAPE='+ str(round(perror,2)) +'%)')
ax[rowindex,colindex].legend(['Real', 'Predicted'], loc='upper left')
ax[rowindex,colindex].plot(test)
ax[rowindex,colindex].plot(predictions, color='red')
subplotindex=subplotindex+1
plt.show()
df = pd.read_csv('/content/salesweekly.csv')
subplotindex=0
numrows=4
numcols=2
fig, ax = plt.subplots(numrows, numcols, figsize=(18,15))
plt.subplots_adjust(wspace=0.1, hspace=0.3)
for x in ['M01AB','M01AE','N02BA','N02BE','N05B','N05C','R03','R06']:
rowindex=math.floor(subplotindex/numcols)
colindex=subplotindex-(rowindex*numcols)
X=df[x].values
size = len(X)-50
test = X[size:len(X)]
mean = np.mean(X[0:size])
predictions = np.full(50,mean)
error = mean_squared_error(test, predictions)
perror = mean_absolute_percentage_error(test, predictions)
resultsLongtermdf.loc['Average MSE',x]=error
resultsLongtermdf.loc['Average MAPE',x]=perror
ax[rowindex,colindex].set_title(x+' (MSE=' + str(round(error,2))+', MAPE='+ str(round(perror,2)) +'%)')
ax[rowindex,colindex].legend(['Real', 'Predicted'], loc='upper left')
ax[rowindex,colindex].plot(test)
ax[rowindex,colindex].plot(predictions, color='red')
subplotindex=subplotindex+1
plt.show()
df = pd.read_csv('/content/salesweekly.csv')
subplotindex=0
numrows=1
numcols=3
fig, ax = plt.subplots(numrows, numcols, figsize=(18,4))
plt.subplots_adjust(wspace=0.1, hspace=0.3)
for x in ['N02BE','R03','R06']:
rowindex=math.floor(subplotindex/numcols)
colindex=subplotindex-(rowindex*numcols)
X=df[x].values
size = len(X)-52
test = X[size:len(X)]
train = X[0:size]
predictions=list()
history = [x for x in train]
for i in range(len(test)):
obs=list()
for y in range(1,5):
obs.append(train[-(y*52)+i])
yhat = np.mean(obs)
predictions.append(yhat)
history.append(test[i])
error = mean_squared_error(test, predictions)
perror = mean_absolute_percentage_error(test, predictions)
resultsRollingdf.loc['Seasonal Naive MSE',x]=error
resultsRollingdf.loc['Seasonal Naive MAPE',x]=perror
ax[colindex].set_title(x+' (MSE=' + str(round(error,2))+', MAPE='+ str(round(perror,2)) +'%)')
ax[colindex].legend(['Real', 'Predicted'], loc='upper left')
ax[colindex].plot(test)
ax[colindex].plot(predictions, color='red')
subplotindex=subplotindex+1
plt.show()
ARIMA method was used to to carry out short-term (rolling forecast) and long-term forecasting based on test data. Before each forecast is made, the process of optimizing hyper-parameters (p,d,q) of ARIMA model was carry out. Then, with optimal set of parameters, rolling forecast and long-term forecasting was carried out. Seasonal ARIMA (SARIMA) method was tested by using Auto-ARIMA implementation. This implementation has built-in stepwise optimizer which chooses optimal set of parameters (p,d,q for non-seasonal and p,d,q,P,D,Q for seasonal series, namely N02BE, R03 and R06).
First, method arma_order_select_ic was used to determine initial p and q parameters. The method computes Akaike’s Information Criterion (AIC) for many ARIMA models and chooses the best configuration. It is important to note that AIC tend not to be a good guide to selecting the appropriate order of differencing (d) of a model, but only for selecting the values of p and q.
import statsmodels.api as sm
df = pd.read_csv('/content/salesweekly.csv')
warnings.filterwarnings("ignore")
for x in ['M01AB','M01AE','N02BA','N02BE', 'N05B','N05C','R03','R06']:
resDiff = sm.tsa.arma_order_select_ic(df[x], max_ar=5, max_ma=5, ic='aic', trend='c')
print('ARMA(p,q,'+x+') =',resDiff['aic_min_order'],'is the best.')
ARMA(p,q,M01AB) = (3, 3) is the best. ARMA(p,q,M01AE) = (2, 0) is the best. ARMA(p,q,N02BA) = (5, 4) is the best. ARMA(p,q,N02BE) = (2, 2) is the best. ARMA(p,q,N05B) = (4, 3) is the best. ARMA(p,q,N05C) = (2, 3) is the best. ARMA(p,q,R03) = (3, 3) is the best. ARMA(p,q,R06) = (2, 2) is the best.
However, AIC is not used to score accuracy of the forecasting methods in this research. Mean squared error is used instead. For that, reason, grid search optimization method was applied, where different combinations of the hyper-parameters were used to calculate MSE and then, the combination producing the least MSE was chosen as optimal. Grid search optimization is carried out for both approaches in forecasting: rolling and long-term.
Grid search optimization for rolling forecast produced the following best combinations of the hyper-parameters:
def evaluate_arima_model(X, arima_order):
train_size = int(len(X) * 0.66)
train, test = X[0:train_size], X[train_size:]
history = [x for x in train]
predictions = list()
for t in range(len(test)):
model = ARIMA(history, order=arima_order)
model_fit = model.fit(disp=0)
yhat = model_fit.forecast()[0]
predictions.append(yhat)
history.append(test[t])
error = mean_squared_error(test, predictions)
return error
def evaluate_models(f, dataset, p_values, d_values, q_values):
best_score, best_cfg = float("inf"), None
for p in p_values:
for d in d_values:
for q in q_values:
order = (p,d,q)
try:
mse = evaluate_arima_model(dataset, order)
if mse < best_score:
best_score, best_cfg = mse, order
except:
continue
print(f+' - Best ARIMA%s MSE=%.3f' % (best_cfg, best_score))
p_values = range(0, 6)
d_values = range(0, 2)
q_values = range(0, 6)
warnings.filterwarnings("ignore")
df = pd.read_csv('/content/salesweekly.csv')
for f in ['M01AB','M01AE','N02BA','N02BE','N05B','N05C','R03','R06']:
evaluate_models(f, df[f].values, p_values, d_values, q_values)
M01AB - Best ARIMA(0, 0, 0) MSE=61.971 M01AE - Best ARIMA(2, 0, 0) MSE=54.293 N02BA - Best ARIMA(5, 1, 1) MSE=29.998 N02BE - Best ARIMA(2, 0, 0) MSE=2080.523 N05B - Best ARIMA(0, 0, 5) MSE=195.096 N05C - Best ARIMA(0, 0, 1) MSE=11.383 R03 - Best ARIMA(5, 1, 1) MSE=507.120 R06 - Best ARIMA(1, 0, 1) MSE=67.824
Grid search optimization for long-term forecast produced the following best combinations of the hyper-parameters:
def evaluate_arima_model(X, arima_order):
train_size = int(len(X) - 50)
train, test = X[0:train_size], X[train_size:]
model = ARIMA(train, order=arima_order)
model_fit = model.fit()
forecast = model_fit.predict(1,len(test))
error = mean_squared_error(test, forecast)
return error
def evaluate_models(f, dataset, p_values, d_values, q_values):
best_score, best_cfg = float("inf"), None
for p in p_values:
for d in d_values:
for q in q_values:
order = (p,d,q)
try:
mse = evaluate_arima_model(dataset, order)
if mse < best_score:
best_score, best_cfg = mse, order
except:
continue
print(f+' - Best ARIMA%s MSE=%.3f' % (best_cfg, best_score))
p_values = range(0, 6)
d_values = range(0, 2)
q_values = range(0, 6)
warnings.filterwarnings("ignore")
for f in ['M01AB','M01AE','N02BA','N02BE','N05B','N05C','R03','R06']:
evaluate_models(f, df[f].values, p_values, d_values, q_values)
M01AB - Best ARIMA(0, 0, 0) MSE=75.429 M01AE - Best ARIMA(4, 0, 4) MSE=98.406 N02BA - Best ARIMA(0, 0, 0) MSE=76.461 N02BE - Best ARIMA(0, 0, 0) MSE=6837.270 N05B - Best ARIMA(0, 0, 0) MSE=162.051 N05C - Best ARIMA(2, 0, 2) MSE=7.654 R03 - Best ARIMA(0, 0, 0) MSE=1191.186 R06 - Best ARIMA(2, 0, 2) MSE=162.095
From the optimization stand of point, some series are considered as so-called white noise - random time series data. Those are series for whom the best result was achieved with p=0, d=0 and q=0.
df = pd.read_csv('/content/salesweekly.csv')
M01AB= {'series':'M01AB','p':0,'d':0,'q':0}
M01AE= {'series':'M01AE','p':2,'d':0,'q':0}
N02BA= {'series':'N02BA','p':5,'d':1,'q':1}
N02BE= {'series':'N02BE','p':2,'d':0,'q':0}
N05B= {'series':'N05B','p':0,'d':0,'q':5}
N05C= {'series':'N05C','p':0,'d':0,'q':1}
R03= {'series':'R03','p':5,'d':1,'q':1}
R06= {'series':'R06','p':1,'d':0,'q':1}
subplotindex=0
numrows=4
numcols=2
fig, ax = plt.subplots(numrows, numcols, figsize=(18,15))
plt.subplots_adjust(wspace=0.1, hspace=0.3)
warnings.filterwarnings("ignore")
for x in [M01AB,M01AE,N02BA,N02BE,N05B,N05C,R03,R06]:
rowindex=math.floor(subplotindex/numcols)
colindex=subplotindex-(rowindex*numcols)
X = df[x['series']].values
size = len(X)-50
train, test = X[0:size], X[size:len(X)]
history = [x for x in train]
predictions = list()
for t in range(len(test)):
model = ARIMA(history, order=(x['p'],x['d'],x['q']))
model_fit = model.fit(disp=0)
output = model_fit.forecast()
yhat = output[0]
predictions.append(yhat)
obs = test[t]
history.append(obs)
error = mean_squared_error(test, predictions)
perror = mean_absolute_percentage_error(test, predictions)
resultsRollingdf.loc['ARIMA MSE',x['series']]=error
resultsRollingdf.loc['ARIMA MAPE',x['series']]=perror
ax[rowindex,colindex].set_title(x['series']+' (MSE=' + str(round(error,2))+', MAPE='+ str(round(perror,2)) +'%)')
ax[rowindex,colindex].legend(['Real', 'Predicted'], loc='upper left')
ax[rowindex,colindex].plot(test)
ax[rowindex,colindex].plot(predictions, color='red')
subplotindex=subplotindex+1
plt.show()
df = pd.read_csv('/content/salesweekly.csv')
M01AB= {'series':'M01AB','p':0,'d':0,'q':0}
M01AE= {'series':'M01AE','p':4,'d':0,'q':4}
N02BA= {'series':'N02BA','p':0,'d':0,'q':0}
N02BE= {'series':'N02BE','p':0,'d':0,'q':0}
N05B= {'series':'N05B','p':0,'d':0,'q':0}
N05C= {'series':'N05C','p':2,'d':0,'q':2}
R03= {'series':'R03','p':0,'d':0,'q':0}
R06= {'series':'R06','p':2,'d':0,'q':2}
subplotindex=0
numrows=3
numcols=3
fig, ax = plt.subplots(numrows, numcols, figsize=(18,12))
plt.subplots_adjust(wspace=0.1, hspace=0.3)
warnings.filterwarnings("ignore")
for x in [M01AB,M01AE,N02BA,N02BE,N05B,N05C,R03,R06]:
rowindex=math.floor(subplotindex/numcols)
colindex=subplotindex-(rowindex*numcols)
X = df[x['series']].values
size = int(len(X) - 50)
train, test = X[0:size], X[size:len(X)]
model = ARIMA(train, order=(x['p'],x['d'],x['q']))
model_fit = model.fit()
forecast = model_fit.predict(1,len(test))
error = mean_squared_error(test, forecast)
perror = mean_absolute_percentage_error(test, forecast)
resultsLongtermdf.loc['ARIMA MSE',x['series']]=error
resultsLongtermdf.loc['ARIMA MAPE',x['series']]=perror
ax[rowindex,colindex].set_title(x['series']+' (MSE=' + str(round(error,2))+', MAPE='+ str(round(perror,2)) +'%)')
ax[rowindex,colindex].legend(['Real', 'Predicted'], loc='upper left')
ax[rowindex,colindex].plot(test)
ax[rowindex,colindex].plot(forecast, color='red')
subplotindex=subplotindex+1
plt.show()
from pyramid.arima import auto_arima
df = pd.read_csv('/content/salesweekly.csv')
subplotindex=0
numrows=2
numcols=2
fig, ax = plt.subplots(numrows, numcols, figsize=(18,15))
plt.subplots_adjust(wspace=0.1, hspace=0.3)
warnings.filterwarnings("ignore")
for x in ['M01AB','M01AE','N05B','N05C']:
rowindex=math.floor(subplotindex/numcols)
colindex=subplotindex-(rowindex*numcols)
X = df[x].values
size = len(X)-50
train, test = X[0:size], X[size:len(X)]
history = [c for c in train]
predictions = list()
for t in range(len(test)):
if (x=='N02BA' or x=='N02BE' or x=='R03' or x=='R06'):
model = auto_arima(X, start_p=1, start_q=1,
max_p=5, max_q=5, m=52, max_d=1, max_D=1,
start_P=0, start_Q=0, max_P=5, max_Q=5, seasonal=True,
trace=False,
error_action='ignore',
suppress_warnings=True,
stepwise=True)
else:
model = auto_arima(X, start_p=1, start_q=1,
max_p=5, max_q=5, max_d=1,
trace=False, seasonal=False,
error_action='ignore',
suppress_warnings=True,
stepwise=True)
model_fit = model.fit(history)
output = model_fit.predict(n_periods=1)
yhat = output[0]
predictions.append(yhat)
obs = test[t]
history.append(obs)
error = mean_squared_error(test, predictions)
perror = mean_absolute_percentage_error(test, predictions)
resultsRollingdf.loc['AutoARIMA MSE',x]=error
resultsRollingdf.loc['AutoARIMA MAPE',x]=perror
ax[rowindex,colindex].set_title(x+' (MSE=' + str(round(error,2))+', MAPE='+ str(round(perror,2)) +'%)')
ax[rowindex,colindex].legend(['Real', 'Predicted'], loc='upper left')
ax[rowindex,colindex].plot(test)
ax[rowindex,colindex].plot(predictions, color='red')
subplotindex=subplotindex+1
plt.show()
from pyramid.arima import auto_arima
df = pd.read_csv('/content/salesweekly.csv')
subplotindex=0
numrows=4
numcols=2
fig, ax = plt.subplots(numrows, numcols, figsize=(18,15))
plt.subplots_adjust(wspace=0.1, hspace=0.3)
warnings.filterwarnings("ignore")
for x in ['M01AB','M01AE','N02BA','N02BE','N05B','N05C','R03','R06']:
rowindex=math.floor(subplotindex/numcols)
colindex=subplotindex-(rowindex*numcols)
X = df[x].values
size = int(len(X) - 50)
train, test = X[0:size], X[size:len(X)]
if (x=='N02BE' or x=='R03' or x=='R06'):
model = auto_arima(X, start_p=1, start_q=1,
max_p=5, max_q=5, m=52, max_d=1, max_D=1,
start_P=0, start_Q=0, max_P=5, max_Q=5, seasonal=True,
trace=False,
error_action='ignore',
suppress_warnings=True,
stepwise=True)
else:
model = auto_arima(X, start_p=1, start_q=1,
max_p=5, max_q=5, max_d=1,
trace=False, seasonal=False,
error_action='ignore',
suppress_warnings=True,
stepwise=True)
model_fit = model.fit(train)
forecast = model_fit.predict(n_periods=len(test))
error = mean_squared_error(test, forecast)
perror = mean_absolute_percentage_error(test, predictions)
resultsLongtermdf.loc['AutoARIMA MSE',x]=error
resultsLongtermdf.loc['AutoARIMA MAPE',x]=perror
ax[rowindex,colindex].set_title(x+' (MSE=' + str(round(error,2))+', MAPE='+ str(round(perror,2)) +'%)')
ax[rowindex,colindex].legend(['Real', 'Predicted'], loc='upper left')
ax[rowindex,colindex].plot(test)
ax[rowindex,colindex].plot(forecast, color='red')
subplotindex=subplotindex+1
plt.show()
Prophet model is fited with data in two columns, where first one contains time information and is labeled as ds. Another stores actual time series data and is labeled as y.
Prophet model is tuned by using the following hyper-parameters:
While Prophet model is used both for rolling forecasts and long-term forecasting, optimization of hyper-parameters is carried out only for long-term forecasts. Optimal sets of hyper-parameters are then used also for rolling forecasts.
import fbprophet
Before actual forecasting, the optimal set of hyper-parameters is determined by using grid-search method. NOTE: this is illustrative example; actual optimization is carried out with larger intervals of hyper-parameters' values.
df = pd.read_csv('/content/salesweekly.csv')
M01AB= {
'series':'M01AB',
'params_grid':{'growth':['linear'],'changepoint_prior_scale':[10,30,50],
'interval_width':[0.0005]
}
}
M01AE= {
'series':'M01AE',
'params_grid':{'growth':['linear'],'changepoint_prior_scale':[0.01,0.05,0.1],
'interval_width':[0.0005]
}
}
N02BA= {
'series':'N02BA',
'params_grid':{'growth':['linear'],'changepoint_prior_scale':[0.005,0.01,0.05,0.1],
'interval_width':[0.0005]
}
}
N02BE= {
'series':'N02BE',
'params_grid':{'growth':['linear'],'changepoint_prior_scale':[5,10,50],'seasonality_prior_scale':[150,170,200],
'interval_width':[0.0005]
}
}
N05B= {
'series':'N05B',
'params_grid':{'growth':['linear'],'changepoint_prior_scale':[1,5,10],
'interval_width':[0.0005]
}
}
N05C= {
'series':'N05C',
'params_grid':{'growth':['linear'],'changepoint_prior_scale':[0.05,0.08,0.1,0.5],
'interval_width':[0.0005]
}
}
R03= {
'series':'R03',
'params_grid':{'growth':['linear'],'changepoint_prior_scale':[0.01,0.05,0.1],'seasonality_prior_scale':[120,160,200],
'interval_width':[0.0005]
}
}
R06= {
'series':'R06',
'params_grid':{'growth':['linear'],'changepoint_prior_scale':[0.01,0.05,0.1],'seasonality_prior_scale':[100,120,160,200],
'interval_width':[0.0005]
}
}
r=[M01AB,M01AE,N02BA,N02BE,N05B,N05C,R03,R06]
warnings.filterwarnings("ignore")
for x in r:
dfg=df[['datum',x['series']]]
dfg = dfg.rename(columns={'datum': 'ds', x['series']: 'y'})
size = int(len(dfg) - 50)
dfgtrain=dfg.loc[0:size,:]
dfgtest=dfg.loc[size+1:len(dfg),:]
predictions = list()
minError=0
grid = ParameterGrid(x['params_grid'])
for p in grid:
model = fbprophet.Prophet(**p, daily_seasonality=False, weekly_seasonality=False)
if(x['series']=='N02BE' or x['series']=='R03' or x['series']=='R06'):
model=model.add_seasonality(
name='yearly',
period=365.25,
fourier_order=13)
model_fit = model.fit(dfgtrain)
future = model.make_future_dataframe(periods=50, freq='W')
output = model.predict(future)
predictions=output.loc[size+2:len(dfg),:]['yhat'].values
error = mean_squared_error(dfgtest['y'].values, predictions)
if(minError>0):
if(error<minError):
minError=error
minP=p
else:
minError=error
minP=p
print(minP)
print('Test MSE ('+x['series']+'): %.3f' % minError)
{'changepoint_prior_scale': 30, 'growth': 'linear', 'interval_width': 0.0005} Test MSE (M01AB): 69.638 {'changepoint_prior_scale': 0.05, 'growth': 'linear', 'interval_width': 0.0005} Test MSE (M01AE): 79.585 {'changepoint_prior_scale': 0.01, 'growth': 'linear', 'interval_width': 0.0005} Test MSE (N02BA): 32.258 {'changepoint_prior_scale': 10, 'growth': 'linear', 'interval_width': 0.0005, 'seasonality_prior_scale': 200} Test MSE (N02BE): 3073.610 {'changepoint_prior_scale': 5, 'growth': 'linear', 'interval_width': 0.0005} Test MSE (N05B): 300.586 {'changepoint_prior_scale': 0.5, 'growth': 'linear', 'interval_width': 0.0005} Test MSE (N05C): 8.372 {'changepoint_prior_scale': 0.05, 'growth': 'linear', 'interval_width': 0.0005, 'seasonality_prior_scale': 160} Test MSE (R03): 837.131 {'changepoint_prior_scale': 0.05, 'growth': 'linear', 'interval_width': 0.0005, 'seasonality_prior_scale': 120} Test MSE (R06): 76.725
df = pd.read_csv('/content/salesweekly.csv')
subplotindex=0
numrows=4
numcols=2
fig, ax = plt.subplots(numrows, numcols, figsize=(18,15))
plt.subplots_adjust(wspace=0.1, hspace=0.3)
warnings.filterwarnings("ignore")
M01AB= {'series':'M01AB','params_grid':{'changepoint_prior_scale':30,'interval_width':0.0005}}
M01AE= {'series':'M01AE','params_grid':{'changepoint_prior_scale':0.05,'interval_width':0.0005}}
N02BA= {'series':'N02BA','params_grid':{'changepoint_prior_scale':0.005,'interval_width':0.0005}}
N02BE= {'series':'N02BE','params_grid':{'changepoint_prior_scale':10,'seasonality_prior_scale':170,'interval_width':0.0005}}
N05B= {'series':'N05B','params_grid':{'changepoint_prior_scale':5,'interval_width':0.0005}}
N05C= {'series':'N05C','params_grid':{'changepoint_prior_scale':0.5,'interval_width':0.005}}
R03= {'series':'R03','params_grid':{'changepoint_prior_scale':0.05,'seasonality_prior_scale':160,'interval_width':0.0005}}
R06= {'series':'R06','params_grid':{'changepoint_prior_scale':0.05,'seasonality_prior_scale':120,'interval_width':0.0005}}
r=[M01AB,M01AE,N02BA,N02BE,N05B,N05C,R03,R06]
for x in r:
rowindex=math.floor(subplotindex/numcols)
colindex=subplotindex-(rowindex*numcols)
dfg=df[['datum',x['series']]]
dfg = dfg.rename(columns={'datum': 'ds', x['series']: 'y'})
size = len(dfg) - 50
dfgtrain=dfg.loc[0:size,:]
dfgtest=dfg.loc[size+1:len(dfg),:]
history = dfgtrain.copy()
predictions = list()
for t in dfgtest['ds'].values:
model = fbprophet.Prophet(changepoint_prior_scale=x['params_grid']['changepoint_prior_scale'],
growth='linear',
interval_width=x['params_grid']['interval_width'],
daily_seasonality=False,
weekly_seasonality=False
)
if(x['series']=='N02BE' or x['series']=='R03' or x['series']=='R06'):
model=model.add_seasonality(
name='yearly',
period=365.25,
prior_scale=x['params_grid']['seasonality_prior_scale'],
fourier_order=13)
model_fit = model.fit(history)
future = model.make_future_dataframe(periods=1, freq='W')
output = model.predict(future)
yhat = output.loc[output.ds==t]['yhat'].values[0]
predictions.append(yhat)
obs = dfgtest.loc[dfgtest.ds==t]['y'].values[0]
dd=pd.DataFrame([[t,obs]],columns=['ds','y'])
history=history.append(dd)
error = mean_squared_error(dfgtest['y'].values, predictions)
perror = mean_absolute_percentage_error(dfgtest['y'].values, predictions)
resultsRollingdf.loc['Prophet MSE',x['series']]=error
resultsRollingdf.loc['Prophet MAPE',x['series']]=perror
ax[rowindex,colindex].set_title(x['series']+' (MSE=' + str(round(error,2))+', MAPE='+ str(round(perror,2)) +'%)')
ax[rowindex,colindex].legend(['Real', 'Predicted'], loc='upper left')
ax[rowindex,colindex].plot(dfgtest['y'].values)
ax[rowindex,colindex].plot(predictions, color='red')
subplotindex=subplotindex+1
plt.show()
df = pd.read_csv('/content/salesweekly.csv')
subplotindex=0
numrows=4
numcols=2
fig, ax = plt.subplots(numrows, numcols, figsize=(18,15))
plt.subplots_adjust(wspace=0.1, hspace=0.3)
warnings.filterwarnings("ignore")
M01AB= {'series':'M01AB','params_grid':{'changepoint_prior_scale':30,'interval_width':0.0005}}
M01AE= {'series':'M01AE','params_grid':{'changepoint_prior_scale':0.05,'interval_width':0.0005}}
N02BA= {'series':'N02BA','params_grid':{'changepoint_prior_scale':0.005,'interval_width':0.0005}}
N02BE= {'series':'N02BE','params_grid':{'changepoint_prior_scale':10,'seasonality_prior_scale':170,'interval_width':0.0005}}
N05B= {'series':'N05B','params_grid':{'changepoint_prior_scale':5,'interval_width':0.0005}}
N05C= {'series':'N05C','params_grid':{'changepoint_prior_scale':0.5,'interval_width':0.005}}
R03= {'series':'R03','params_grid':{'changepoint_prior_scale':0.05,'seasonality_prior_scale':160,'interval_width':0.0005}}
R06= {'series':'R06','params_grid':{'changepoint_prior_scale':0.05,'seasonality_prior_scale':120,'interval_width':0.0005}}
r=[M01AB,M01AE,N02BA,N02BE,N05B,N05C,R03,R06]
for x in r:
rowindex=math.floor(subplotindex/numcols)
colindex=subplotindex-(rowindex*numcols)
dfg=df[['datum',x['series']]]
dfg = dfg.rename(columns={'datum': 'ds', x['series']: 'y'})
size = int(len(dfg) - 50)
dfgtrain=dfg.loc[0:size,:]
dfgtest=dfg.loc[size+1:len(dfg),:]
predictions = list()
model = fbprophet.Prophet(changepoint_prior_scale=x['params_grid']['changepoint_prior_scale'],
growth='linear', interval_width=x['params_grid']['interval_width'],
daily_seasonality=False,
weekly_seasonality=False
)
if(x['series']=='N02BE' or x['series']=='R03' or x['series']=='R06'):
model=model.add_seasonality(
name='yearly',
period=365.25,
prior_scale=x['params_grid']['seasonality_prior_scale'],
fourier_order=13)
model_fit = model.fit(dfgtrain)
future = model.make_future_dataframe(periods=50, freq='W')
output = model.predict(future)
predictions=output.loc[size+2:len(dfg),:]['yhat'].values
error = mean_squared_error(dfgtest['y'].values, predictions)
perror = mean_absolute_percentage_error(dfgtest['y'].values, predictions)
resultsLongtermdf.loc['Prophet MSE',x['series']]=error
resultsLongtermdf.loc['Prophet MAPE',x['series']]=perror
ax[rowindex,colindex].set_title(x['series']+' (MSE=' + str(round(error,2))+', MAPE='+ str(round(perror,2)) +'%)')
ax[rowindex,colindex].legend(['Real', 'Predicted'], loc='upper left')
ax[rowindex,colindex].plot(dfgtest['y'].values)
ax[rowindex,colindex].plot(predictions, color='red')
subplotindex=subplotindex+1
plt.show()
Long-term forecasting validation has been done with three LSTM configurations: Vanilla LSTM, Stacked LSTM and Bi-directional LSTM. Relu activation function was used, optimizer was Adam and loss function was Mean Squared Error. The best results were achieved with training the model in 400 epochs. Before fitting, all data was standardized (rescaled in interval -1,1) and transformed to data for supervised problem.
Number of past observations tested in input sequences was either 10 or 5. For series with larger variances and randomness (N05B and N05C) and simpler, Vanilla LSTM model, 10 past observations produced better forecasting accuracy. In all other cases, 5 past observations were used. This is the parameter that has been adopted.
In order to get reproducible results in forecasting with LSTM, following values are fixed: seed value, 'PYTHONHASHSEED' environment variable, Python's, numpy's and Tensorflow's built-in pseudo-random generators. A new global Tensorflow session is configured.
seed_value= 0
import os
os.environ['PYTHONHASHSEED']=str(seed_value)
import random
random.seed(seed_value)
import numpy as np
np.random.seed(seed_value)
import tensorflow as tf
tf.random.set_seed(seed_value)
session_conf = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
sess = tf.compat.v1.Session(graph=tf.compat.v1.get_default_graph(), config=session_conf)
tf.compat.v1.keras.backend.set_session(sess)
Components required for forecasting with LSTM are then imported. Function split_sequence for transforming time-series data to data for supervised machine learning problem is provided, including constants, such as number of previous steps to take into account.
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import Bidirectional
from sklearn.preprocessing import MinMaxScaler
def split_sequence(sequence, n_steps):
X, y = list(), list()
for i in range(len(sequence)):
end_ix = i + n_steps
if end_ix > len(sequence)-1:
break
seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
X.append(seq_x)
y.append(seq_y)
return array(X), array(y)
size = int(len(df) - 50)
n_steps=5
n_features = 1
Using TensorFlow backend.
df = pd.read_csv('/content/salesweekly.csv')
subplotindex=0
numrows=4
numcols=2
fig, ax = plt.subplots(numrows, numcols, figsize=(18,15))
plt.subplots_adjust(wspace=0.1, hspace=0.3)
warnings.filterwarnings("ignore")
r=['M01AB','M01AE','N02BA','N02BE','N05B','N05C','R03','R06']
for x in r:
rowindex=math.floor(subplotindex/numcols)
colindex=subplotindex-(rowindex*numcols)
X=df[x].values
scaler = MinMaxScaler(feature_range = (0, 1))
X=scaler.fit_transform(X.reshape(-1, 1))
X_train,y_train=split_sequence(X[0:size], n_steps)
X_test,y_test=split_sequence(X[size:len(df)], n_steps)
X_train = X_train.reshape((X_train.shape[0], X_train.shape[1], n_features))
model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(n_steps, n_features)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
model.fit(X_train, y_train, epochs=400, verbose=0)
X_test = X_test.reshape((len(X_test), n_steps, n_features))
predictions = model.predict(X_test, verbose=0)
y_test=scaler.inverse_transform(y_test)
predictions = scaler.inverse_transform(predictions)
error = mean_squared_error(y_test, predictions)
perror = mean_absolute_percentage_error(y_test, predictions)
resultsLongtermdf.loc['Vanilla LSTM MSE',x]=error
resultsLongtermdf.loc['Vanilla LSTM MAPE',x]=perror
ax[rowindex,colindex].set_title(x+' (MSE=' + str(round(error,2))+', MAPE='+ str(round(perror,2)) +'%)')
ax[rowindex,colindex].legend(['Real', 'Predicted'], loc='upper left')
ax[rowindex,colindex].plot(y_test)
ax[rowindex,colindex].plot(predictions, color='red')
subplotindex=subplotindex+1
plt.show()
df = pd.read_csv('/content/salesweekly.csv')
subplotindex=0
numrows=4
numcols=2
fig, ax = plt.subplots(numrows, numcols, figsize=(18,15))
plt.subplots_adjust(wspace=0.1, hspace=0.3)
warnings.filterwarnings("ignore")
r=['M01AB','M01AE','N02BA','N02BE','N05B','N05C','R03','R06']
for x in r:
rowindex=math.floor(subplotindex/numcols)
colindex=subplotindex-(rowindex*numcols)
X=df[x].values
scaler = MinMaxScaler(feature_range = (0, 1))
X=scaler.fit_transform(X.reshape(-1, 1))
X_train,y_train=split_sequence(X[0:size], n_steps)
X_test,y_test=split_sequence(X[size:len(df)], n_steps)
X_train = X_train.reshape((X_train.shape[0], X_train.shape[1], n_features))
model = Sequential()
model.add(LSTM(100, activation='relu', return_sequences=True, input_shape=(n_steps, n_features)))
model.add(LSTM(100, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
model.fit(X_train, y_train, epochs=400, verbose=0)
X_test = X_test.reshape((len(X_test), n_steps, n_features))
predictions = model.predict(X_test, verbose=0)
y_test=scaler.inverse_transform(y_test)
predictions = scaler.inverse_transform(predictions)
error = mean_squared_error(y_test, predictions)
perror = mean_absolute_percentage_error(y_test, predictions)
resultsLongtermdf.loc['Stacked LSTM MSE',x]=error
resultsLongtermdf.loc['Stacked LSTM MAPE',x]=perror
ax[rowindex,colindex].set_title(x+' (MSE=' + str(round(error,2))+', MAPE='+ str(round(perror,2)) +'%)')
ax[rowindex,colindex].legend(['Real', 'Predicted'], loc='upper left')
ax[rowindex,colindex].plot(y_test)
ax[rowindex,colindex].plot(predictions, color='red')
subplotindex=subplotindex+1
plt.show()
df = pd.read_csv('/content/salesweekly.csv')
subplotindex=0
numrows=4
numcols=2
fig, ax = plt.subplots(numrows, numcols, figsize=(18,15))
plt.subplots_adjust(wspace=0.1, hspace=0.3)
warnings.filterwarnings("ignore")
r=['M01AB','M01AE','N02BA','N02BE','N05B','N05C','R03','R06']
for x in r:
rowindex=math.floor(subplotindex/numcols)
colindex=subplotindex-(rowindex*numcols)
X=df[x].values
scaler = MinMaxScaler(feature_range = (0, 1))
X=scaler.fit_transform(X.reshape(-1, 1))
X_train,y_train=split_sequence(X[0:size], n_steps)
X_test,y_test=split_sequence(X[size:len(df)], n_steps)
X_train = X_train.reshape((X_train.shape[0], X_train.shape[1], n_features))
model = Sequential()
model.add(Bidirectional(LSTM(50, activation='relu'), input_shape=(n_steps, n_features)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
model.fit(X_train, y_train, epochs=400, verbose=0)
X_test = X_test.reshape((len(X_test), n_steps, n_features))
predictions = model.predict(X_test, verbose=0)
y_test=scaler.inverse_transform(y_test)
predictions = scaler.inverse_transform(predictions)
error = mean_squared_error(y_test, predictions)
perror = mean_absolute_percentage_error(y_test, predictions)
resultsLongtermdf.loc['Bidirectional LSTM MSE',x]=error
resultsLongtermdf.loc['Bidirectional LSTM MAPE',x]=perror
ax[rowindex,colindex].set_title(x+' (MSE=' + str(round(error,2))+', MAPE='+ str(round(perror,2)) +'%)')
ax[rowindex,colindex].legend(['Real', 'Predicted'], loc='upper left')
ax[rowindex,colindex].plot(y_test)
ax[rowindex,colindex].plot(predictions, color='red')
subplotindex=subplotindex+1
plt.show()
For rolling forecast, ARIMA method (Auto-ARIMA for series with seasonal character) outperforms Prophet and is considered as a best candidate for short-term sales forecasting. All methods in all cases (with exception of Prophet N02BE) outperform reference benchmarks - Naive and Seasonal Naive forecasts.
from IPython.display import display, HTML
display(HTML(resultsRollingdf.to_html()))
M01AB | M01AE | N02BA | N02BE | N05B | N05C | R03 | R06 | |
---|---|---|---|---|---|---|---|---|
Naive MSE | 116.014866 | 93.875126 | 44.741150 | 2753.643864 | 255.485600 | 14.920000 | 948.560347 | 82.228700 |
Naive MAPE | 28.765667 | 36.202952 | 33.475848 | 21.451644 | 24.308687 | inf | 47.432638 | 52.094536 |
Seasonal Naive MSE | 0.000000 | 0.000000 | 0.000000 | 2923.370345 | 0.000000 | 0.000000 | 1141.258975 | 87.288057 |
Seasonal Naive MAPE | 0.000000 | 0.000000 | 0.000000 | 19.976880 | 0.000000 | 0.000000 | 48.845563 | 52.491895 |
ARIMA MSE | 74.768445 | 76.572892 | 31.942950 | 2614.392094 | 151.322517 | 7.949828 | 678.336899 | 75.030664 |
ARIMA MAPE | 22.672768 | 36.261902 | 29.489328 | 44.809616 | 21.372122 | inf | 78.656105 | 113.282567 |
AutoARIMA MSE | 76.868212 | 79.386964 | 0.000000 | 0.000000 | 145.608244 | 7.949828 | 0.000000 | 0.000000 |
AutoARIMA MAPE | 24.118497 | 33.544141 | 0.000000 | 0.000000 | 18.394689 | inf | 0.000000 | 0.000000 |
Prophet MSE | 84.485525 | 80.414991 | 33.075161 | 3024.603235 | 187.198474 | 8.574808 | 817.871777 | 81.267203 |
Prophet MAPE | 26.529136 | 32.405102 | 29.301263 | 21.122458 | 21.985744 | inf | 62.824113 | 67.101435 |
For long-term forecasting, Prophet and LSTM models have shown significantly better performance then benchmarks in the time series with seasonality. In all cases, even without optimization, LSTM models have achieved competitive performance.
from IPython.display import display, HTML
display(HTML(resultsLongtermdf.to_html()))
M01AB | M01AE | N02BA | N02BE | N05B | N05C | R03 | R06 | |
---|---|---|---|---|---|---|---|---|
Average MSE | 75.428945 | 99.086133 | 76.460997 | 6837.269559 | 162.050944 | 7.961872 | 1191.186483 | 186.661004 |
Average MAPE | 22.727171 | 34.980731 | 51.351070 | 37.175660 | 21.443035 | inf | 50.243843 | 86.645469 |
ARIMA MSE | 75.428945 | 98.405800 | 76.460997 | 6837.269559 | 162.050944 | 7.654475 | 1191.186483 | 162.095451 |
ARIMA MAPE | 22.727171 | 29.686087 | 51.351070 | 37.175660 | 21.443035 | inf | 50.243843 | 60.042695 |
AutoARIMA MSE | 77.638901 | 99.069527 | 49.550212 | 4154.325871 | 138.850593 | 7.974218 | 1297.635562 | 131.497475 |
AutoARIMA MAPE | 87.643432 | 81.471790 | 78.669882 | 97.628295 | 92.426139 | inf | 88.611067 | 81.761830 |
Prophet MSE | 69.638168 | 79.584883 | 32.268962 | 3101.475442 | 300.585886 | 8.372311 | 837.131033 | 76.724888 |
Prophet MAPE | 22.767282 | 30.760918 | 28.842703 | 23.109385 | 28.617807 | inf | 57.792436 | 70.661252 |
Vanilla LSTM MSE | 85.760039 | 98.658632 | 39.036062 | 2668.286251 | 162.504938 | 8.712420 | 868.451483 | 68.917992 |
Vanilla LSTM MAPE | 26.548283 | 37.842501 | 32.159034 | 21.565157 | 19.551021 | inf | 48.603815 | 65.831749 |
Stacked LSTM MSE | 83.580163 | 104.075344 | 37.526952 | 2607.023831 | 191.830512 | 9.700117 | 878.151787 | 67.128141 |
Stacked LSTM MAPE | 26.087991 | 37.742118 | 34.324011 | 19.765363 | 21.891923 | inf | 48.092074 | 64.549601 |
Bidirectional LSTM MSE | 86.470894 | 89.413755 | 36.357780 | 2675.948844 | 217.393555 | 9.791272 | 848.527206 | 73.450034 |
Bidirectional LSTM MAPE | 26.922596 | 36.660723 | 34.481220 | 19.602326 | 23.603980 | inf | 50.465062 | 68.238467 |
To conclude, time-series analyses and forecasts have guided potentially useful conclusions and recommendations to the pharmacy. Daily, weekly and annual seasonality analysis were proven useful for identifying the periods in which special sales and marketing campaigns could be implemented, except for N05B and N05C categories of drugs which did not exhibit significant regularities. Forecasts have proven better than Naïve methods and in acceptable intervals for long-term planning. It is highly likely that the forecasts could be significantly improved by expanding the problem scope to multivariate time series forecasting and by including explanatory variables, such as:
Future work on univariate time series forecasting includes increasing the number of data, exploring different other accuracy metrics, optimization of hyper-parameters for LSTM models and testing other architectures, such as CNN LSTM and ConvLSTM. However, key improvements in sales forecasting are expected from reducing the uncertainty of the models by expanding to multivariate time series forecasting problem, as explained above.