In the 10x series of notebooks, we will look at Time Series modeling in pycaret using univariate data and no exogenous variables. We will use the famous airline dataset for illustration. Our plan of action in this notebook is to set up the experiment, build an ARIMA model based on our earlier exploratory analysis, evaluate it with cross-validation and on the test set, and then check the goodness of fit of its residuals.
# Only enable critical logging (Optional)
import os
os.environ["PYCARET_CUSTOM_LOGGING_LEVEL"] = "CRITICAL"

def what_is_installed():
    from pycaret import show_versions
    show_versions()

try:
    what_is_installed()
except ModuleNotFoundError:
    !pip install pycaret
    what_is_installed()
System:
    python: 3.9.16 (main, Jan 11 2023, 16:16:36) [MSC v.1916 64 bit (AMD64)]
    executable: C:\Users\Nikhil\.conda\envs\pycaret_dev_sktime_16p1\python.exe
    machine: Windows-10-10.0.19044-SP0

PyCaret required dependencies:
    pip: 22.3.1
    setuptools: 65.6.3
    pycaret: 3.0.0rc9
    IPython: 8.10.0
    ipywidgets: 8.0.4
    tqdm: 4.64.1
    numpy: 1.23.5
    pandas: 1.5.3
    jinja2: 3.1.2
    scipy: 1.10.0
    joblib: 1.2.0
    sklearn: 1.2.1
    pyod: 1.0.8
    imblearn: 0.10.1
    category_encoders: 2.6.0
    lightgbm: 3.3.5
    numba: 0.56.4
    requests: 2.28.2
    matplotlib: 3.7.0
    scikitplot: 0.3.7
    yellowbrick: 1.5
    plotly: 5.13.0
    kaleido: 0.2.1
    statsmodels: 0.13.5
    sktime: 0.16.1
    tbats: 1.1.2
    pmdarima: 2.0.2
    psutil: 5.9.4

PyCaret optional dependencies:
    shap: 0.41.0
    interpret: Not installed
    umap: Not installed
    pandas_profiling: Not installed
    explainerdashboard: Not installed
    autoviz: Not installed
    fairlearn: Not installed
    xgboost: Not installed
    catboost: Not installed
    kmodes: Not installed
    mlxtend: Not installed
    statsforecast: Not installed
    tune_sklearn: Not installed
    ray: Not installed
    hyperopt: Not installed
    optuna: Not installed
    skopt: Not installed
    mlflow: 2.1.1
    gradio: Not installed
    fastapi: Not installed
    uvicorn: Not installed
    m2cgen: Not installed
    evidently: Not installed
    fugue: 0.8.0
    streamlit: Not installed
    prophet: 1.1.2
import time
import numpy as np
import pandas as pd
from pycaret.datasets import get_data
from pycaret.time_series import TSForecastingExperiment
y = get_data('airline', verbose=False)
# We want to forecast the next 12 months of data and we will use 3 fold cross-validation to test the models.
fh = 12 # or alternately fh = np.arange(1,13)
fold = 3
# Global Figure Settings for notebook ----
# Depending on whether you are using jupyter notebook, jupyter lab, Google Colab, you may have to set the renderer appropriately
# NOTE: Setting to a static renderer here so that the notebook saved size is reduced.
fig_kwargs = {
    # "renderer": "notebook",
    "renderer": "png",
    "width": 1000,
    "height": 600,
}
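Before running the setup, it can help to see concretely what a 3-fold expanding-window cross-validation with a 12-step horizon looks like. The sketch below uses sktime's ExpandingWindowSplitter directly (the splitter that the setup output reports using); initial_window=96 and step_length=12 are chosen here only to mirror the fold cutoffs shown later in this notebook and are not necessarily the exact values pycaret derives internally.

from sktime.forecasting.model_selection import ExpandingWindowSplitter

# Purely illustrative 3-fold expanding-window scheme with a 12-month horizon
cv = ExpandingWindowSplitter(fh=np.arange(1, 13), initial_window=96, step_length=12)

train_index = y.iloc[:132].index  # same length as the train split reported by setup below
print("Number of folds:", cv.get_n_splits(train_index))
for i, (train_idx, test_idx) in enumerate(cv.split(train_index)):
    print(f"Fold {i}: train ends {train_index[train_idx[-1]]}, "
          f"validates on {train_index[test_idx[0]]} to {train_index[test_idx[-1]]}")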
exp = TSForecastingExperiment()
exp.setup(data=y, fh=fh, fig_kwargs=fig_kwargs)
| | Description | Value |
---|---|---|
0 | session_id | 4747 |
1 | Target | Number of airline passengers |
2 | Approach | Univariate |
3 | Exogenous Variables | Not Present |
4 | Original data shape | (144, 1) |
5 | Transformed data shape | (144, 1) |
6 | Transformed train set shape | (132, 1) |
7 | Transformed test set shape | (12, 1) |
8 | Rows with missing values | 0.0% |
9 | Fold Generator | ExpandingWindowSplitter |
10 | Fold Number | 3 |
11 | Enforce Prediction Interval | False |
12 | Splits used for hyperparameters | all |
13 | User Defined Seasonal Period(s) | None |
14 | Ignore Seasonality Test | False |
15 | Seasonality Detection Algo | auto |
16 | Max Period to Consider | 60 |
17 | Seasonal Period(s) Tested | [12, 24, 36, 11, 48] |
18 | Significant Seasonal Period(s) | [12, 24, 36, 11, 48] |
19 | Significant Seasonal Period(s) without Harmonics | [48, 36, 11] |
20 | Remove Harmonics | False |
21 | Harmonics Order Method | harmonic_max |
22 | Num Seasonalities to Use | 1 |
23 | All Seasonalities to Use | [12] |
24 | Primary Seasonality | 12 |
25 | Seasonality Present | True |
26 | Target Strictly Positive | True |
27 | Target White Noise | No |
28 | Recommended d | 1 |
29 | Recommended Seasonal D | 1 |
30 | Preprocess | False |
31 | CPU Jobs | -1 |
32 | Use GPU | False |
33 | Log Experiment | False |
34 | Experiment Name | ts-default-name |
35 | USI | a097 |
<pycaret.time_series.forecasting.oop.TSForecastingExperiment at 0x1167d1d60d0>
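Several of the values in this summary grid can be corroborated independently. For example, the "Recommended d" and "Recommended Seasonal D" entries can be cross-checked with pmdarima (already installed as a pycaret dependency). A small sketch, purely for illustration:

from pmdarima.arima import ndiffs, nsdiffs

# Estimate the number of ordinary and seasonal differences needed
d = ndiffs(y, test="kpss")          # expected to agree with "Recommended d" above
D = nsdiffs(y, m=12, test="ocsb")   # expected to agree with "Recommended Seasonal D" above
print(f"d = {d}, D = {D}")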
The pycaret Time Series Forecasting module has a rich set of models, ranging from traditional statistical models such as ARIMA, Exponential Smoothing, and ETS to reduced regression models.
exp.models()
| ID | Name | Reference | Turbo |
---|---|---|---|
naive | Naive Forecaster | sktime.forecasting.naive.NaiveForecaster | True |
grand_means | Grand Means Forecaster | sktime.forecasting.naive.NaiveForecaster | True |
snaive | Seasonal Naive Forecaster | sktime.forecasting.naive.NaiveForecaster | True |
polytrend | Polynomial Trend Forecaster | sktime.forecasting.trend.PolynomialTrendForeca... | True |
arima | ARIMA | sktime.forecasting.arima.ARIMA | True |
auto_arima | Auto ARIMA | sktime.forecasting.arima.AutoARIMA | True |
exp_smooth | Exponential Smoothing | sktime.forecasting.exp_smoothing.ExponentialSm... | True |
croston | Croston | sktime.forecasting.croston.Croston | True |
ets | ETS | sktime.forecasting.ets.AutoETS | True |
theta | Theta Forecaster | sktime.forecasting.theta.ThetaForecaster | True |
tbats | TBATS | sktime.forecasting.tbats.TBATS | False |
bats | BATS | sktime.forecasting.bats.BATS | False |
prophet | Prophet | pycaret.containers.models.time_series.ProphetP... | False |
lr_cds_dt | Linear w/ Cond. Deseasonalize & Detrending | pycaret.containers.models.time_series.BaseCdsD... | True |
en_cds_dt | Elastic Net w/ Cond. Deseasonalize & Detrending | pycaret.containers.models.time_series.BaseCdsD... | True |
ridge_cds_dt | Ridge w/ Cond. Deseasonalize & Detrending | pycaret.containers.models.time_series.BaseCdsD... | True |
lasso_cds_dt | Lasso w/ Cond. Deseasonalize & Detrending | pycaret.containers.models.time_series.BaseCdsD... | True |
lar_cds_dt | Least Angular Regressor w/ Cond. Deseasonalize... | pycaret.containers.models.time_series.BaseCdsD... | True |
llar_cds_dt | Lasso Least Angular Regressor w/ Cond. Deseaso... | pycaret.containers.models.time_series.BaseCdsD... | True |
br_cds_dt | Bayesian Ridge w/ Cond. Deseasonalize & Detren... | pycaret.containers.models.time_series.BaseCdsD... | True |
huber_cds_dt | Huber w/ Cond. Deseasonalize & Detrending | pycaret.containers.models.time_series.BaseCdsD... | True |
par_cds_dt | Passive Aggressive w/ Cond. Deseasonalize & De... | pycaret.containers.models.time_series.BaseCdsD... | True |
omp_cds_dt | Orthogonal Matching Pursuit w/ Cond. Deseasona... | pycaret.containers.models.time_series.BaseCdsD... | True |
knn_cds_dt | K Neighbors w/ Cond. Deseasonalize & Detrending | pycaret.containers.models.time_series.BaseCdsD... | True |
dt_cds_dt | Decision Tree w/ Cond. Deseasonalize & Detrending | pycaret.containers.models.time_series.BaseCdsD... | True |
rf_cds_dt | Random Forest w/ Cond. Deseasonalize & Detrending | pycaret.containers.models.time_series.BaseCdsD... | True |
et_cds_dt | Extra Trees w/ Cond. Deseasonalize & Detrending | pycaret.containers.models.time_series.BaseCdsD... | True |
gbr_cds_dt | Gradient Boosting w/ Cond. Deseasonalize & Det... | pycaret.containers.models.time_series.BaseCdsD... | True |
ada_cds_dt | AdaBoost w/ Cond. Deseasonalize & Detrending | pycaret.containers.models.time_series.BaseCdsD... | True |
lightgbm_cds_dt | Light Gradient Boosting w/ Cond. Deseasonalize... | pycaret.containers.models.time_series.BaseCdsD... | True |
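The models() call returns this table as a pandas DataFrame, so it can be filtered like any other DataFrame. For example, the sketch below lists the models flagged with Turbo = False (the slower models that compare_models skips when turbo=True) and the reduced regression models, which share the _cds_dt ID suffix:

# models() returns a DataFrame indexed by model ID
all_models = exp.models()

# Models excluded from compare_models() when turbo=True
print(all_models[all_models["Turbo"] == False]["Name"])

# Reduced regression models (Conditional Deseasonalize & Detrend wrappers)
print(all_models[all_models.index.str.endswith("_cds_dt")]["Name"])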
In our exploratory analysis, we found that the data needs to be differenced once and seasonally differenced with period = 12. We also concluded that some autoregressive behavior remains even after this differencing.
More specifically, if we were building an ARIMA model, we would start with ARIMA(1,1,0)x(0,1,0,12). Let's build this next.
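Before doing so, the AR(1) choice can be sanity checked outside of pycaret by applying the implied differencing manually and inspecting the ACF/PACF of the result; a significant spike at lag 1 in the PACF supports adding an AR(1) term. This is an illustrative check using statsmodels, not part of the pycaret workflow:

from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Apply the differencing implied by ARIMA(1,1,0)x(0,1,0,12): d=1 and D=1 with period 12
y_diff = y.diff().diff(12).dropna()

# Autocorrelation structure of the differenced series guides the AR/MA orders
plot_acf(y_diff, lags=36)
plot_pacf(y_diff, lags=36)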
exp = TSForecastingExperiment()
exp.setup(data=y, fh=fh, fold=fold, fig_kwargs=fig_kwargs, session_id=42)
| | Description | Value |
---|---|---|
0 | session_id | 42 |
1 | Target | Number of airline passengers |
2 | Approach | Univariate |
3 | Exogenous Variables | Not Present |
4 | Original data shape | (144, 1) |
5 | Transformed data shape | (144, 1) |
6 | Transformed train set shape | (132, 1) |
7 | Transformed test set shape | (12, 1) |
8 | Rows with missing values | 0.0% |
9 | Fold Generator | ExpandingWindowSplitter |
10 | Fold Number | 3 |
11 | Enforce Prediction Interval | False |
12 | Splits used for hyperparameters | all |
13 | User Defined Seasonal Period(s) | None |
14 | Ignore Seasonality Test | False |
15 | Seasonality Detection Algo | auto |
16 | Max Period to Consider | 60 |
17 | Seasonal Period(s) Tested | [12, 24, 36, 11, 48] |
18 | Significant Seasonal Period(s) | [12, 24, 36, 11, 48] |
19 | Significant Seasonal Period(s) without Harmonics | [48, 36, 11] |
20 | Remove Harmonics | False |
21 | Harmonics Order Method | harmonic_max |
22 | Num Seasonalities to Use | 1 |
23 | All Seasonalities to Use | [12] |
24 | Primary Seasonality | 12 |
25 | Seasonality Present | True |
26 | Target Strictly Positive | True |
27 | Target White Noise | No |
28 | Recommended d | 1 |
29 | Recommended Seasonal D | 1 |
30 | Preprocess | False |
31 | CPU Jobs | -1 |
32 | Use GPU | False |
33 | Log Experiment | False |
34 | Experiment Name | ts-default-name |
35 | USI | f298 |
<pycaret.time_series.forecasting.oop.TSForecastingExperiment at 0x1167f935a60>
Note: The setup output above provides some useful information out of the box, such as the detected seasonal period(s), the recommended differencing orders (d and D), and whether the target is strictly positive or white noise.

NOTE: The arguments passed to create_model (e.g. order and seasonal_order below) are forwarded to the underlying sktime package, so they follow sktime's conventions.

model = exp.create_model("arima", order=(1,1,0), seasonal_order=(0,1,0,12))
| | cutoff | MASE | RMSSE | MAE | RMSE | MAPE | SMAPE | R2 |
---|---|---|---|---|---|---|---|---|
0 | 1956-12 | 0.3535 | 0.4103 | 10.3216 | 13.4315 | 0.0255 | 0.0260 | 0.9413 |
1 | 1957-12 | 0.6844 | 0.6853 | 20.9235 | 23.2653 | 0.0581 | 0.0560 | 0.8582 |
2 | 1958-12 | 1.5988 | 1.4673 | 45.6850 | 47.6955 | 0.1066 | 0.1132 | 0.4911 |
Mean | NaT | 0.8789 | 0.8543 | 25.6434 | 28.1308 | 0.0634 | 0.0651 | 0.7635 |
SD | NaT | 0.5267 | 0.4477 | 14.8178 | 14.4051 | 0.0333 | 0.0362 | 0.1956 |
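Incidentally, the fold-level scores displayed above can be captured programmatically: pull() returns the most recently displayed scoring grid as a pandas DataFrame, which is convenient for logging or further analysis. A minimal sketch:

# Retrieve the CV scoring grid shown above as a DataFrame
cv_results = exp.pull()
print(cv_results["MASE"])  # per-fold MASE plus the Mean/SD summary rows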
NOTE: create_model will highlight the cross-validation scores across the folds. The time cutoff for each fold is also displayed for convenience; users may wish to correlate these cutoffs with what they get from plot_model(plot="cv"). After performing cross-validation, create_model retrains the model on the entire training dataset. This allows us to check the performance of the model against the test set simply by using predict_model.
# Out-of-sample Forecasts
y_predict = exp.predict_model(model)
y_predict
| | Model | MASE | RMSSE | MAE | RMSE | MAPE | SMAPE | R2 |
---|---|---|---|---|---|---|---|---|
0 | ARIMA | 0.6999 | 0.7757 | 21.3121 | 26.7998 | 0.0480 | 0.0462 | 0.8703 |
| | y_pred |
---|---|
1960-01 | 424.7154 |
1960-02 | 408.1599 |
1960-03 | 472.4447 |
1960-04 | 463.0139 |
1960-05 | 487.5134 |
1960-06 | 540.0299 |
1960-07 | 616.5423 |
1960-08 | 628.0557 |
1960-09 | 532.5688 |
1960-10 | 477.0820 |
1960-11 | 432.5952 |
1960-12 | 476.1084 |
The scores listed above are for the test set. The metrics are actually slightly better than the mean cross-validation scores, which implies that we have not overfit the model. More details about this can be found in this article.
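As a quick sanity check, the holdout MAPE reported above can be recomputed by hand from the forecasts. In the sketch below, the last 12 observations of y are taken as the test period, matching the 12-month forecast horizon used in setup:

# Actuals for the holdout period (the last 12 months of the series)
y_test = y.iloc[-12:]

# MAPE computed manually from the predictions returned by predict_model
mape = np.mean(np.abs((y_test.values - y_predict["y_pred"].values) / y_test.values))
print(f"Manually computed holdout MAPE: {mape:.4f}")  # should be close to the MAPE reported above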
In a previous notebook, we saw that plot_model without an estimator argument works on the original dataset. In addition, by passing the model (estimator) to the plot_model call, we can plot model diagnostics as well.
# Plot the out-of-sample forecasts
exp.plot_model(estimator=model)
# # Alternately the following will plot the same thing.
# exp.plot_model(estimator=model, plot="forecast")
NOTE: predict_model is intelligent enough to understand the current state of the model (i.e. that it has only been trained on the train dataset), so it automatically produces the out-of-sample forecasts for the test period; once the model is finalized on the full dataset, the same call produces true future forecasts.

Next, let's check the goodness of fit using both diagnostic plots as well as statistical tests. Similar to plot_model, passing an estimator to the check_stats call will perform the tests on the model residuals.
# Check Goodness of Fit
exp.check_stats(model)
| | Test | Test Name | Data | Property | Setting | Value |
---|---|---|---|---|---|---|
0 | Summary | Statistics | Residual | Length | | 131.0 |
1 | Summary | Statistics | Residual | # Missing Values | | 0.0 |
2 | Summary | Statistics | Residual | Mean | | -0.445207 |
3 | Summary | Statistics | Residual | Median | | -0.9606 |
4 | Summary | Statistics | Residual | Standard Deviation | | 11.759243 |
5 | Summary | Statistics | Residual | Variance | | 138.27979 |
6 | Summary | Statistics | Residual | Kurtosis | | 4.244741 |
7 | Summary | Statistics | Residual | Skewness | | -0.938657 |
8 | Summary | Statistics | Residual | # Distinct Values | | 127.0 |
9 | White Noise | Ljung-Box | Residual | Test Statictic | {'alpha': 0.05, 'K': 24} | 21.29991 |
10 | White Noise | Ljung-Box | Residual | Test Statictic | {'alpha': 0.05, 'K': 48} | 43.239443 |
11 | White Noise | Ljung-Box | Residual | p-value | {'alpha': 0.05, 'K': 24} | 0.620976 |
12 | White Noise | Ljung-Box | Residual | p-value | {'alpha': 0.05, 'K': 48} | 0.667946 |
13 | White Noise | Ljung-Box | Residual | White Noise | {'alpha': 0.05, 'K': 24} | True |
14 | White Noise | Ljung-Box | Residual | White Noise | {'alpha': 0.05, 'K': 48} | True |
15 | Stationarity | ADF | Residual | Stationarity | {'alpha': 0.05} | True |
16 | Stationarity | ADF | Residual | p-value | {'alpha': 0.05} | 0.0 |
17 | Stationarity | ADF | Residual | Test Statistic | {'alpha': 0.05} | -11.577344 |
18 | Stationarity | ADF | Residual | Critical Value 1% | {'alpha': 0.05} | -3.481682 |
19 | Stationarity | ADF | Residual | Critical Value 5% | {'alpha': 0.05} | -2.884042 |
20 | Stationarity | ADF | Residual | Critical Value 10% | {'alpha': 0.05} | -2.57877 |
21 | Stationarity | KPSS | Residual | Trend Stationarity | {'alpha': 0.05} | True |
22 | Stationarity | KPSS | Residual | p-value | {'alpha': 0.05} | 0.1 |
23 | Stationarity | KPSS | Residual | Test Statistic | {'alpha': 0.05} | 0.031457 |
24 | Stationarity | KPSS | Residual | Critical Value 10% | {'alpha': 0.05} | 0.119 |
25 | Stationarity | KPSS | Residual | Critical Value 5% | {'alpha': 0.05} | 0.146 |
26 | Stationarity | KPSS | Residual | Critical Value 2.5% | {'alpha': 0.05} | 0.176 |
27 | Stationarity | KPSS | Residual | Critical Value 1% | {'alpha': 0.05} | 0.216 |
28 | Normality | Shapiro | Residual | Normality | {'alpha': 0.05} | False |
29 | Normality | Shapiro | Residual | p-value | {'alpha': 0.05} | 0.00003 |
Observations
The residuals appear to be white noise (the Ljung-Box p-values above are well over 0.05). This indicates that we have done a good job of extracting most of the signal from the time series data.
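check_stats can also be run without an estimator, in which case the same battery of tests is applied to the original data rather than the residuals, and it accepts arguments to narrow the output. A small sketch, assuming the test and alpha arguments documented for the pycaret time series module:

# Run the statistical tests on the original data (no estimator passed)
exp.check_stats()

# Residual tests restricted to the white noise checks, at a stricter significance level
# (test and alpha arguments as documented for the pycaret time series module)
exp.check_stats(model, test="white_noise", alpha=0.01)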
Next, we can plot the diagnostics on the residuals, just like we did on the original dataset.
exp.plot_model(model, plot='diagnostics', fig_kwargs={"height": 800, "width": 1000})
Observations
NOTE: These plots can be obtained individually as well, if needed, using the following calls:
exp.plot_model(model, plot='residuals')
exp.plot_model(model, plot='acf')
exp.plot_model(model, plot='pacf')
exp.plot_model(model, plot='periodogram')
Another useful plot is the insample plot, which shows the model fit to the actual data.
exp.plot_model(model, plot='insample')
We could also check the decomposition of the residuals to see if any trend or seasonal structure remains that the model has not captured.
exp.plot_model(model, plot="decomp")
exp.plot_model(model, plot="decomp_stl")
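So far the ARIMA model has only been trained on the train split. To produce genuine future forecasts beyond the end of the data, the model can be finalized (i.e. refit on the entire dataset) and predict_model called again. A minimal sketch:

# Refit the model on the full dataset (train + test)
final_model = exp.finalize_model(model)

# With a finalized model, predict_model forecasts the 12 months beyond the end of the data
future_preds = exp.predict_model(final_model)
print(future_preds)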