Overview¶

In the 10x series of notebooks, we will look at Time Series modeling in pycaret using univariate data and no exogenous variables. We will use the famous airline dataset for illustration. Our plan of action is as follows:

Perform EDA on the dataset to extract valuable insight about the process generating the time series. (COMPLETED)
Model the dataset based on exploratory analysis (univariable model without exogenous variables). (COMPLETED)
Use an automated approach (AutoML) to improve the performance. (Covered in this notebook)

In [1]:

# Only enable critical logging (Optional)
import os
os.environ["PYCARET_CUSTOM_LOGGING_LEVEL"] = "CRITICAL"

In [2]:

def what_is_installed():
    from pycaret import show_versions
    show_versions()

try:
    what_is_installed()
except ModuleNotFoundError:
    !pip install pycaret
    what_is_installed()

System:
    python: 3.9.16 (main, Jan 11 2023, 16:16:36) [MSC v.1916 64 bit (AMD64)]
executable: C:\Users\Nikhil\.conda\envs\pycaret_dev_sktime_16p1\python.exe
   machine: Windows-10-10.0.19044-SP0

PyCaret required dependencies:
                 pip: 22.3.1
          setuptools: 65.6.3
             pycaret: 3.0.0
             IPython: 8.10.0
          ipywidgets: 8.0.4
                tqdm: 4.64.1
               numpy: 1.23.5
              pandas: 1.5.3
              jinja2: 3.1.2
               scipy: 1.10.0
              joblib: 1.2.0
             sklearn: 1.2.1
                pyod: 1.0.8
            imblearn: 0.10.1
   category_encoders: 2.6.0
            lightgbm: 3.3.5
               numba: 0.56.4
            requests: 2.28.2
          matplotlib: 3.7.0
          scikitplot: 0.3.7
         yellowbrick: 1.5
              plotly: 5.13.0
             kaleido: 0.2.1
         statsmodels: 0.13.5
              sktime: 0.16.1
               tbats: 1.1.2
            pmdarima: 2.0.2
              psutil: 5.9.4

PyCaret optional dependencies:
                shap: 0.41.0
           interpret: Not installed
                umap: Not installed
    pandas_profiling: Not installed
  explainerdashboard: Not installed
             autoviz: Not installed
           fairlearn: Not installed
             xgboost: Not installed
            catboost: Not installed
              kmodes: Not installed
             mlxtend: Not installed
       statsforecast: Not installed
        tune_sklearn: Not installed
                 ray: Not installed
            hyperopt: Not installed
              optuna: Not installed
               skopt: Not installed
              mlflow: 2.1.1
              gradio: Not installed
             fastapi: Not installed
             uvicorn: Not installed
              m2cgen: Not installed
           evidently: Not installed
               fugue: 0.8.0
           streamlit: Not installed
             prophet: 1.1.2

In [3]:

import time
import numpy as np
import pandas as pd

from pycaret.datasets import get_data
from pycaret.time_series import TSForecastingExperiment

In [4]:

y = get_data('airline', verbose=False)

In [5]:

# We want to forecast the next 12 months of data and we will use 3 fold cross-validation to test the models.
fh = 12 # or alternately fh = np.arange(1,13)
fold = 3

In [6]:

# Global Figure Settings for notebook ----
# Depending on whether you are using jupyter notebook, jupyter lab, Google Colab, you may have to set the renderer appropriately
# NOTE: Setting to a static renderer here so that the notebook saved size is reduced.
fig_kwargs = {
    # "renderer": "notebook",
    "renderer": "png",
    "width": 1000,
    "height": 600,
}

Auto Create¶

We have so many models to choose from. How do we know which ones perform the best. Let's see how we can do with pycaret.

In [7]:

exp = TSForecastingExperiment()
exp.setup(data=y, fh=fh, fold=fold, fig_kwargs=fig_kwargs, session_id=42)

	Description	Value
0	session_id	42
1	Target	Number of airline passengers
2	Approach	Univariate
3	Exogenous Variables	Not Present
4	Original data shape	(144, 1)
5	Transformed data shape	(144, 1)
6	Transformed train set shape	(132, 1)
7	Transformed test set shape	(12, 1)
8	Rows with missing values	0.0%
9	Fold Generator	ExpandingWindowSplitter
10	Fold Number	3
11	Enforce Prediction Interval	False
12	Splits used for hyperparameters	all
13	User Defined Seasonal Period(s)	None
14	Ignore Seasonality Test	False
15	Seasonality Detection Algo	auto
16	Max Period to Consider	60
17	Seasonal Period(s) Tested	[12, 24, 36, 11, 48]
18	Significant Seasonal Period(s)	[12, 24, 36, 11, 48]
19	Significant Seasonal Period(s) without Harmonics	[48, 36, 11]
20	Remove Harmonics	False
21	Harmonics Order Method	harmonic_max
22	Num Seasonalities to Use	1
23	All Seasonalities to Use	[12]
24	Primary Seasonality	12
25	Seasonality Present	True
26	Seasonality Type	mul
27	Target Strictly Positive	True
28	Target White Noise	No
29	Recommended d	1
30	Recommended Seasonal D	1
31	Preprocess	False
32	CPU Jobs	-1
33	Use GPU	False
34	Log Experiment	False
35	Experiment Name	ts-default-name
36	USI	8a65

Out[7]:

<pycaret.time_series.forecasting.oop.TSForecastingExperiment at 0x27f75438310>

Compare Models¶

In [8]:

# Get the 3 best baseline models 
best_baseline_models = exp.compare_models(n_select=3)
best_baseline_models

	Model	MASE	RMSSE	MAE	RMSE	MAPE	SMAPE	R2	TT (Sec)
exp_smooth	Exponential Smoothing	0.5852	0.6105	17.1926	20.1633	0.0435	0.0439	0.8918	0.1100
ets	ETS	0.5931	0.6212	17.4165	20.5103	0.0440	0.0445	0.8882	0.1533
et_cds_dt	Extra Trees w/ Cond. Deseasonalize & Detrending	0.6666	0.7255	19.6620	24.0121	0.0490	0.0489	0.8465	0.5033
huber_cds_dt	Huber w/ Cond. Deseasonalize & Detrending	0.6813	0.7866	20.0334	25.9670	0.0491	0.0499	0.8113	0.4100
arima	ARIMA	0.6830	0.6735	20.0069	22.2199	0.0501	0.0507	0.8677	0.0767
lr_cds_dt	Linear w/ Cond. Deseasonalize & Detrending	0.7004	0.7702	20.6084	25.4401	0.0509	0.0514	0.8215	0.6733
ridge_cds_dt	Ridge w/ Cond. Deseasonalize & Detrending	0.7004	0.7703	20.6086	25.4405	0.0509	0.0514	0.8215	0.6367
en_cds_dt	Elastic Net w/ Cond. Deseasonalize & Detrending	0.7029	0.7732	20.6816	25.5362	0.0511	0.0516	0.8201	0.6833
lasso_cds_dt	Lasso w/ Cond. Deseasonalize & Detrending	0.7048	0.7751	20.7373	25.6005	0.0512	0.0517	0.8193	0.4000
llar_cds_dt	Lasso Least Angular Regressor w/ Cond. Deseasonalize & Detrending	0.7048	0.7751	20.7366	25.6009	0.0512	0.0517	0.8192	0.3867
br_cds_dt	Bayesian Ridge w/ Cond. Deseasonalize & Detrending	0.7112	0.7837	20.9213	25.8795	0.0515	0.0521	0.8144	0.3367
stlf	STLF	0.7133	0.7059	21.0007	23.3522	0.0530	0.0531	0.8527	0.0367
knn_cds_dt	K Neighbors w/ Cond. Deseasonalize & Detrending	0.7162	0.8157	21.1613	26.9700	0.0521	0.0529	0.7811	0.2867
auto_arima	Auto ARIMA	0.7181	0.7114	21.0297	23.4661	0.0525	0.0531	0.8509	2.6633
rf_cds_dt	Random Forest w/ Cond. Deseasonalize & Detrending	0.7848	0.8887	23.0593	29.3081	0.0553	0.0565	0.7655	0.4600
gbr_cds_dt	Gradient Boosting w/ Cond. Deseasonalize & Detrending	0.7892	0.9185	23.2026	30.3044	0.0563	0.0571	0.7515	0.3000
dt_cds_dt	Decision Tree w/ Cond. Deseasonalize & Detrending	0.7930	0.8759	23.3852	29.0221	0.0566	0.0571	0.7698	0.2467
ada_cds_dt	AdaBoost w/ Cond. Deseasonalize & Detrending	0.8058	0.9723	23.7210	32.0863	0.0565	0.0582	0.7012	0.3300
lightgbm_cds_dt	Light Gradient Boosting w/ Cond. Deseasonalize & Detrending	0.8156	0.9117	24.0002	30.0956	0.0575	0.0587	0.7561	0.4233
theta	Theta Forecaster	0.9729	1.0306	28.3192	33.8639	0.0670	0.0700	0.6710	0.0333
omp_cds_dt	Orthogonal Matching Pursuit w/ Cond. Deseasonalize & Detrending	1.0090	1.2370	29.6294	40.8121	0.0685	0.0718	0.5462	0.3500
snaive	Seasonal Naive Forecaster	1.1479	1.0945	33.3611	35.9139	0.0832	0.0879	0.6072	1.3700
polytrend	Polynomial Trend Forecaster	1.6523	1.9202	48.6301	63.4299	0.1170	0.1216	-0.0784	0.0167
croston	Croston	1.9311	2.3517	56.6180	77.5856	0.1295	0.1439	-0.6281	0.0200
naive	Naive Forecaster	2.3599	2.7612	69.0278	91.0322	0.1569	0.1792	-1.2216	2.4633
grand_means	Grand Means Forecaster	5.5306	5.2596	162.4117	173.6492	0.4000	0.5075	-7.0462	1.5933

Out[8]:

[ExponentialSmoothing(seasonal='mul', sp=12, trend='add'),
 AutoETS(seasonal='mul', sp=12, trend='add'),
 BaseCdsDtForecaster(fe_target_rr=[WindowSummarizer(lag_feature={'lag': [12, 11,
                                                                         10, 9,
                                                                         8, 7, 6,
                                                                         5, 4, 3,
                                                                         2, 1]},
                                                    n_jobs=1)],
                     regressor=ExtraTreesRegressor(n_jobs=-1, random_state=42),
                     sp=12, window_length=12)]

In [9]:

# We will save the metrics to be used in a later step.
compare_metrics = exp.pull()
# compare_metrics

Note that some models like BATS and TBATS are disabled by default.
You can enable them by setting turbo = False

best_baseline_models = exp.compare_models(n_select=3, turbo=False)

Tune Best Models¶

In [10]:

best_tuned_models = [exp.tune_model(model) for model in best_baseline_models]
best_tuned_models

	cutoff	MASE	RMSSE	MAE	RMSE	MAPE	SMAPE	R2
0	1956-12	0.3617	0.4124	10.5620	13.4978	0.0272	0.0273	0.9407
1	1957-12	0.8588	0.8856	26.2572	30.0651	0.0738	0.0703	0.7632
2	1958-12	0.3942	0.4126	11.2644	13.4112	0.0261	0.0265	0.9598
Mean	NaT	0.5382	0.5702	16.0279	18.9914	0.0424	0.0414	0.8879
SD	NaT	0.2271	0.2230	7.2389	7.8304	0.0222	0.0205	0.0885

Fitting 3 folds for each of 10 candidates, totalling 30 fits

[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Done  30 out of  30 | elapsed:    1.7s finished

	cutoff	MASE	RMSSE	MAE	RMSE	MAPE	SMAPE	R2
0	1956-12	0.3641	0.3969	10.6335	12.9923	0.0280	0.0281	0.9451
1	1957-12	1.0557	1.0278	32.2760	34.8945	0.0900	0.0854	0.6810
2	1958-12	0.4579	0.4697	13.0839	15.2673	0.0296	0.0302	0.9479
Mean	NaT	0.6259	0.6315	18.6645	21.0513	0.0492	0.0479	0.8580
SD	NaT	0.3063	0.2818	9.6767	9.8325	0.0288	0.0265	0.1251

Fitting 3 folds for each of 10 candidates, totalling 30 fits

[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Done  30 out of  30 | elapsed:    2.3s finished

	cutoff	MASE	RMSSE	MAE	RMSE	MAPE	SMAPE	R2
0	1956-12	1.0549	1.2622	30.8068	41.3148	0.0755	0.0801	0.4444
1	1957-12	0.7417	0.8045	22.6769	27.3108	0.0589	0.0582	0.8046
2	1958-12	0.7252	0.8562	20.7230	27.8322	0.0448	0.0460	0.8267
Mean	NaT	0.8406	0.9743	24.7356	32.1526	0.0597	0.0614	0.6919
SD	NaT	0.1517	0.2047	4.3665	6.4822	0.0125	0.0141	0.1753

Fitting 3 folds for each of 10 candidates, totalling 30 fits

[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Done  30 out of  30 | elapsed:   10.7s finished

Out[10]:

[ExponentialSmoothing(seasonal='add', sp=12, trend='add', use_boxcox=True),
 AutoETS(seasonal='mul', sp=12, trend='add'),
 BaseCdsDtForecaster(fe_target_rr=[WindowSummarizer(lag_feature={'lag': [12, 11,
                                                                         10, 9,
                                                                         8, 7, 6,
                                                                         5, 4, 3,
                                                                         2, 1]},
                                                    n_jobs=1)],
                     regressor=ExtraTreesRegressor(n_jobs=-1, random_state=42),
                     sp=12, window_length=12)]

Blend Best Models¶

We can achieve even better results sometimes if we combine results of several good models. This can be achieved using the blend_model functionality. There are several options available to blend such as the mean, gmean median, min, max. In addition, weights can be applied to the forecasts from the base learners. This is useful when we want to give more importance (weight) to models with a lower error for example. Please refer to the blend_models docstring for more information about the blending functionality.

In [11]:

# help(exp.blend_models)

Let's see the voting blender in action.

In [12]:

# Get model weights to use
top_model_metrics = compare_metrics.iloc[0:3]['MAE']
display(top_model_metrics)

top_model_weights = 1 - top_model_metrics/top_model_metrics.sum()
display(top_model_weights)

exp_smooth    17.1926
ets           17.4165
et_cds_dt      19.662
Name: MAE, dtype: object

exp_smooth    0.683209
ets           0.679083
et_cds_dt     0.637708
Name: MAE, dtype: object

In [13]:

blender = exp.blend_models(best_tuned_models, method='mean', weights=top_model_weights.values.tolist())

	cutoff	MASE	RMSSE	MAE	RMSE	MAPE	SMAPE	R2
0	1956-12	0.3703	0.4756	10.8141	15.5674	0.0264	0.0269	0.9211
1	1957-12	0.7174	0.7209	21.9328	24.4724	0.0609	0.0587	0.8431
2	1958-12	0.5437	0.5535	15.5368	17.9920	0.0353	0.0362	0.9276
Mean	NaT	0.5438	0.5833	16.0946	19.3439	0.0409	0.0406	0.8973
SD	NaT	0.1417	0.1023	4.5563	3.7591	0.0146	0.0133	0.0384

In [14]:

y_predict = exp.predict_model(blender)
print(y_predict)
exp.plot_model(estimator=blender)

	Model	MASE	RMSSE	MAE	RMSE	MAPE	SMAPE	R2
0	EnsembleForecaster	0.3310	0.3904	10.0804	13.4888	0.0219	0.0217	0.9672

           y_pred
1960-01  411.8049
1960-02  392.7194
1960-03  453.3590
1960-04  443.1362
1960-05  464.3702
1960-06  531.1468
1960-07  606.6252
1960-08  614.9484
1960-09  508.5177
1960-10  451.2852
1960-11  402.6351
1960-12  435.1540

Finalize Model¶

In [15]:

final_model = exp.finalize_model(blender)
print(exp.predict_model(final_model))
exp.plot_model(final_model)

           y_pred
1961-01  443.5568
1961-02  420.8623
1961-03  468.9008
1961-04  491.6614
1961-05  507.2296
1961-06  573.9883
1961-07  658.7005
1961-08  653.2689
1961-09  549.4467
1961-10  496.5341
1961-11  427.8801
1961-12  467.9969

Save and Load Model¶

In [16]:

_ = exp.save_model(final_model, "my_blender")

Transformation Pipeline and Model Successfully Saved

In [17]:

loaded_exp = TSForecastingExperiment()
m = loaded_exp.load_model("my_blender")
# Predictions should be same as before the model was saved and loaded
loaded_exp.predict_model(m)

Transformation Pipeline and Model Successfully Loaded

Out[17]:

	y_pred
1961-01	443.5568
1961-02	420.8623
1961-03	468.9008
1961-04	491.6614
1961-05	507.2296
1961-06	573.9883
1961-07	658.7005
1961-08	653.2689
1961-09	549.4467
1961-10	496.5341
1961-11	427.8801
1961-12	467.9969

That's it for this notebook. Users can hopefully see how easy it was to use the automated approach to model this data and we were able to achieve reasonale results on par with (or even better than) the manual approach.