In the 10x series of notebooks, we will look at Time Series modeling in pycaret using univariate data and no exogenous variables. We will use the famous airline dataset for illustration. Our plan of action is as follows:
# Only enable critical logging (Optional)
import os
os.environ["PYCARET_CUSTOM_LOGGING_LEVEL"] = "CRITICAL"
def what_is_installed():
    from pycaret import show_versions
    show_versions()

try:
    what_is_installed()
except ModuleNotFoundError:
    !pip install pycaret
    what_is_installed()
System:
  python: 3.8.13 (default, Mar 28 2022, 06:59:08) [MSC v.1916 64 bit (AMD64)]
  executable: C:\Users\Nikhil\.conda\envs\pycaret_dev_sktime_0p11_2\python.exe
  machine: Windows-10-10.0.19044-SP0

PyCaret required dependencies:
  pip: 21.2.2
  setuptools: 61.2.0
  pycaret: 3.0.0
  ipython: Not installed
  ipywidgets: 7.7.0
  numpy: 1.21.6
  pandas: 1.4.2
  jinja2: 3.1.2
  scipy: 1.8.0
  joblib: 1.1.0
  sklearn: 1.0.2
  pyod: Installed but version unavailable
  imblearn: 0.9.0
  category_encoders: 2.4.1
  lightgbm: 3.3.2
  numba: 0.55.1
  requests: 2.27.1
  matplotlib: 3.5.2
  scikitplot: 0.3.7
  yellowbrick: 1.4
  plotly: 5.8.0
  kaleido: 0.2.1
  statsmodels: 0.13.2
  sktime: 0.11.4
  tbats: Installed but version unavailable
  pmdarima: 1.8.5

PyCaret optional dependencies:
  shap: Not installed
  interpret: Not installed
  umap: Not installed
  pandas_profiling: Not installed
  explainerdashboard: Not installed
  autoviz: Not installed
  fairlearn: Not installed
  xgboost: Not installed
  catboost: Not installed
  kmodes: Not installed
  mlxtend: Not installed
  statsforecast: 0.5.5
  tune_sklearn: Not installed
  ray: Not installed
  hyperopt: Not installed
  optuna: Not installed
  skopt: Not installed
  mlflow: 1.25.1
  gradio: Not installed
  fastapi: Not installed
  uvicorn: Not installed
  m2cgen: Not installed
  evidently: Not installed
  nltk: Not installed
  pyLDAvis: Not installed
  gensim: Not installed
  spacy: Not installed
  wordcloud: Not installed
  textblob: Not installed
  psutil: 5.9.0
  fugue: Not installed
  streamlit: Not installed
  prophet: Not installed
import time
import numpy as np
import pandas as pd
from pycaret.datasets import get_data
from pycaret.time_series import TSForecastingExperiment
y = get_data('airline', verbose=False)
# We want to forecast the next 12 months of data, and we will use 3-fold cross-validation to test the models.
fh = 12 # or alternatively fh = np.arange(1,13)
fold = 3
# Global Plot Settings
fig_kwargs={'renderer': 'notebook'}
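The two ways of writing the forecast horizon mentioned in the comment above describe the same 12 steps ahead. The following is a minimal sketch in plain Python (no PyCaret calls) of that equivalence and of the hold-out arithmetic for the 144-point airline series; the variable names `fh_int`, `fh_steps`, `n_obs`, `n_train`, and `n_test` are illustrative only.

```python
# Forecast horizon as a single integer: "predict the next 12 periods".
fh_int = 12

# The same horizon written as explicit steps ahead: 1, 2, ..., 12.
# np.arange(1, 13) yields these same values as a NumPy array.
fh_steps = list(range(1, fh_int + 1))

# The airline series has 144 monthly observations. Holding out the
# last 12 points as a test set leaves 132 points for training.
n_obs = 144
n_test = len(fh_steps)
n_train = n_obs - n_test

print(fh_steps)
print(n_train, n_test)
```

The integer form is shorter; the explicit form makes it obvious which steps ahead are being forecast.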
The `pycaret` Time Series Forecasting module provides a convenient interface for performing exploratory analysis using `plot_model`.

NOTE:
- When `plot_model` is called without an estimator (model), it plots the original dataset. We will cover this in the current notebook.
- When an estimator (model) is passed to `plot_model`, the plots are made using the model data (e.g. future forecasts, or analysis of in-sample residuals). We will cover this in a subsequent notebook.

Let's see how this works next.
First, we will plot the original dataset.
eda = TSForecastingExperiment()
eda.setup(data=y, fh=fh, fig_kwargs=fig_kwargs)
 | Description | Value |
---|---|---|
0 | session_id | 7961 |
1 | Target | Number of airline passengers |
2 | Approach | Univariate |
3 | Exogenous Variables | Not Present |
4 | Original data shape | (144, 1) |
5 | Transformed data shape | (144, 1) |
6 | Transformed train set shape | (132, 1) |
7 | Transformed test set shape | (12, 1) |
8 | Rows with missing values | 0.0% |
9 | Fold Generator | ExpandingWindowSplitter |
10 | Fold Number | 3 |
11 | Enforce Prediction Interval | False |
12 | Seasonal Period(s) Tested | 12 |
13 | Seasonality Present | True |
14 | Seasonalities Detected | [12] |
15 | Primary Seasonality | 12 |
16 | Target Strictly Positive | True |
17 | Target White Noise | No |
18 | Recommended d | 1 |
19 | Recommended Seasonal D | 1 |
20 | Preprocess | False |
21 | CPU Jobs | -1 |
22 | Use GPU | False |
23 | Log Experiment | False |
24 | Experiment Name | ts-default-name |
25 | USI | 8a03 |
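The summary reports an expanding-window fold generator with 3 folds over the 132-point training set. As a rough illustration of what "expanding window" means, here is a plain-Python sketch. It assumes each validation window has length `fh`, the windows step forward by `fh`, and the last window ends at the final training point; the helper `expanding_window_splits` is hypothetical, and the exact windows PyCaret produces may differ in detail.

```python
def expanding_window_splits(n_train, fh, fold):
    """Yield (train_indices, validation_indices) as ranges, one pair per fold.

    Assumption: validation windows are consecutive blocks of length `fh`
    ending at the last training point, and the training window grows by
    `fh` points from one fold to the next.
    """
    initial_window = n_train - fold * fh
    if initial_window <= 0:
        raise ValueError("Not enough training data for the requested folds.")
    for k in range(fold):
        train_end = initial_window + k * fh
        yield range(0, train_end), range(train_end, train_end + fh)


splits = list(expanding_window_splits(n_train=132, fh=12, fold=3))
for train_idx, val_idx in splits:
    # Training size, then first and last validation index for this fold.
    print(len(train_idx), val_idx.start, val_idx.stop - 1)
```

Under these assumptions the training window grows from 96 to 108 to 120 points, and every validation window covers 12 points that the corresponding training window has not seen.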
<pycaret.time_series.forecasting.oop.TSForecastingExperiment at 0x170810b23d0>
# NOTE: This is the same as `eda.plot_model(plot="ts")`
eda.plot_model()
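The same call pattern extends to other exploratory plots by changing the `plot` argument. The sketch below assumes an already set-up experiment such as `eda`; the helper `show_eda_plots` is hypothetical, and the plot names `"acf"`, `"pacf"`, and `"decomp"` are assumed to be supported by the installed PyCaret version.

```python
def show_eda_plots(exp, plots=("ts", "acf", "pacf", "decomp")):
    """Call exp.plot_model once per plot name, without passing an estimator.

    `exp` is expected to be a set-up experiment object exposing
    plot_model(plot=...). Returns the list of plot names requested.
    """
    plotted = []
    for name in plots:
        # No estimator is passed, so the plot is based on the original data.
        exp.plot_model(plot=name)
        plotted.append(name)
    return plotted


# Usage, assuming `eda` is the experiment created above:
# show_eda_plots(eda)
```

Passing the experiment in as an argument keeps the helper independent of how the experiment was configured.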