pmdarima for auto ARIMA
Playing around with the pmdarima library for basic ARIMA and ARIMAX models in Python, a nice alternative to statsmodels. Unfortunately, time series modeling in Python in general seems very scattered compared to the experience in R.
The toy datasets come from pmdarima: https://alkaline-ml.com/pmdarima/modules/datasets.html
import numpy as np
import pandas as pd
import pmdarima as pm
from pmdarima import model_selection
import matplotlib.pyplot as plt
import plotly.express as px
import plotly.graph_objects as go
# Load a local helper file with the data cleaning and all that
from create_dataset import load_combined_dataset
Using the sample datasets from the pmdarima library, we can create a combined dataframe of quarterly data from 1980 to 1994 covering Australian beer production (in megaliters), residents, and wine sales (in bottles).
df = load_combined_dataset()
df.head()
| | timeperiod | Wine | Residents | Beer |
|---|---|---|---|---|
| 0 | 19801 | 51885.0 | 14515.7 | 513.0 |
| 1 | 19802 | 54954.0 | 14554.9 | 427.0 |
| 2 | 19803 | 67765.0 | 14602.5 | 473.0 |
| 3 | 19804 | 79117.0 | 14646.4 | 526.0 |
| 4 | 19811 | 53013.0 | 14695.4 | 548.0 |
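The helper in create_dataset is local and not shown here, so this is only a rough sketch of one step it presumably performs: the wine series in pmdarima is monthly, so it has to be rolled up into quarters to line up with the quarterly beer and resident series. The function name monthly_to_quarterly and its parameters are made up for illustration. The twelve monthly figures are the 1980 wine values as I recall them from the wineind dataset, so treat them as an assumption; their quarterly sums do match the Wine column above.

```python
import pandas as pd


def monthly_to_quarterly(values, start="1980-01"):
    """Sum a monthly series into calendar quarters.

    `values` is a plain sequence of monthly totals and `start` is the
    first month. Both names are illustrative, not part of pmdarima.
    """
    months = pd.period_range(start=start, periods=len(values), freq="M")
    monthly = pd.Series(values, index=months, dtype="float64")
    # Group on (year, quarter) so each bucket is one calendar quarter
    quarterly = monthly.groupby([months.year, months.quarter]).sum()
    quarterly.index.names = ["year", "quarter"]
    return quarterly


# Assumed 1980 monthly wine sales (bottles), Jan through Dec
wine_1980 = [15136, 16733, 20016, 17708, 18019, 19227,
             22893, 23739, 21133, 22591, 26786, 29740]

quarterly_wine = monthly_to_quarterly(wine_1980)
print(quarterly_wine)
```

A real version would also need to drop any incomplete final quarter before summing, otherwise the last row would understate sales.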
fig = go.Figure()
fig.add_trace(go.Scatter(x=df.index, y=df.Wine, name="Wine Sales (bottles)"))
fig.add_trace(go.Scatter(x=df.index, y=df.Residents, name="Residents"))
fig.update_layout(title="Wine Sales and Residents Over Time",
                  xaxis_title="Quarters since Q1 1980", yaxis_title="Quarterly Total")
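The plot above uses the integer row index as the x-axis. The timeperiod column looks like it encodes the year followed by the quarter (19801 for Q1 1980), so a small decoder can turn it into readable labels instead. This is a minimal sketch under that assumption; decode_timeperiod and quarter_label are hypothetical helpers, not something from pmdarima or the local helper file.

```python
def decode_timeperiod(code):
    """Split a code such as 19801 into (year, quarter).

    Assumes the last digit is the quarter and the rest is the year.
    """
    year, quarter = divmod(int(code), 10)
    if quarter not in (1, 2, 3, 4):
        raise ValueError(f"unexpected quarter in timeperiod {code!r}")
    return year, quarter


def quarter_label(code):
    """Format a timeperiod code as a label like '1980 Q1'."""
    year, quarter = decode_timeperiod(code)
    return f"{year} Q{quarter}"


# The first five timeperiod values from df.head()
codes = [19801, 19802, 19803, 19804, 19811]
labels = [quarter_label(c) for c in codes]
print(labels)
```

Passing something like df.timeperiod.map(quarter_label) as x in go.Scatter should then label the axis by quarter rather than by row number.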