MiniRocket [1] transforms input time series using a small, fixed set of convolutional kernels. It then applies PPV (proportion of positive values) pooling to compute a single feature for each of the resulting feature maps. The transformed features are used to train a linear classifier.
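To make the pooling step concrete, here is a minimal NumPy sketch of PPV applied to a single feature map. The toy kernel, the example series, and the helper name `ppv` are illustrative assumptions and are not taken from the aeon implementation.

```python
import numpy as np


def ppv(feature_map, bias=0.0):
    """Return the proportion of values in feature_map that exceed bias."""
    return float(np.mean(np.asarray(feature_map, dtype=float) > bias))


# Toy series and toy kernel (both hypothetical, for illustration only).
series = np.array([0.0, 1.0, 3.0, 2.0, -1.0, -2.0, 0.5, 4.0, 1.0, -3.0])
kernel = np.array([-1.0, 2.0, -1.0])

# Slide the kernel over the series to obtain one feature map.
feature_map = np.correlate(series, kernel, mode="valid")

# One feature map is pooled into one scalar feature in [0, 1].
feature = ppv(feature_map)
```

Each kernel (combined with a dilation and a bias) yields one such scalar, which is why the transform returns a fixed-width feature vector regardless of series length.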
Import example data, MiniRocket, MiniRocketClassifier, MiniRocketRegressor, RidgeClassifierCV (scikit-learn), and numpy.
You can use the MiniRocket transform directly, in a pipeline, or through the built-in MiniRocketClassifier or MiniRocketRegressor.
Note: MiniRocket is compiled by numba on import. The compiled functions are cached, so this should only happen once (i.e., the first time you import MiniRocket).
# !pip install --upgrade numba
import numpy as np
from sklearn.linear_model import RidgeClassifierCV
from sklearn.preprocessing import StandardScaler
from aeon.classification.convolution_based import MiniRocketClassifier
from aeon.datasets import load_arrow_head # univariate dataset
from aeon.datasets import load_basic_motions # multivariate dataset
from aeon.regression.convolution_based import MiniRocketRegressor
from aeon.transformations.collection.convolution_based import MiniRocket
X_train, y_train = load_arrow_head(split="train")
minirocket = MiniRocket() # by default, MiniRocket uses ~10_000 kernels
minirocket.fit(X_train)
X_train_transform = minirocket.transform(X_train)
# test shape of transformed training data -> (n_cases, 9_996)
X_train_transform.shape
(36, 9996)
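The width of 9,996 rather than 10,000 can be accounted for with a little arithmetic. As described in [1], MiniRocket uses 84 fixed kernels of length 9, and the requested feature count is rounded down to a multiple of 84. The sketch below assumes that rounding rule; the variable names are illustrative and not part of the aeon API.

```python
from math import comb

# Per [1]: kernels have length 9, and 3 of the 9 weights take a distinct value,
# which gives C(9, 3) possible kernels.
n_fixed_kernels = comb(9, 3)

# Default target of roughly 10,000 features.
requested_features = 10_000

# Round down to the nearest multiple of the number of fixed kernels.
features_per_kernel = requested_features // n_fixed_kernels
n_features = n_fixed_kernels * features_per_kernel
```

Here `n_features` matches the second dimension of the transformed array shown above.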
We suggest using RidgeClassifierCV (scikit-learn) for smaller datasets (fewer than ~10,000 training examples), and logistic regression trained using stochastic gradient descent for larger datasets.
Note: For larger datasets, this means integrating MiniRocket with stochastic gradient descent so that the transform is performed per minibatch, not simply replacing RidgeClassifierCV with, e.g., LogisticRegression.
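The following is a minimal sketch of what "transform per minibatch" can look like, written in plain NumPy so that it is self-contained. The function `toy_transform` is a hypothetical stand-in for a fitted MiniRocket transform, and `fit_logistic_sgd` is an illustrative helper, not an aeon or scikit-learn function. The point is only that the transform is called inside the minibatch loop, so the full transformed dataset is never held in memory.

```python
import numpy as np


def toy_transform(X):
    """Hypothetical stand-in for a fitted transform.

    Maps X of shape (n_cases, n_timepoints) to two features per case:
    the mean and the proportion of positive values.
    """
    return np.column_stack([X.mean(axis=1), (X > 0).mean(axis=1)])


def fit_logistic_sgd(X, y, transform, n_epochs=20, batch_size=8, lr=0.5, seed=0):
    """Train binary logistic regression with SGD, transforming each minibatch."""
    rng = np.random.default_rng(seed)
    weights, bias = None, 0.0
    for _ in range(n_epochs):
        order = rng.permutation(len(X))
        for start in range(0, len(X), batch_size):
            idx = order[start:start + batch_size]
            features = transform(X[idx])  # transform only this minibatch
            if weights is None:
                weights = np.zeros(features.shape[1])
            probs = 1.0 / (1.0 + np.exp(-(features @ weights + bias)))
            error = probs - y[idx]
            weights -= lr * features.T @ error / len(idx)
            bias -= lr * error.mean()
    return weights, bias


# Toy two-class data: series centred on -1 (class 0) or +1 (class 1).
rng = np.random.default_rng(42)
X_toy = np.vstack([
    -1.0 + 0.1 * rng.standard_normal((32, 20)),
    1.0 + 0.1 * rng.standard_normal((32, 20)),
])
y_toy = np.concatenate([np.zeros(32), np.ones(32)])

weights, bias = fit_logistic_sgd(X_toy, y_toy, toy_transform)
predictions = (toy_transform(X_toy) @ weights + bias > 0).astype(float)
accuracy = float((predictions == y_toy).mean())
```

In practice the stand-in would be replaced by the `transform` method of a fitted MiniRocket instance, and the hand-written update by an optimiser of your choice.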
Note: While the input time series to MiniRocket are left unscaled, the output features may need to be scaled for the downstream model. For example, for RidgeClassifierCV we scale the features using scikit-learn's StandardScaler.
scaler = StandardScaler(with_mean=False)
classifier = RidgeClassifierCV(alphas=np.logspace(-3, 3, 10))
X_train_scaled_transform = scaler.fit_transform(X_train_transform)
classifier.fit(X_train_scaled_transform, y_train)
RidgeClassifierCV(alphas=array([1.00000000e-03, 4.64158883e-03, 2.15443469e-02, 1.00000000e-01, 4.64158883e-01, 2.15443469e+00, 1.00000000e+01, 4.64158883e+01, 2.15443469e+02, 1.00000000e+03]))
Or just use the provided built-in MiniRocketClassifier, which contains the scaler and classifier.
mr = MiniRocketClassifier()
mr.fit(X_train, y_train)
MiniRocketClassifier()
X_test, y_test = load_arrow_head(split="test")
X_test_transform = minirocket.transform(X_test)
X_test_scaled_transform = scaler.transform(X_test_transform)
# score of the manual transform + StandardScaler + RidgeClassifierCV
print(" Score =", classifier.score(X_test_scaled_transform, y_test))
# score of the built-in MiniRocketClassifier
print(" Score = ", mr.score(X_test, y_test))
Score = 0.8514285714285714
Score = 0.8685714285714285
We can use MiniRocket with multivariate time series.
Note: Input time series must be at least of length 9. Pad shorter time series using, e.g., Padder (aeon.transformations.collection).
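As an illustration of the padding requirement, here is a small NumPy sketch that right-pads a single series along the time axis until it reaches length 9. This is a hand-written stand-in and not the Padder implementation; the helper name `pad_to_min_length` and the choice of zero as fill value are assumptions made for the example.

```python
import numpy as np

MIN_LENGTH = 9  # minimum series length stated in the note above


def pad_to_min_length(series, min_length=MIN_LENGTH, fill_value=0.0):
    """Right-pad a (n_channels, n_timepoints) array if it is too short."""
    series = np.asarray(series, dtype=float)
    shortfall = max(0, min_length - series.shape[-1])
    # Pad only the time axis, and only at the end.
    return np.pad(series, ((0, 0), (0, shortfall)), constant_values=fill_value)


short_series = np.arange(10, dtype=float).reshape(2, 5)  # 2 channels, 5 time points
padded_series = pad_to_min_length(short_series)          # 2 channels, 9 time points
```

Series that already meet the minimum length are returned unchanged, since the computed shortfall is zero.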
X_train, y_train = load_basic_motions(split="train")
mr = MiniRocket()
mr.fit(X_train)
X_train_transform = mr.transform(X_train)
scaler = StandardScaler(with_mean=False)
X_train_scaled_transform = scaler.fit_transform(X_train_transform)
classifier = RidgeClassifierCV(alphas=np.logspace(-3, 3, 10))
classifier.fit(X_train_scaled_transform, y_train)
RidgeClassifierCV(alphas=array([1.00000000e-03, 4.64158883e-03, 2.15443469e-02, 1.00000000e-01, 4.64158883e-01, 2.15443469e+00, 1.00000000e+01, 4.64158883e+01, 2.15443469e+02, 1.00000000e+03]))
X_test, y_test = load_basic_motions(split="test")
X_test_transform = mr.transform(X_test)
X_test_scaled_transform = scaler.transform(X_test_transform)
classifier.score(X_test_scaled_transform, y_test)
1.0
from sklearn.linear_model import RidgeClassifierCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
minirocket_pipeline = make_pipeline(
    MiniRocket(),
    StandardScaler(with_mean=False),
    RidgeClassifierCV(alphas=np.logspace(-3, 3, 10)),
)
Or just use the provided built-in MiniRocketClassifier.
X_train, y_train = load_arrow_head(split="train")
# it is necessary to pass y_train to the pipeline
# y_train is not used for the transform, but it is used by the classifier
minirocket_pipeline.fit(X_train, y_train)
Pipeline(steps=[('minirocket', MiniRocket()), ('standardscaler', StandardScaler(with_mean=False)), ('ridgeclassifiercv', RidgeClassifierCV(alphas=array([1.00000000e-03, 4.64158883e-03, 2.15443469e-02, 1.00000000e-01, 4.64158883e-01, 2.15443469e+00, 1.00000000e+01, 4.64158883e+01, 2.15443469e+02, 1.00000000e+03])))])
X_test, y_test = load_arrow_head(split="test")
minirocket_pipeline.score(X_test, y_test)
minirocket_pipeline.fit(X_train, y_train)
pred = minirocket_pipeline.predict(X_test)
You can also use MiniRocket for time series regression.
from aeon.datasets import load_covid_3month
X_train, y_train = load_covid_3month(split="train")
X_test, y_test = load_covid_3month(split="test")
mr = MiniRocketRegressor()
mr.fit(X_train, y_train)
mr.score(X_test, y_test)
0.1619927701771796
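To help interpret the number above: assuming `score` reports the coefficient of determination (R²), as scikit-learn regressors do by default, it can be computed by hand as in the sketch below. That assumption should be checked against the aeon documentation for your version; the helper name `r2_score_sketch` is illustrative.

```python
import numpy as np


def r2_score_sketch(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot


y_true = np.array([1.0, 2.0, 3.0, 4.0])
perfect = r2_score_sketch(y_true, y_true)             # exact predictions
baseline = r2_score_sketch(y_true, np.full(4, 2.5))   # always predict the mean
```

Under this definition, exact predictions score 1 and always predicting the mean of the targets scores 0, so a value between the two indicates a model that does somewhat better than the mean baseline.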
[1] Angus Dempster, Daniel F. Schmidt, Geoffrey I. Webb. MiniRocket: A Very Fast (Almost) Deterministic Transform for Time Series Classification. arXiv:2012.08791