KDD2024 Tutorial / A Hands-On Introduction to Time Series Classification and Regression
To finish off, we build all of these classifiers and regressors on our example EEG data and compare accuracy. We do not go into depth on relative performance, because these are very small toy datasets. We start by listing classifiers and regressors by their capabilities.
!pip install aeon==0.11.0
!mkdir -p data
!wget -nc https://raw.githubusercontent.com/aeon-tutorials/KDD-2024/main/Notebooks/data/KDD_MTSC_TRAIN.ts -P data/
!wget -nc https://raw.githubusercontent.com/aeon-tutorials/KDD-2024/main/Notebooks/data/KDD_MTSC_TEST.ts -P data/
!wget -nc https://raw.githubusercontent.com/aeon-tutorials/KDD-2024/main/Notebooks/data/KDD_UTSC_TRAIN.ts -P data/
!wget -nc https://raw.githubusercontent.com/aeon-tutorials/KDD-2024/main/Notebooks/data/KDD_UTSC_TEST.ts -P data/
!wget -nc https://raw.githubusercontent.com/aeon-tutorials/KDD-2024/main/Notebooks/data/KDD_MTSER_TRAIN.ts -P data/
!wget -nc https://raw.githubusercontent.com/aeon-tutorials/KDD-2024/main/Notebooks/data/KDD_MTSER_TEST.ts -P data/
!wget -nc https://raw.githubusercontent.com/aeon-tutorials/KDD-2024/main/Notebooks/data/KDD_UTSER_TRAIN.ts -P data/
!wget -nc https://raw.githubusercontent.com/aeon-tutorials/KDD-2024/main/Notebooks/data/KDD_UTSER_TEST.ts -P data/
# There are some deprecation warnings present in the notebook; we will ignore them.
# Remove this cell if you are interested in finding out what is changing soon. For
# aeon, there will be big changes in our v1.0.0 release!
import warnings
warnings.filterwarnings("ignore")
from aeon.registry import all_estimators
uni_cls = all_estimators("classifier", filter_tags={"capability:multivariate": False},
                         as_dataframe=True)
print("Univariate series only classifiers\n", uni_cls.iloc[:, 0])
uni_reg = all_estimators("regressor", filter_tags={"capability:multivariate": False},
                         as_dataframe=True)
print("Univariate only regressors\n", uni_reg.iloc[:, 0])
multi_cls = all_estimators("classifier", filter_tags={"capability:multivariate": True},
                           as_dataframe=True)
print("Classifiers that can handle multivariate\n", multi_cls.iloc[:, 0])
multi_reg = all_estimators("regressor", filter_tags={"capability:multivariate": True},
                           as_dataframe=True)
print("Regressors that can handle multivariate\n", multi_reg.iloc[:, 0])
Univariate series only classifiers
0     BOSSEnsemble
1     ClassifierPipeline
2     ContractableBOSS
3     HIVECOTEV1
4     IndividualBOSS
5     MrSQMClassifier
6     ProximityForest
7     ProximityTree
8     RSASTClassifier
9     SASTClassifier
10    WEASEL
11    WEASEL_V2
12    WeightedEnsembleClassifier
Name: name, dtype: object
Univariate only regressors
0    RegressorPipeline
Name: name, dtype: object
Classifiers that can handle multivariate
0     Arsenal
1     CNNClassifier
2     CanonicalIntervalForestClassifier
3     Catch22Classifier
4     ChannelEnsembleClassifier
5     DrCIFClassifier
6     DummyClassifier
7     ElasticEnsemble
8     EncoderClassifier
9     FCNClassifier
10    FreshPRINCEClassifier
11    HIVECOTEV2
12    HydraClassifier
13    InceptionTimeClassifier
14    IndividualInceptionClassifier
15    IndividualLITEClassifier
16    IndividualOrdinalTDE
17    IndividualTDE
18    IntervalForestClassifier
19    KNeighborsTimeSeriesClassifier
20    LITETimeClassifier
21    LearningShapeletClassifier
22    MLPClassifier
23    MUSE
24    MultiRocketHydraClassifier
25    OrdinalTDE
26    QUANTClassifier
27    RDSTClassifier
28    REDCOMETS
29    RISTClassifier
30    RSTSF
31    RandomIntervalClassifier
32    RandomIntervalSpectralEnsembleClassifier
33    ResNetClassifier
34    RocketClassifier
35    ShapeletTransformClassifier
36    SignatureClassifier
37    SummaryClassifier
38    SupervisedIntervalClassifier
39    SupervisedTimeSeriesForest
40    TSFreshClassifier
41    TapNetClassifier
42    TemporalDictionaryEnsemble
43    TimeCNNClassifier
44    TimeSeriesForestClassifier
Name: name, dtype: object
Regressors that can handle multivariate
0     CNNRegressor
1     CanonicalIntervalForestRegressor
2     Catch22Regressor
3     DrCIFRegressor
4     DummyRegressor
5     EncoderRegressor
6     FCNRegressor
7     FreshPRINCERegressor
8     HydraRegressor
9     InceptionTimeRegressor
10    IndividualInceptionRegressor
11    IndividualLITERegressor
12    IntervalForestRegressor
13    KNeighborsTimeSeriesRegressor
14    LITETimeRegressor
15    MLPRegressor
16    MultiRocketHydraRegressor
17    RDSTRegressor
18    RISTRegressor
19    RandomIntervalRegressor
20    RandomIntervalSpectralEnsembleRegressor
21    ResNetRegressor
22    RocketRegressor
23    SummaryRegressor
24    TSFreshRegressor
25    TapNetRegressor
26    TimeCNNRegressor
27    TimeSeriesForestRegressor
Name: name, dtype: object
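You can also query a capability tag on a single estimator class directly, without constructing it. A minimal sketch, assuming the get_class_tag interface on aeon's base estimator classes:
from aeon.classification.convolution_based import RocketClassifier

# Look up one capability tag on the class itself; no fitting required.
print(RocketClassifier.get_class_tag("capability:multivariate"))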
Currently, few classifiers and regressors support unequal-length series, and none internally handle missing values. This will change soon. Until then, we advise padding or truncating your series; a minimal padding sketch follows the listing below.
uneq_cls = all_estimators("classifier", filter_tags={"capability:unequal_length": True},
                          as_dataframe=True)
print("Classifiers that can handle unequal length series\n", uneq_cls.iloc[:, 0])
uneq_reg = all_estimators("regressor", filter_tags={"capability:unequal_length": True},
                          as_dataframe=True)
print("Regressors that can handle unequal length series\n", uneq_reg.iloc[:, 0])
Classifiers that can handle unequal length series
0    Catch22Classifier
1    DummyClassifier
2    ElasticEnsemble
3    KNeighborsTimeSeriesClassifier
4    RDSTClassifier
Name: name, dtype: object
Regressors that can handle unequal length series
0    Catch22Regressor
1    DummyRegressor
2    KNeighborsTimeSeriesRegressor
3    RDSTRegressor
Name: name, dtype: object
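To follow the padding advice above without waiting for built-in support, you can pad a collection by hand. Below is a minimal sketch in plain NumPy (not an aeon API): pad_collection and the toy series are our own illustrative names. It right-pads every series with a fill value up to the longest length.
import numpy as np

def pad_collection(X, fill_value=0.0):
    """Right-pad a list of (n_channels, n_timepoints) arrays to equal length."""
    max_len = max(x.shape[1] for x in X)
    padded = np.full((len(X), X[0].shape[0], max_len), fill_value)
    for i, x in enumerate(X):
        padded[i, :, : x.shape[1]] = x  # original values first, fill_value after
    return padded

# Toy unequal-length collection: two univariate series of lengths 3 and 5.
X_unequal = [np.array([[1.0, 2.0, 3.0]]), np.array([[1.0, 2.0, 3.0, 4.0, 5.0]])]
print(pad_collection(X_unequal).shape)  # (2, 1, 5)
Truncation is the mirror image: slice every series down to the shortest length instead of padding up to the longest.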
We can create, fit and predict with these lists of classifiers. We use the EEG data made for this tutorial. Do not read much into relative performance: this is for illustrative purposes only. However, the variance in results does suggest that the classifiers work differently. We exclude estimators that require constructor arguments, such as pipelines.
from aeon.datasets import load_from_tsfile
X_train_c, y_train_c = load_from_tsfile("./data/KDD_UTSC_TRAIN.ts")
X_test_c, y_test_c = load_from_tsfile("./data/KDD_UTSC_TEST.ts")
for _, c in uni_cls.iterrows():
    if c[0] not in ["ClassifierPipeline", "MrSQMClassifier", "WeightedEnsembleClassifier"]:
        clf = c[1]()
        clf.fit(X_train_c, y_train_c)
        print(c[0], " accuracy = ", clf.score(X_test_c, y_test_c))
        print()
BOSSEnsemble accuracy = 0.6
ContractableBOSS accuracy = 0.6
HIVECOTEV1 accuracy = 0.825
IndividualBOSS accuracy = 0.525
ProximityForest accuracy = 0.55
ProximityTree accuracy = 0.6
RSASTClassifier accuracy = 0.6
SASTClassifier accuracy = 0.625
WEASEL accuracy = 0.725
WEASEL_V2 accuracy = 0.6
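If you would rather see a ranking than a stream of prints, collect the scores first. This is a small, hypothetical variant of the loop above; the results dict is our own addition:
results = {}
for _, c in uni_cls.iterrows():
    if c[0] not in ["ClassifierPipeline", "MrSQMClassifier", "WeightedEnsembleClassifier"]:
        clf = c[1]()
        clf.fit(X_train_c, y_train_c)
        results[c[0]] = clf.score(X_test_c, y_test_c)
# Print best to worst.
for name, acc in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")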
We can use multivariate classifiers on univariate data (except for MUSE). Some are excluded from this example because they require constructor arguments, are very slow (especially on CPU), require non-standard imports or generate many warnings on this data. This cell will take a while to execute.
excl = ["MUSE", "EncoderClassifier", "ChannelEnsembleClassifier","FCNClassifier",
"HIVECOTEV2", "IntervalForestClassifier", "LearningShapeletClassifier",
"InceptionTimeClassifier","IndividualInceptionClassifier",
"IndividualOrdinalTDE","OrdinalTDE","TapNetClassifier",
"SignatureClassifier","ResNetClassifier","LITETimeClassifier",
"IndividualLITETimeClassifier", "MLPClassifier","CNNClassifier",
"SupervisedIntervalClassifier","REDCOMETS", "ElasticEnsemble"]
for _, c in multi_cls.iterrows():
if c[0] not in excl:
clf = c[1]()
clf.fit(X_train_c, y_train_c)
print(c[0]," accuracy = ", clf.score(X_test_c, y_test_c))
print()
Arsenal accuracy = 0.425
CanonicalIntervalForestClassifier accuracy = 0.8
Catch22Classifier accuracy = 0.8
DrCIFClassifier accuracy = 0.75
DummyClassifier accuracy = 0.5
FreshPRINCEClassifier accuracy = 0.8
HydraClassifier accuracy = 0.775
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 207ms/step
IndividualLITEClassifier accuracy = 0.875
IndividualTDE accuracy = 0.625
KNeighborsTimeSeriesClassifier accuracy = 0.55
MultiRocketHydraClassifier accuracy = 0.825
QUANTClassifier accuracy = 0.775
RDSTClassifier accuracy = 0.9
RISTClassifier accuracy = 0.85
RSTSF accuracy = 0.8
RandomIntervalClassifier accuracy = 0.75
RandomIntervalSpectralEnsembleClassifier accuracy = 0.9
RocketClassifier accuracy = 0.425
ShapeletTransformClassifier accuracy = 0.65
SummaryClassifier accuracy = 0.7
SupervisedTimeSeriesForest accuracy = 0.8
TSFreshClassifier accuracy = 0.825
TemporalDictionaryEnsemble accuracy = 0.5
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 21ms/step
TimeCNNClassifier accuracy = 0.5
TimeSeriesForestClassifier accuracy = 0.725
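The exclusion list above has to be maintained by hand. A more defensive pattern, sketched below with our own timing report added, is to wrap each estimator in a try/except so one failure does not stop the whole comparison:
import time

for _, c in multi_cls.iterrows():
    if c[0] in excl:
        continue
    try:
        clf = c[1]()
        start = time.time()
        clf.fit(X_train_c, y_train_c)
        acc = clf.score(X_test_c, y_test_c)
        print(f"{c[0]}: accuracy = {acc:.3f} (fit and score took {time.time() - start:.1f}s)")
    except Exception as e:  # deliberately broad: we just want to keep going
        print(f"{c[0]}: skipped ({type(e).__name__})")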
All of the regressors, bar RegressorPipeline, can handle multivariate data.
from aeon.datasets import load_from_tsfile
from sklearn.metrics import mean_squared_error
X_train_c, y_train_c = load_from_tsfile("./data/KDD_UTSER_TRAIN.ts")
X_test_c, y_test_c = load_from_tsfile("./data/KDD_UTSER_TEST.ts")
excl = ["RegressorPipeline","CNNRegressor","FCNRegressor",
"InceptionTimeRegressor","IndividualInceptionRegressor",
"EncoderRegressor","ResNetRegressor","IndividualLITERegressor",
"LITETimeRegressor", "TapNetRegressor"]
for _, c in multi_reg.iterrows():
if c[0] not in excl:
clf = c[1]()
clf.fit(X_train_c, y_train_c)
y_pred= clf.predict(X_test_c)
print(c[0]," MSE = ", mean_squared_error(y_test_c, y_pred))
print()
CanonicalIntervalForestRegressor MSE = 0.8481033528977171
Catch22Regressor MSE = 0.8885051219549379
DrCIFRegressor MSE = 0.7587243502357529
DummyRegressor MSE = 1.3269463963115031
FreshPRINCERegressor MSE = 0.7817647290295615
HydraRegressor MSE = 0.9064081505288237
IntervalForestRegressor MSE = 0.8274072847127886
KNeighborsTimeSeriesRegressor MSE = 1.3924300952980344
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step
MLPRegressor MSE = 22.353821472855486
MultiRocketHydraRegressor MSE = 0.9717904133897317
RDSTRegressor MSE = 0.8214062698138948
RISTRegressor MSE = 0.7026925551976874
RandomIntervalRegressor MSE = 0.8220888655006664
RandomIntervalSpectralEnsembleRegressor MSE = 0.8848753973613874
RocketRegressor MSE = 1.1483157858379491
SummaryRegressor MSE = 0.8242166680716194
TSFreshRegressor MSE = 0.7928987608546907
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step
TimeCNNRegressor MSE = 1.6028747634206029
TimeSeriesForestRegressor MSE = 0.8388172082622043
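A useful sanity check on these numbers is to compare each regressor against DummyRegressor, which predicts a constant (by default the mean of the training targets). A sketch, assuming you first collect the MSEs into a results dict as in the classifier ranking example earlier:
# Hypothetical: results maps each regressor name to its test MSE.
baseline = results["DummyRegressor"]
for name, mse in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{name}: MSE = {mse:.3f} ({mse / baseline:.2f}x the dummy baseline)")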
We can pull published results directly from our website, timeseriesclassification.com. See this notebook for more details on how to do this.
from aeon.benchmarking import get_available_estimators
from aeon.benchmarking.results_loaders import get_estimator_results_as_array
from aeon.visualisation import plot_critical_difference
cls = get_available_estimators(task="classification", return_dataframe=False)
print(len(cls), " classifier results available\n", cls)
resamples_all, data_names = get_estimator_results_as_array(
    estimators=cls, default_only=False
)
# Results are loaded from
# https://timeseriesclassification.com/results/ReferenceResults.
# You can download the files directly from there.
print(resamples_all.shape)
classifiers = [
    "FreshPRINCE",
    "HIVECOTEV2",
    "InceptionTime",
    "WEASEL-D",
    "MR-Hydra",
    "RDST",
    "QUANT",
    "PF",
]
resamples_all, data_names = get_estimator_results_as_array(
    estimators=classifiers, default_only=False
)
plot = plot_critical_difference(
    resamples_all, classifiers, test="wilcoxon", correction="holm"
)
40  classifier results available
['1NN-DTW', 'Arsenal', 'BOSS', 'CIF', 'CNN', 'Catch22', 'DrCIF', 'EE',
 'FreshPRINCE', 'GRAIL', 'H-InceptionTime', 'HC1', 'HC2', 'Hydra',
 'InceptionTime', 'LiteTime', 'MR', 'MR-Hydra', 'MiniROCKET', 'MrSQM',
 'PF', 'QUANT', 'R-STSF', 'RDST', 'RISE', 'RIST', 'ROCKET', 'RSF',
 'ResNet', 'STC', 'STSF', 'ShapeDTW', 'Signatures', 'TDE', 'TS-CHIEF',
 'TSF', 'TSFresh', 'WEASEL-1.0', 'WEASEL-2.0', 'cBOSS']
(112, 40)
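Beyond the critical difference diagram, the loaded results are just a datasets-by-estimators array, so you can summarise them directly. A minimal sketch, assuming (as the printed shape suggests) one row per dataset and one column per classifier:
# Mean accuracy per classifier across all datasets, best first.
mean_acc = resamples_all.mean(axis=0)
for name, acc in sorted(zip(classifiers, mean_acc), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: mean accuracy over {resamples_all.shape[0]} datasets = {acc:.3f}")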