Distance based classifiers use a time series specific distance function to measure the similarity between time series. Time series distance functions are often called elastic distances, since they compensate for possible misalignment between series by shifting or editing the series.
Dynamic time warping (DTW) is the best known elastic distance measure. It aligns two series by finding a warping path between them.
[Figure: a warping path found between two series]
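To make the idea of a warping path concrete, here is a minimal pure Python sketch of DTW for two univariate series. It assumes a squared pointwise difference and no warping window. The name `simple_dtw` is hypothetical and this is not the aeon implementation, which may differ in details such as windowing and multivariate support.

```python
import math


def simple_dtw(x, y):
    """Illustrative DTW: squared pointwise cost, full (unconstrained) warping."""
    n, m = len(x), len(y)
    # cost[i][j] holds the cheapest cumulative cost of aligning x[:i] with y[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (x[i - 1] - y[j - 1]) ** 2
            cost[i][j] = d + min(
                cost[i - 1][j - 1],  # advance in both series
                cost[i - 1][j],  # advance in x only, so y[j - 1] is reused
                cost[i][j - 1],  # advance in y only, so x[i - 1] is reused
            )
    return cost[n][m]


a = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
b = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0]  # same shape as a, shifted by one step
squared_euclidean = sum((p - q) ** 2 for p, q in zip(a, b))
print(simple_dtw(a, b), squared_euclidean)  # 0.0 4.0
```

The two toy series have the same shape but are misaligned by one time step. The pointwise distance penalises the shift, whereas DTW warps it away.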
We have a range of elastic distance functions in the distances module. Please see the distances notebook for more information. Distance functions have been mostly used with a nearest neighbour (NN) classifier.
from sklearn import metrics
from aeon.datasets import load_italy_power_demand
from aeon.registry import all_estimators
X_train, y_train = load_italy_power_demand(split="train", return_X_y=True)
X_test, y_test = load_italy_power_demand(split="test", return_X_y=True)
X_test = X_test[:10]
y_test = y_test[:10]
print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)
(67, 1, 24) (67,) (10, 1, 24) (10,)
# Search for all distance based classifiers. This will give some UserWarnings
# if soft dependencies are not installed. Rerun to remove the warnings.
all_estimators("classifier", filter_tags={"algorithm_type": "distance"})
[('ElasticEnsemble', aeon.classification.distance_based._elastic_ensemble.ElasticEnsemble),
 ('KNeighborsTimeSeriesClassifier', aeon.classification.distance_based._time_series_neighbors.KNeighborsTimeSeriesClassifier),
 ('MatrixProfileClassifier', aeon.classification.feature_based._matrix_profile_classifier.MatrixProfileClassifier),
 ('ShapeDTW', aeon.classification.distance_based._shape_dtw.ShapeDTW)]
from aeon.classification.distance_based import (
    ElasticEnsemble,
    KNeighborsTimeSeriesClassifier,
    ShapeDTW,
)
k-NN is often called a lazy classifier, because little work is done in the fit operation: it simply stores the training data. To make a prediction for a new time series, k-NN measures the distance between the new series and every series in the training data, and records the classes of the k closest training series. The class labels of these nearest neighbours are used to make the prediction. If they all share the same label, that label is the prediction. If they differ, some form of voting mechanism is required. For example, we may predict the most common class label amongst the nearest neighbours of the test instance.
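The procedure described above can be sketched in a few lines of plain Python. This is an illustration only, not how aeon implements it: the helper names `squared_euclidean` and `knn_predict` and the toy data are made up, and a simple pointwise distance stands in for an elastic one.

```python
from collections import Counter


def squared_euclidean(x, y):
    # Stand-in distance; any function of two series returning a number would do
    return sum((p - q) ** 2 for p, q in zip(x, y))


def knn_predict(train_X, train_y, query, k=3, distance=squared_euclidean):
    # "Fitting" amounts to keeping train_X and train_y; the work happens here
    scored = [(distance(series, query), label) for series, label in zip(train_X, train_y)]
    scored.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in scored[:k]]
    # Majority vote amongst the k nearest neighbours
    return Counter(nearest_labels).most_common(1)[0][0]


toy_X = [[0, 0, 1], [0, 1, 1], [5, 5, 6], [5, 6, 6], [6, 6, 6]]
toy_y = ["low", "low", "high", "high", "high"]
print(knn_predict(toy_X, toy_y, [1, 1, 1]))  # low
```

For the query `[1, 1, 1]` the three nearest neighbours are two "low" series and one "high" series, so the majority vote returns "low".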
KNeighborsTimeSeriesClassifier in aeon can be configured to use any of the distance functions in the distances module, or it can be passed a bespoke callable. You can set the number of neighbours and the weights. Weights are used in the prediction process when neighbours differ in class values. By default all neighbours have an equal vote. There is an option to weight by distance, meaning closer neighbours have more weight in the vote.
knn = KNeighborsTimeSeriesClassifier(distance="msm", n_neighbors=3, weights="distance")
knn.fit(X_train, y_train)
knn_preds = knn.predict(X_test)
metrics.accuracy_score(y_test, knn_preds)
1.0
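The difference between equal and distance weighted voting can be illustrated with a small sketch. It assumes inverse distance weighting, which is one common choice; the exact scheme aeon uses may differ. The function `weighted_vote` and the neighbour list are hypothetical.

```python
from collections import defaultdict


def weighted_vote(neighbours, weights="uniform"):
    """neighbours: list of (distance, label) pairs for the k nearest series."""
    scores = defaultdict(float)
    for dist, label in neighbours:
        if weights == "distance":
            # Assumed scheme: closer neighbours get a larger vote
            scores[label] += 1.0 / (dist + 1e-8)
        else:
            scores[label] += 1.0
    return max(scores, key=scores.get)


toy_neighbours = [(0.1, "A"), (2.0, "B"), (2.5, "B")]
print(weighted_vote(toy_neighbours, "uniform"))  # B
print(weighted_vote(toy_neighbours, "distance"))  # A
```

With equal votes the two "B" neighbours win. With distance weighting the single, much closer "A" neighbour outweighs them.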
The first algorithm to significantly outperform 1-NN DTW on the UCR data was the Elastic Ensemble (EE) [1]. EE is a weighted ensemble of 11 1-NN classifiers with a range of elastic distance measures. It was the best performing distance based classifier in the bake off. Elastic distances can be slow, and EE requires cross validation to find the weights of each classifier in the ensemble. You can configure EE to use specified distance functions, and tell it what proportion of the parameter options and of the training data to use.
ee = ElasticEnsemble(
distance_measures=["dtw", "msm"],
proportion_of_param_options=0.1,
proportion_train_in_param_finding=0.3,
proportion_train_for_test=0.5,
)
ee.fit(X_train, y_train)
ee_preds = ee.predict(X_test)
metrics.accuracy_score(y_test, ee_preds)
0.9
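The weighting idea behind EE can be sketched as follows, assuming each member's vote is weighted by its cross validation accuracy on the training data. This is a simplified illustration rather than the aeon implementation; `ensemble_predict`, the accuracy values, and the member predictions are all made up.

```python
from collections import defaultdict


def ensemble_predict(member_predictions, member_weights):
    """Combine one predicted label per member using per-member weights."""
    scores = defaultdict(float)
    for name, label in member_predictions.items():
        scores[label] += member_weights[name]
    return max(scores, key=scores.get)


# Hypothetical cross validation accuracies, one per 1-NN member
cv_accuracy = {"dtw": 0.95, "msm": 0.90, "lcss": 0.60}
# Hypothetical predictions of each member for a single test series
member_preds = {"dtw": "1", "msm": "2", "lcss": "2"}
print(ensemble_predict(member_preds, cv_accuracy))  # 2
```

Here the most accurate member predicts class "1", but the combined weight of the other two members (0.90 + 0.60) is larger, so the ensemble predicts class "2".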
Shape based DTW (ShapeDTW) [2] works by extracting a set of shape descriptors (such as slope and derivative) over windows of each series. The resulting series-to-series transformed data are then classified with 1-NN using DTW.
shape = ShapeDTW()
shape.fit(X_train, y_train)
shape_preds = shape.predict(X_test)
metrics.accuracy_score(y_test, shape_preds)
0.9
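As a rough illustration of a shape descriptor, the sketch below computes a least squares slope over a window centred on each time point, padding the ends by repeating the first and last values. The function `window_slopes`, the window length, and the padding rule are assumptions made for this example; the descriptors and defaults used by ShapeDTW in aeon may differ.

```python
def window_slopes(series, window=3):
    """Slope of a least squares line fitted to a window around each point."""
    if window % 2 == 0:
        raise ValueError("window must be odd so it can be centred on a point")
    half = window // 2
    # Repeat the end values so every point has a full window
    padded = [series[0]] * half + list(series) + [series[-1]] * half
    t_mean = (window - 1) / 2
    denominator = sum((t - t_mean) ** 2 for t in range(window))
    slopes = []
    for i in range(len(series)):
        values = padded[i : i + window]
        numerator = sum((t - t_mean) * v for t, v in enumerate(values))
        slopes.append(numerator / denominator)
    return slopes


print(window_slopes([0.0, 1.0, 2.0, 2.0, 0.0]))  # [0.5, 1.0, 0.5, -1.0, -1.0]
```

The output is a new series of the same length that describes local shape rather than raw values. A distance such as DTW can then be applied to the transformed series.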
Proximity Forest [3] is a distance based ensemble of decision trees. It is the most accurate purely distance based technique for TSC that we know of. We do not currently have a working version of PF in aeon, but would very much like to have one. Please see this issue: https://github.com/aeon-toolkit/aeon/issues/159
[1] Lines J, Bagnall A (2015) Time series classification with ensembles of elastic distance measures. Data Mining and Knowledge Discovery 29:565–592
[2] Zhao J, Itti L (2019) shapeDTW: Shape Dynamic Time Warping. Pattern Recognition 74:171–184 https://arxiv.org/pdf/1606.01601.pdf
[3] Lucas et al. (2019) Proximity Forest: an effective and scalable distance-based classifier. Data Mining and Knowledge Discovery 33:607–635 https://arxiv.org/abs/1808.10594