This is a tutorial for anomaly detection using Orion. Orion is a Python package for time series anomaly detection. It provides a suite of both statistical and machine learning models that enable efficient anomaly detection.
In this tutorial, we will learn how to set up Orion, train a machine learning model, and perform anomaly detection. We will delve into each part separately and then run the evaluation pipeline from beginning to end in order to compare multiple models against each other.
# general imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import MinMaxScaler
from utils import plot, plot_ts, plot_rws, plot_error, unroll_ts
In part one of the series, we explore a time series dataset, specifically the NYC taxi data. You can find the raw data on the NYC TLC website, or the processed version maintained by Numenta. We also explore what could possibly be causing the anomalies it contains.
There is a collection of data already available in Orion. To load a dataset, we use the load_signal function and pass the name of the signal we wish to obtain. Similarly, since this data is labeled, we use the load_anomalies function to get the known anomalies of the signal.
from orion.data import load_signal, load_anomalies
signal = 'nyc_taxi'
# load signal
df = load_signal(signal)
# load ground truth anomalies
known_anomalies = load_anomalies(signal)
df.head(5)
| | timestamp | value |
|---|---|---|
| 0 | 1404165600 | 10844.0 |
| 1 | 1404167400 | 8127.0 |
| 2 | 1404169200 | 6210.0 |
| 3 | 1404171000 | 4656.0 |
| 4 | 1404172800 | 3820.0 |
plot(df, known_anomalies)
In part two of the series, we look at anomaly detection through time series reconstruction, particularly using a GAN model. We go through a sequence of transformations and data preparation, as well as model training and prediction.
We will use Orion to perform this sequence of actions, emphasizing the usage of the TadGAN model, a time series anomaly detection model based on GANs. The pipeline is specified in a json file named tadgan.json that accompanies this notebook. There are more pipelines defined within the repository, including ARIMA, LSTM, etc.
The Orion API is a simple interface that allows you to interact with an anomaly detection pipeline. To train the model on the data, we simply use the fit method; to perform anomaly detection, we use the detect method. In our case, we want to fit the data and then perform detection, so we use the fit_detect method. This might take some time to run. Once it's done, we can visualize the results.
Note: the model might take some time to train. For experimentation purposes, you can reduce the number of epochs in the tadgan.json file to cut down the number of training iterations.
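For reference, the epochs setting lives under the TadGAN entry of the pipeline's hyperparameters. A sketch of what that part of tadgan.json could look like; the exact key names depend on the MLBlocks pipeline format and version, so treat this as an assumption rather than the file's literal contents:
{
    "primitives": [
        "...",
        "orion.primitives.tadgan.TadGAN",
        "..."
    ],
    "hyperparameters": {
        "orion.primitives.tadgan.TadGAN#1": {
            "epochs": 5
        }
    }
}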
from orion import Orion
orion = Orion(
pipeline='tadgan.json'
)
anomalies = orion.fit_detect(df)
Epoch: 1/5, Losses: {'cx_loss': -0.8796, 'cz_loss': -2.0101, 'eg_loss': 4.329}
Epoch: 2/5, Losses: {'cx_loss': -1.6189, 'cz_loss': 0.2361, 'eg_loss': -2.1686}
Epoch: 3/5, Losses: {'cx_loss': -1.1455, 'cz_loss': 0.1359, 'eg_loss': -2.5223}
Epoch: 4/5, Losses: {'cx_loss': -1.0701, 'cz_loss': 0.5063, 'eg_loss': -3.8923}
Epoch: 5/5, Losses: {'cx_loss': -0.6953, 'cz_loss': 3.0434, 'eg_loss': -8.6544}
Let's visualize the results.
plot(df, [anomalies, known_anomalies])
anomalies.head(5)
| | start | end | severity |
|---|---|---|---|
| 0 | 1404165600 | 1404372600 | 0.984310 |
| 1 | 1422097200 | 1422496800 | 0.150504 |
The red intervals depict the detected anomalies, the green intervals show the ground truth. Cool! The model was able to detect some anomalies. It also flagged some intervals that are not included in the ground truth labels; still, it is clear that they deviate in shape from the rest of the signal. Note: the results might differ between runs.
We might have jumped straight to the results but let's trace back and look at what the model actually did.
There is a series of transformations applied to the data in order to obtain the result you have just seen, from data preprocessing and model training to post-processing. We specify these functions, which we refer to as primitives, within the model's .json file. What are these primitives? If we were to look at the tadgan.json pipeline, we find these sequential primitives:
"primitives": [
"mlprimitives.custom.timeseries_preprocessing.time_segments_aggregate”,
"sklearn.impute.SimpleImputer",
"sklearn.preprocessing.MinMaxScaler",
"mlprimitives.custom.timeseries_preprocessing.rolling_window_sequences",
"orion.primitives.tadgan.TadGAN",
"orion.primitives.tadgan.score_anomalies",
"orion.primitives.timeseries_anomalies.find_anomalies"
]
Each primitive is responsible for a single task. We describe the procedure of each primitive in the remainder of this notebook.
The first primitive, time_segments_aggregate, adjusts the signal spacing to be of equal width across all times. There are two important parameters in this process: the interval, which determines the width of each segment, and the aggregation method. In addition, we pass the array of values and indicate which column of it holds the time values.
def time_segments_aggregate(X, interval, time_column, method=['mean']):
"""Aggregate values over given time span.
Args:
X (ndarray or pandas.DataFrame):
N-dimensional sequence of values.
interval (int):
Integer denoting time span to compute aggregation of.
time_column (int):
Column of X that contains time values.
method (str or list):
Optional. String describing aggregation method or list of strings describing multiple
aggregation methods. If not given, `mean` is used.
Returns:
ndarray, ndarray:
* Sequence of aggregated values, one column for each aggregation method.
* Sequence of index values (first index of each aggregated segment).
"""
if isinstance(X, np.ndarray):
X = pd.DataFrame(X)
X = X.sort_values(time_column).set_index(time_column)
if isinstance(method, str):
method = [method]
start_ts = X.index.values[0]
max_ts = X.index.values[-1]
values = list()
index = list()
while start_ts <= max_ts:
end_ts = start_ts + interval
subset = X.loc[start_ts:end_ts - 1]
aggregated = [
getattr(subset, agg)(skipna=True).values
for agg in method
]
values.append(np.concatenate(aggregated))
index.append(start_ts)
start_ts = end_ts
return np.asarray(values), np.asarray(index)
X, index = time_segments_aggregate(df, interval=1800, time_column='timestamp')
If we go back to the source of the NYC taxi data, we find that it records a value every 30 minutes, which is equivalent to 1800 seconds; therefore we set the interval to 1800. We also opt for the default aggregation method, which takes the mean value of each interval.
Technically speaking, in our example the data is perfectly spaced, so we can skip this preprocessing step. However, that is not always the case and so we include it as a preprocessing primitive in the general pipeline as you will see later on.
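As a quick sanity check, here is the same primitive applied to a tiny made-up series; the timestamps and values below are hypothetical, chosen so that each 60-second bin holds exactly two points:
# dummy series: a reading every 30 seconds, aggregated into 60-second bins
toy = pd.DataFrame({
    'timestamp': [0, 30, 60, 90, 120, 150],
    'value': [1.0, 3.0, 5.0, 7.0, 9.0, 11.0],
})
toy_values, toy_index = time_segments_aggregate(toy, interval=60, time_column='timestamp')
print(toy_values.ravel())  # [ 2.  6. 10.] -- the mean of each bin
print(toy_index)           # [  0  60 120] -- the first timestamp of each bin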
We impute missing values that appear within the signal using scikit-learn's SimpleImputer, which fills missing values with the mean value by default.
imp = SimpleImputer()
X = imp.fit_transform(X)
We then normalize the data to a specific range, using scikit-learn's MinMaxScaler to scale the data to [-1, 1].
scaler = MinMaxScaler(feature_range=(-1, 1))
X = scaler.fit_transform(X)
Notice how the y-axis changed after normalizing the data to [-1, 1].
plot_ts(X)
To prepare the data, we need to transform it into sequences that are ingestible by the machine learning model. We take the signal we're interested in analyzing and generate training examples; these training examples are mere snapshots of the signal at different times.
To do that, we adopt the sliding window approach: we choose a window of a pre-specified width and a particular step size, then divide the signal into segments, similar to what is depicted in the illustration below.
We use the rolling_window_sequences function to slice the data into windows; each training example contains an input sequence of window_size values, a target sequence of target_size values, and the first index value of each.
def rolling_window_sequences(X, index, window_size, target_size, step_size, target_column,
drop=None, drop_windows=False):
"""Create rolling window sequences out of time series data.
The function creates an array of input sequences and an array of target sequences by rolling
over the input sequence with a specified window.
Optionally, certain values can be dropped from the sequences.
Args:
X (ndarray):
N-dimensional sequence to iterate over.
index (ndarray):
Array containing the index values of X.
window_size (int):
Length of the input sequences.
target_size (int):
Length of the target sequences.
step_size (int):
Indicating the number of steps to move the window forward each round.
target_column (int):
Indicating which column of X is the target.
drop (ndarray or None or str or float or bool):
Optional. Array of boolean values indicating which values of X are invalid, or value
indicating which value should be dropped. If not given, `None` is used.
drop_windows (bool):
Optional. Indicates whether the dropping functionality should be enabled. If not
given, `False` is used.
Returns:
ndarray, ndarray, ndarray, ndarray:
* input sequences.
* target sequences.
* first index value of each input sequence.
* first index value of each target sequence.
"""
out_X = list()
out_y = list()
X_index = list()
y_index = list()
target = X[:, target_column]
if drop_windows:
if hasattr(drop, '__len__') and (not isinstance(drop, str)):
if len(drop) != len(X):
raise Exception('Arrays `drop` and `X` must be of the same length.')
else:
if isinstance(drop, float) and np.isnan(drop):
drop = np.isnan(X)
else:
drop = X == drop
start = 0
max_start = len(X) - window_size - target_size + 1
while start < max_start:
end = start + window_size
if drop_windows:
drop_window = drop[start:end + target_size]
to_drop = np.where(drop_window)[0]
if to_drop.size:
start += to_drop[-1] + 1
continue
out_X.append(X[start:end])
out_y.append(target[end:end + target_size])
X_index.append(index[start])
y_index.append(index[end])
start = start + step_size
return np.asarray(out_X), np.asarray(out_y), np.asarray(X_index), np.asarray(y_index)
X, y, X_index, y_index = rolling_window_sequences(X, index,
window_size=100,
target_size=1,
step_size=1,
target_column=0)
print("Training data input shape: {}".format(X.shape))
print("Training data index shape: {}".format(X_index.shape))
print("Training y shape: {}".format(y.shape))
print("Training y index shape: {}".format(y_index.shape))
Training data input shape: (10222, 100, 1)
Training data index shape: (10222,)
Training y shape: (10222, 1)
Training y index shape: (10222,)
plot_rws(X)
Here, X represents the input used to train the model. In the previous example, we see X has 10222 training data points, and 100 represents the window size. On the other hand, y is the real signal after processing, which we will use later on to calculate the error between the reconstructed and the real signal.
The architecture of the model requires four neural networks:
- encoder: maps X to its latent representation Z.
- generator: maps the latent variable Z back to X, which we will denote later on as X_hat.
- criticX: discriminates between X and generator(Z), i.e. X_hat.
- criticZ: discriminates between Z and encoder(X).
We detail the composition of each network in model.py.
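For illustration only, the four networks could be sketched in Keras roughly as follows; the layer choices below are assumptions made for this example, not the actual layers defined in model.py:
import tensorflow as tf
from tensorflow.keras import layers, models

window_size, latent_dim = 100, 20

# encoder: maps a (100, 1) window X to a 20-dimensional latent vector Z
encoder = models.Sequential([
    layers.Bidirectional(layers.LSTM(64), input_shape=(window_size, 1)),
    layers.Dense(latent_dim),
])
# generator: maps Z back to a reconstructed (100, 1) window X_hat
generator = models.Sequential([
    layers.Dense(window_size, input_shape=(latent_dim,)),
    layers.Reshape((window_size, 1)),
    layers.Bidirectional(layers.LSTM(64, return_sequences=True)),
    layers.TimeDistributed(layers.Dense(1)),
])
# criticX: scores whether a window looks like real data (X) or a reconstruction (X_hat)
critic_x = models.Sequential([
    layers.Conv1D(64, kernel_size=5, activation='relu', input_shape=(window_size, 1)),
    layers.Flatten(),
    layers.Dense(1),
])
# criticZ: scores whether a latent vector comes from the prior (Z) or from encoder(X)
critic_z = models.Sequential([
    layers.Dense(64, activation='relu', input_shape=(latent_dim,)),
    layers.Dense(1),
])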
To use the TadGAN model, we specify a number of parameters, including the model layers (the structure of the previously mentioned neural networks). We also specify the input dimensions, the number of epochs, the learning rate, etc. All the parameters are listed below.
from model import hyperparameters
from orion.primitives.tadgan import TadGAN
hyperparameters["epochs"] = 5
hyperparameters["input_shape"] = (100, 1) # based on the window size
hyperparameters["optimizer"] = "keras.optimizers.Adam"
hyperparameters["learning_rate"] = 0.0005
hyperparameters["latent_dim"] = 20
hyperparameters["batch_size"] = 64
tgan = TadGAN(**hyperparameters)
tgan.fit(X)
Epoch: 1/5, Losses: {'cx_loss': array([-1.1872, -4.4056, 2.5281, 0.069 ]), 'cz_loss': array([-2.5995, -1.6567, -2.3138, 0.1371]), 'eg_loss': array([ 2.1651, -2.5728, 3.1177, 0.162 ])}
Epoch: 2/5, Losses: {'cx_loss': array([ -1.2376, -12.4034, 10.9903, 0.0175]), 'cz_loss': array([-2.1497, -3.525 , 1.0386, 0.0337]), 'eg_loss': array([-10.828 , -11.0536, -0.9568, 0.1182])}
Epoch: 3/5, Losses: {'cx_loss': array([-0.8272, -9.2085, 8.2797, 0.0102]), 'cz_loss': array([-2.3846, -4.2531, 1.6279, 0.0241]), 'eg_loss': array([-9.0235, -8.2781, -1.5691, 0.0824])}
Epoch: 4/5, Losses: {'cx_loss': array([-0.5854, -9.1109, 8.4216, 0.0104]), 'cz_loss': array([-2.6476, -3.9248, 1.005 , 0.0272]), 'eg_loss': array([-8.5022, -8.2868, -0.9191, 0.0704])}
Epoch: 5/5, Losses: {'cx_loss': array([-4.8460e-01, -8.2309e+00, 7.6643e+00, 8.2000e-03]), 'cz_loss': array([-2.4598, -3.8494, 1.144 , 0.0246]), 'eg_loss': array([-8.1353, -7.6542, -1.0893, 0.0608])}
# reconstruct
X_hat, critic = tgan.predict(X)
# visualize X_hat
plot_rws(X_hat)
To reassemble or "unroll" the predicted signal X_hat, we can choose different aggregation methods (e.g., mean, max, etc.). In our implementation, we chose the median value.
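Before calling the provided utility, here is a minimal sketch of what a median-based unroll could look like (unroll_ts in utils.py may differ in its details). With step_size=1, window i covers time points i through i + window_size - 1, so each time point collects one candidate value from every window that overlaps it:
def unroll_median(X_hat, step_size=1):
    """Assemble overlapping reconstructed windows into a single signal
    by taking the median of all values predicted for each time point."""
    num_windows, window_size, _ = X_hat.shape
    length = (num_windows - 1) * step_size + window_size
    buckets = [[] for _ in range(length)]
    for i in range(num_windows):
        for t in range(window_size):
            buckets[i * step_size + t].append(X_hat[i, t, 0])
    return np.asarray([np.median(b) for b in buckets])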
# flatten the predicted windows
y_hat = unroll_ts(X_hat)
# plot the time series
plot_ts([y, y_hat], labels=['original', 'reconstructed'])
We can see that the GAN model did really well at reconstructing the signal. We also see what it expected the signal to look like, in comparison to what it actually is. The discrepancies between the two signals will be used to calculate the error: the higher the error, the more likely the point is anomalous.
# pair-wise error calculation
error = np.zeros(shape=y.shape)
length = y.shape[0]
for i in range(length):
error[i] = abs(y_hat[i] - y[i])
# visualize the error curve
fig = plt.figure(figsize=(30, 3))
plt.plot(error)
plt.show()
In the TadGAN pipeline, we use tadgan.score_anomalies to perform the error calculation for us. It is a smoothed error function that uses a window-based method to smooth the curve, then uses one of area difference, point difference, or DTW as the measure of discrepancy:
- Area difference: captures the general shape of the original and reconstructed signals and then compares them.
- Point difference: applies a point-to-point comparison between the original and reconstructed signals. It is considered a strict approach that does not allow for many mistakes.
- Dynamic Time Warping (DTW): a more lenient yet very effective method. It compares the two signals using any pair-wise distance measure, but allows one signal to lag behind the other.
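To make DTW concrete, here is a minimal dynamic-programming implementation of the DTW distance between two short sequences; it is for illustration only, as score_anomalies uses its own implementation internally:
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW with absolute difference
    as the pair-wise cost."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # step in a only
                                 cost[i, j - 1],      # step in b only
                                 cost[i - 1, j - 1])  # step in both
    return cost[n, m]

# a lagged copy of a signal stays close under DTW even though
# its point-wise error is much larger
a = np.sin(np.linspace(0, 6, 50))
b = np.roll(a, 3)
print(dtw_distance(a, b), np.abs(a - b).sum())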
from orion.primitives.tadgan import score_anomalies
error, true_index, true, pred = score_anomalies(X, X_hat, critic, X_index, rec_error_type="dtw", comb="mult")
pred = np.array(pred).mean(axis=2)
# visualize the error curve
plot_error([[true, pred], error])
Now we can visually see where the error reaches a substantially high value. But how should we decide whether an error value indicates a potential anomaly? We could use a fixed threshold that says: if error > 10, classify the datapoint as anomalous.
# threshold
thresh = 10

intervals = list()

i = 0
max_start = len(error)
while i < max_start:
    j = i
    start = index[i]
    # extend the interval while the error stays above the threshold
    while i < len(error) and error[i] > thresh:
        i += 1
    end = index[min(i, len(index) - 1)]  # guard against running past the end
    if start != end:
        intervals.append((start, end, np.mean(error[j: i + 1])))
    i += 1

intervals
[(1404541800, 1404592200, 10.302447289165059), (1404621000, 1404631800, 10.043972688121197), (1419429600, 1419652800, 18.762294512542237), (1422221400, 1422451800, 31.012344987856537)]
anomalies = pd.DataFrame(intervals, columns=['start', 'end', 'score'])
plot(df, [anomalies, known_anomalies])
While a fixed threshold raised some correct anomalies, it missed out on others. If we look back at the error plot, we notice that some deviations are abnormal within their local region. So how can we incorporate this information into our thresholding technique? We can use window-based methods to detect anomalies with respect to their context.
We first define the window of errors that we want to analyze. We then find the anomalous sequences in that window by looking at the mean and standard deviation of the errors within it. We store the start/stop index pairs that correspond to each sequence, along with its score. We then move the window and repeat the procedure. Lastly, we combine overlapping or consecutive sequences, as sketched below.
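The actual find_anomalies primitive is more elaborate (it also scores and prunes the sequences), but the core idea can be sketched as follows; window_frac, step_frac, and k are illustrative parameter names, not Orion's:
def window_threshold(error, index, window_frac=0.33, step_frac=0.1, k=4):
    """Flag points whose error exceeds mean + k * std of their local window,
    then merge consecutive flagged points into (start, end) intervals."""
    error = np.asarray(error)
    window = max(1, int(len(error) * window_frac))
    step = max(1, int(len(error) * step_frac))
    flagged = np.zeros(len(error), dtype=bool)
    for s in range(0, len(error), step):
        chunk = error[s:s + window]
        flagged[s:s + window] |= chunk > chunk.mean() + k * chunk.std()
    # merge consecutive flagged points into intervals
    intervals, i = [], 0
    while i < len(flagged):
        if flagged[i]:
            j = i
            while j + 1 < len(flagged) and flagged[j + 1]:
                j += 1
            intervals.append((index[i], index[j]))
            i = j + 1
        else:
            i += 1
    return intervals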
from orion.primitives.timeseries_anomalies import find_anomalies
# find anomalies
intervals = find_anomalies(error, index,
window_size_portion=0.33,
window_step_size_portion=0.1,
fixed_threshold=True)
intervals
array([[1.40441940e+09, 1.40474700e+09, 4.80894965e-01],
       [1.40943600e+09, 1.40968620e+09, 3.01127507e-01],
       [1.41471900e+09, 1.41497820e+09, 2.81347902e-01],
       [1.41697800e+09, 1.41726420e+09, 6.90210631e-01],
       [1.41933960e+09, 1.41969600e+09, 1.51006912e+00],
       [1.42218720e+09, 1.42248420e+09, 1.56241660e+00]])
# visualize the result
anomalies = pd.DataFrame(intervals, columns=['start', 'end', 'score'])
plot(df, [anomalies, known_anomalies])
Cool! We now obtain the same result we saw previously. The red intervals depict the detected anomalies, and the green intervals show the ground truth. We also see that the pipeline detected some intervals that are not included in the ground truth labels.
Using the Orion API and pipelines, we simplified this process while allowing flexibility in pipeline configuration.
To configure a pipeline, we adjust the parameters of the primitive of interest within the pipeline.json file, or directly by passing a dictionary to the API.
In the following example, I changed the aggregation level as well as the number of epochs for training. These changes override the parameters specified in the .json file. To learn more about the API usage and primitive designs, please refer to the documentation.
from orion import Orion
hyperparameters = {
"mlprimitives.custom.timeseries_preprocessing.time_segments_aggregate#1": {
"interval": 3600 # hour level
},
'orion.primitives.tadgan.TadGAN#1': {
'epochs': 5,
}
}
orion = Orion(
'tadgan.json',
hyperparameters
)
anomalies = orion.fit_detect(df)
Epoch: 1/5, Losses: {'cx_loss': -0.7711, 'cz_loss': 4.9931, 'eg_loss': -2.9485}
Epoch: 2/5, Losses: {'cx_loss': -3.3203, 'cz_loss': -27.3137, 'eg_loss': 37.1111}
Epoch: 3/5, Losses: {'cx_loss': -6.424, 'cz_loss': -12.0811, 'eg_loss': 7.5079}
Epoch: 4/5, Losses: {'cx_loss': -9.01, 'cz_loss': 1.2478, 'eg_loss': -39.4212}
Epoch: 5/5, Losses: {'cx_loss': -2.7106, 'cz_loss': 2.0483, 'eg_loss': -52.451}
plot(df, [anomalies, known_anomalies])
The anomalies detected in this run are a bit different from the earlier example, although it still succeeded in detecting anomalies. Maybe a 1-hour aggregate is not the appropriate value? Maybe we did not train the model enough times, or maybe too many times... How can we tell? One way is to look at the output of the model, as we have done previously.
You can use the visualization parameter of detect to return the intermediate outputs (primitive outputs) that we are interested in. For example, the tadgan.json file uses visualization to return the following variables:
- X: the output of the preprocessing steps from averaging, imputing, and scaling. These steps were showcased previously as steps (A, B, and C).
- X_hat: the "predicted" output of the TadGAN model without any processing. It represents the reconstructed window at each time point.
- es: the error calculated by capturing the discrepancies between the original and reconstructed signal.
We then use anomalies, viz = orion.detect(df, visualization=True), where viz will be a dictionary of these intermediate outputs.
Note: we will talk more about how to evaluate the detected anomalies with respect to the ground truth in part 3 of the tutorial.
In part three of the series, we look at evaluating anomaly detection pipelines end-to-end.
We compare the anomalies given to us as ground truth labels against the detected anomalies. But first, we look at the mechanisms we have for evaluation, namely the weighted segment and overlap segment approaches.
We will look at both approaches, but first let's construct a dummy dataset.
Let's assume that the signal starts at timestamp 1 and ends at timestamp 20. We can then see that the ground truth contains three anomalies, namely (5, 8), (12, 13), and (17, 18), where (i, j) expresses the starting timestamp i and ending timestamp j.
We can also see that we detected two anomalies, namely (5, 8) and (12, 15). So how can we compare both sets?
import numpy as np
# to reproduce the same dummy signal
np.random.seed(0)
# dummy data
start, end = (1, 20)
signal = np.random.rand(end - start, 1)
ground_truth = [
(5, 8),
(12, 13),
(17, 18)
]
anomalies = [
(5, 8),
(12, 15)
]
import matplotlib.pyplot as plt
time = range(start, end)
plt.plot(time, signal)
# ground truth
for t1, t2 in ground_truth:
    plt.axvspan(t1, t2 + 1, color="g", alpha=0.2, label="ground_truth")
# detected
for t1, t2 in anomalies:
    plt.axvspan(t1, t2 + 1, color="r", alpha=0.2, label="detected")
plt.title("Example")
plt.xlabel("Time")
plt.ylabel("value")
plt.show()
There are two approaches for comparing anomaly sets, as expressed earlier.
(1) Weighted Segment: a stricter method that is valuable when you want to give equal importance to detecting anomalies and normal instances.
Visually, this operation is summarized by the illustration below.
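A toy re-implementation of the idea (not Orion's actual code): expand both interval sets into one binary label per timestamp, then compare the two label vectors point by point:
def intervals_to_labels(intervals, start, end):
    """1 where a timestamp falls inside any interval (inclusive), else 0."""
    labels = np.zeros(end - start, dtype=int)
    for t1, t2 in intervals:
        labels[t1 - start:t2 - start + 1] = 1
    return labels

truth_labels = intervals_to_labels(ground_truth, start, end)
pred_labels = intervals_to_labels(anomalies, start, end)

# point-wise agreement over all 19 timestamps: 15 of 19 match
print("accuracy:", (truth_labels == pred_labels).mean())  # 0.789

# aliased to avoid clashing with the f1_score variable defined below
from sklearn.metrics import f1_score as sk_f1
print("f1:", sk_f1(truth_labels, pred_labels))  # 0.75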
We can use the orion.evaluation subpackage to compute multiple metrics using the weighted segment approach. For example, to compute the accuracy, we use contextual_accuracy(..., weighted=True). There are other metrics available; for reference, check out the orion.evaluation documentation.
from orion.evaluation.contextual import contextual_accuracy, contextual_f1_score
accuracy = contextual_accuracy(ground_truth, anomalies, start=start, end=end)
f1_score = contextual_f1_score(ground_truth, anomalies, start=start, end=end)
print("Accuracy score = {:0.3f}".format(accuracy))
print("F1 score = {:0.3f}".format(f1_score))
Accuracy score = 0.789
F1 score = 0.750
(2) Overlap Segment: a more lenient approach to evaluation. It rewards the system if it manages to alert the user to even a subset of an anomaly. More specifically, it records:
- TP, if a ground truth segment overlaps with a detected segment.
- FN, if a ground truth segment does not overlap any detected segments.
- FP, if a detected segment does not overlap any labeled anomalous region.
This can be summarized by the illustration below.
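As a toy sketch (again, not Orion's actual code), we count a TP for every ground truth segment that overlaps some detected segment, an FN for every ground truth segment that overlaps none, and an FP for every detected segment that overlaps no ground truth segment:
def overlaps(a, b):
    """True if intervals a = (s1, e1) and b = (s2, e2) intersect."""
    return a[0] <= b[1] and b[0] <= a[1]

tp = sum(any(overlaps(t, d) for d in anomalies) for t in ground_truth)
fn = len(ground_truth) - tp
fp = sum(not any(overlaps(d, t) for t in ground_truth) for d in anomalies)

precision = tp / (tp + fp)  # 2 / 2 = 1.0
recall = tp / (tp + fn)     # 2 / 3
print("F1:", 2 * precision * recall / (precision + recall))  # 0.8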
Similarly, we can use the same metric functions, but this time with the parameter weighted=False. Note: the overlap segment approach does not account for true negatives. The reason is that anomalies in time series data are rare, so "normal" instances would skew the value of the computed metric; therefore, with this approach we cannot compute metrics such as accuracy.
f1_score = contextual_f1_score(ground_truth, anomalies, start=start, end=end, weighted=False)
print("F1 score = {:0.3f}".format(f1_score))
F1 score = 0.800
We integrate the evaluation suite into the Orion API, so that you can evaluate a pipeline on a dataset (with its labels) end-to-end.
Following the introduction to the Orion API in part 2, we can create an orion instance and then use its evaluate method. The method accepts the following arguments:
- data: a pandas.DataFrame containing two columns: timestamp and value.
- truth: a pandas.DataFrame containing two columns: the start and end timestamps of the ground truth labels.
- fit: a flag denoting whether to train the pipeline before evaluating it.
- train_data: a pandas.DataFrame containing two columns: timestamp and value, used to train the pipeline; if not given, the pipeline is trained on data.
- metrics: a list of metrics used to evaluate the pipeline.
In the previous part we went through how to train a pipeline and use it for anomaly detection; the focus now is on defining metrics and evaluating the performance of the pipeline.
metrics is a list of function names, each of which compares the ground truth labels against the detected labels and returns a metric value. We have seen some functions of that sort, such as contextual_accuracy and contextual_f1_score. To construct our metrics list, we select some of the metrics predefined in Orion, such as f1, recall, and precision.
By default, we use the weighted segment approach; you can override the defined metrics by specifying a new metrics dictionary.
from orion.data import load_signal, load_anomalies
metrics = [
'f1',
'recall',
'precision',
]
signal = 'nyc_taxi'
# load signal
df = load_signal(signal)
# load ground truth anomalies
ground_truth = load_anomalies(signal)
scores = orion.evaluate(df, ground_truth, metrics=metrics)
scores
f1           0.238148
recall       0.209709
precision    0.275511
dtype: float64