Benchmarking

0. Set up the logging

This step configures logging so that we can follow the steps that Draco performs.

In [1]:

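import logging
import warnings

# Install a default handler, then silence every logger except Draco's.
logging.basicConfig(level=logging.INFO)
logging.getLogger().setLevel(level=logging.ERROR)
logging.getLogger('draco').setLevel(level=logging.INFO)

# Hide library warnings to keep the notebook output readable.
warnings.simplefilter("ignore")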
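To see why only Draco messages show up in the output, it can help to check the effective level of a few loggers. The following is a minimal sketch that uses only the standard `logging` module; `some_other_library` is a hypothetical logger name chosen for illustration.

```python
import logging

# Same level configuration as in the cell above.
logging.getLogger().setLevel(logging.ERROR)
logging.getLogger('draco').setLevel(logging.INFO)

# 'draco.benchmark' has no level of its own, so it inherits INFO from 'draco'.
draco_level = logging.getLogger('draco.benchmark').getEffectiveLevel()

# 'some_other_library' is a hypothetical name; it falls back to the root level.
other_level = logging.getLogger('some_other_library').getEffectiveLevel()

print(logging.getLevelName(draco_level))
print(logging.getLevelName(other_level))
```

Child loggers inherit the level of their closest configured ancestor, which is why setting the level on `draco` is enough to cover `draco.benchmark` and `draco.pipeline`.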

Running the Benchmarking

The user API for the Draco Benchmarking is the draco.benchmark.evaluate_templates function.

The evaluate_templates function accepts the following arguments:

templates (list): List of templates to try.
window_size_rule (list): List of (window_size, resample_rule) tuples, given as int, str or Timedelta objects.
metric (function or str): Metric to use. If a str is given, it must be one of the metrics defined in the draco.metrics.METRICS dictionary.
tuning_iterations (int): Number of iterations to be used.
init_params (dict): Initialization parameters for the pipelines.
target_times (DataFrame): Contains the specification of the problem that we are solving, which has three columns:
- turbine_id: Unique identifier of the turbine which this label corresponds to.
- cutoff_time: Time associated with this target.
- target: The value that we want to predict. This can either be a numerical value or a categorical label. This column can also be skipped when preparing data that will be used only to make predictions and not to fit any pipeline.
readings (DataFrame): Contains the signal data from different sensors, with the following columns:
- turbine_id: Unique identifier of the turbine which this reading comes from.
- signal_id: Unique identifier of the signal which this reading comes from.
- timestamp (datetime): Time when the reading took place, as a datetime.
- value (float): Numeric value of this reading.
preprocessing (int, list or dict): Number of preprocessing steps to be used.
cost (bool): Whether the metric is a cost function (the lower the better) or not.
test_size (float): Percentage of the data set to be used for the test.
cv_splits (int): Number of cross-validation splits to create.
random_state (int): Random seed for the train/test split.
output_path (str): Path where to save the benchmark report.
cache_path (str): If given, cache the generated cross validation splits in this folder. Defaults to None.
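As an illustration of the two table arguments, the sketch below builds a small `target_times` and `readings` pair with the columns described above. All identifiers and values here (`T001`, `S01`, the dates and the numbers) are made up for the example and do not come from any Draco dataset.

```python
import pandas as pd

# One row per prediction problem: which turbine, at what time, with what label.
# All values below are invented for illustration.
target_times = pd.DataFrame({
    'turbine_id': ['T001', 'T001', 'T002'],
    'cutoff_time': pd.to_datetime(['2013-01-12', '2013-01-13', '2013-01-12']),
    'target': [0, 1, 0],
})

# One row per sensor reading, in long format.
readings = pd.DataFrame({
    'turbine_id': ['T001', 'T001', 'T002', 'T002'],
    'signal_id': ['S01', 'S02', 'S01', 'S02'],
    'timestamp': pd.to_datetime([
        '2013-01-11 23:50:00',
        '2013-01-11 23:50:00',
        '2013-01-11 23:50:00',
        '2013-01-11 23:50:00',
    ]),
    'value': [323.0, 12.5, 318.4, 11.9],
})

print(target_times.dtypes)
print(readings.dtypes)
```

Tables shaped like these could then be passed through the `target_times` and `readings` arguments listed above.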

In [2]:

templates = [
    'lstm_prob_with_unstack',
    'double_lstm_prob_with_unstack'
]
window_size_rule = [('1d', '1h'), ('2d', '2h')]
init_params = {
    'lstm_prob_with_unstack': {
        'keras.Sequential.LSTMTimeSeriesClassifier#1': {
            'epochs': 1,
        }
    },
    'double_lstm_prob_with_unstack': {
        'keras.Sequential.DoubleLSTMTimeSeriesClassifier#1': {
            'epochs': 1,
        }
    }
}
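To get a feel for what each `window_size_rule` entry implies, the sketch below computes how many resampled time steps fit in each window. It assumes that the first element is the length of history taken before each cutoff time and the second is the resampling interval; that reading is inferred from the `window_size` and `resample_rule` column names in the results table, not from the Draco source.

```python
import pandas as pd

window_size_rule = [('1d', '1h'), ('2d', '2h')]

steps_per_window = []
for window_size, resample_rule in window_size_rule:
    # Dividing two Timedelta objects yields a plain float ratio.
    steps = pd.Timedelta(window_size) / pd.Timedelta(resample_rule)
    steps_per_window.append(int(steps))
    print(f'{window_size} window at {resample_rule} resolution: {int(steps)} steps')
```

Under that assumption both configurations feed sequences of the same length to the model; they differ in how much history each step summarizes.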

In [3]:

from draco.benchmark import evaluate_templates

results = evaluate_templates(
    templates=templates,
    window_size_rule=window_size_rule,
    init_params=init_params,
    tuning_iterations=3,
    cv_splits=3,
)

INFO:draco.benchmark:Evaluating template lstm_prob_with_unstack on problem None (1d, 1h)
2023-04-07 14:33:33.017625: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2023-04-07 14:33:33.043631: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fc3e937a8e0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2023-04-07 14:33:33.043643: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
INFO:draco.pipeline:New configuration found:
  Template: lstm_prob_with_unstack 
    Hyperparameters: 
      ('sklearn.impute.SimpleImputer#1', 'strategy'): mean
      ('keras.Sequential.LSTMTimeSeriesClassifier#1', 'lstm_1_units'): 80
      ('keras.Sequential.LSTMTimeSeriesClassifier#1', 'dropout_1_rate'): 0.3
      ('keras.Sequential.LSTMTimeSeriesClassifier#1', 'dense_1_units'): 80
INFO:draco.benchmark:Evaluating template lstm_prob_with_unstack on problem None (2d, 2h)
INFO:draco.pipeline:New configuration found:
  Template: lstm_prob_with_unstack 
    Hyperparameters: 
      ('sklearn.impute.SimpleImputer#1', 'strategy'): mean
      ('keras.Sequential.LSTMTimeSeriesClassifier#1', 'lstm_1_units'): 80
      ('keras.Sequential.LSTMTimeSeriesClassifier#1', 'dropout_1_rate'): 0.3
      ('keras.Sequential.LSTMTimeSeriesClassifier#1', 'dense_1_units'): 80
INFO:draco.pipeline:New configuration found:
  Template: lstm_prob_with_unstack 
    Hyperparameters: 
      ('sklearn.impute.SimpleImputer#1', 'strategy'): median
      ('keras.Sequential.LSTMTimeSeriesClassifier#1', 'lstm_1_units'): 137
      ('keras.Sequential.LSTMTimeSeriesClassifier#1', 'dropout_1_rate'): 0.612475373625103
      ('keras.Sequential.LSTMTimeSeriesClassifier#1', 'dense_1_units'): 191
INFO:draco.benchmark:Evaluating template double_lstm_prob_with_unstack on problem None (1d, 1h)
INFO:draco.pipeline:New configuration found:
  Template: double_lstm_prob_with_unstack 
    Hyperparameters: 
      ('sklearn.impute.SimpleImputer#1', 'strategy'): mean
      ('keras.Sequential.DoubleLSTMTimeSeriesClassifier#1', 'lstm_1_units'): 80
      ('keras.Sequential.DoubleLSTMTimeSeriesClassifier#1', 'dropout_1_rate'): 0.3
      ('keras.Sequential.DoubleLSTMTimeSeriesClassifier#1', 'lstm_2_units'): 80
      ('keras.Sequential.DoubleLSTMTimeSeriesClassifier#1', 'dropout_2_rate'): 0.3
INFO:draco.pipeline:New configuration found:
  Template: double_lstm_prob_with_unstack 
    Hyperparameters: 
      ('sklearn.impute.SimpleImputer#1', 'strategy'): constant
      ('keras.Sequential.DoubleLSTMTimeSeriesClassifier#1', 'lstm_1_units'): 245
      ('keras.Sequential.DoubleLSTMTimeSeriesClassifier#1', 'dropout_1_rate'): 0.4308586778212253
      ('keras.Sequential.DoubleLSTMTimeSeriesClassifier#1', 'lstm_2_units'): 221
      ('keras.Sequential.DoubleLSTMTimeSeriesClassifier#1', 'dropout_2_rate'): 0.5926391753395145
INFO:draco.benchmark:Evaluating template double_lstm_prob_with_unstack on problem None (2d, 2h)
INFO:draco.pipeline:New configuration found:
  Template: double_lstm_prob_with_unstack 
    Hyperparameters: 
      ('sklearn.impute.SimpleImputer#1', 'strategy'): mean
      ('keras.Sequential.DoubleLSTMTimeSeriesClassifier#1', 'lstm_1_units'): 80
      ('keras.Sequential.DoubleLSTMTimeSeriesClassifier#1', 'dropout_1_rate'): 0.3
      ('keras.Sequential.DoubleLSTMTimeSeriesClassifier#1', 'lstm_2_units'): 80
      ('keras.Sequential.DoubleLSTMTimeSeriesClassifier#1', 'dropout_2_rate'): 0.3

In [4]:

results

Out[4]:

	problem_name	window_size	resample_rule	template	default_test	default_cv	tuned_cv	tuned_test	tuning_metric	tuning_metric_kwargs	fit_predict_time	default_cv_time	average_cv_time	total_time	status	accuracy_threshold/0.5	f1_threshold/0.5	fpr_threshold/0.5
0	None	1d	1h	lstm_prob_with_unstack	0.494505	0.589905	0.589905	0.322650	roc_auc_score	{'threshold': 0.5}	0 days 00:00:03.873157	0 days 00:00:14.369536	0 days 00:00:08.178422	0 days 00:00:47.144655	OK	0.280899	0.255814	1.0
1	None	2d	2h	lstm_prob_with_unstack	0.446581	0.543056	0.561570	0.707875	roc_auc_score	{'threshold': 0.5}	0 days 00:00:03.460467	0 days 00:00:12.121905	0 days 00:00:08.275919	0 days 00:00:44.449291	OK	0.730337	0.586207	1.0
2	None	1d	1h	double_lstm_prob_with_unstack	0.813187	0.307993	0.592696	0.417582	roc_auc_score	{'threshold': 0.5}	0 days 00:00:05.460985	0 days 00:00:18.103660	0 days 00:00:14.011877	0 days 00:01:11.192546	OK	0.303371	0.367347	1.0
3	None	2d	2h	double_lstm_prob_with_unstack	0.245726	0.663919	0.663919	0.293346	roc_auc_score	{'threshold': 0.5}	0 days 00:00:05.568835	0 days 00:00:17.948361	0 days 00:00:14.003816	0 days 00:01:11.051792	OK	0.303371	0.184211	1.0
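One way to read a report like this is to rank the rows by test score and to check how much tuning changed the cross-validation score. The sketch below re-enters a few columns of the table above by hand so that it runs on its own; in the notebook the same operations could be applied to `results` directly. Sorting in descending order assumes that a higher score is better, which holds for `roc_auc_score`.

```python
import pandas as pd

# A subset of the columns from the report above, copied by hand.
summary = pd.DataFrame({
    'template': [
        'lstm_prob_with_unstack',
        'lstm_prob_with_unstack',
        'double_lstm_prob_with_unstack',
        'double_lstm_prob_with_unstack',
    ],
    'window_size': ['1d', '2d', '1d', '2d'],
    'resample_rule': ['1h', '2h', '1h', '2h'],
    'default_cv': [0.589905, 0.543056, 0.307993, 0.663919],
    'tuned_cv': [0.589905, 0.561570, 0.592696, 0.663919],
    'tuned_test': [0.322650, 0.707875, 0.417582, 0.293346],
})

# How much the tuned pipeline improved over the default one in cross validation.
summary['cv_gain'] = summary['tuned_cv'] - summary['default_cv']

# Rank the configurations by score on the held-out test split.
ranked = summary.sort_values('tuned_test', ascending=False).reset_index(drop=True)
best = ranked.iloc[0]

print(ranked[['template', 'window_size', 'tuned_cv', 'tuned_test', 'cv_gain']])
```

A `cv_gain` of zero means the tuner did not find anything better than the default hyperparameters within the three tuning iterations used here.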