Optimization Strategies for Deep Learning with Hyperactive

In this tutorial you will learn how to automate the selection of the best deep learning model for a given dataset. The structure and hyperparameters of a neural network have a big impact on the performance of your predictive model. Unfortunately, searching for good model parameters by hand is often tedious. One solution is to automate the search. Hyperactive is a Python package designed to solve this problem. It has some unique properties that let you automatically explore new models or improve existing ones. Some of those are:

- Very easy to use: only a few lines of new code to learn
- Flexible search space that can contain Python objects
- Sequential model-based optimization techniques
- Hyperactive-memory "remembers" parameters to save time
- Results are processed for easy further use

You can learn more about Hyperactive on GitHub.


In [1]:
import numpy as np
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
from scipy import stats

from tensorflow.keras.datasets import mnist
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Dense, Flatten
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import SGD
from keras.utils import np_utils
from tensorflow import keras

import tensorflow as tf

config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
config.log_device_placement = True

sess = tf.compat.v1.Session(config=config)

verbose = 0

from hyperactive import Hyperactive
from hyperactive import BayesianOptimizer, EvolutionStrategyOptimizer

color_scale = px.colors.sequential.Jet
Device mapping:
/job:localhost/replica:0/task:0/device:XLA_CPU:0 -> device: XLA_CPU device
/job:localhost/replica:0/task:0/device:XLA_GPU:0 -> device: XLA_GPU device

The following helper functions are used later. You can skip them and move on to the introduction to Hyperactive.

In [2]:
# silence library warnings by replacing warnings.warn with a no-op
def warn(*args, **kwargs):
    pass

import warnings
warnings.warn = warn

def func2str(row):
    return row.__name__

def add_categorical_score(search_data):
    score = search_data["score"]
    score_min = np.amin(score)
    score_max = np.amax(score)
    score_diff = score_max - score_min
    score_best_ = score_max - score_diff*0.25
    score_worst_ = score_min + score_diff*0.25

    def categorize_column(row):
        if row['score'] > score_best_:
            score_cat = "best 25%"
        elif row['score'] < score_worst_:
            score_cat = "worst 25%"
        else:
            score_cat = "average"
        return score_cat

    search_data["score_categorical"] = search_data.apply(categorize_column, axis=1)
    return search_data
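
Note that the helper above splits the score range into quarters; it does not split the rows into quartiles. A minimal vectorized sketch of the same thresholds follows. The function name `categorize_scores` and the sample scores are made up for illustration and are not part of the tutorial's code.

```python
import numpy as np
import pandas as pd


def categorize_scores(score):
    """Label scores by their position in the score range (hypothetical helper)."""
    score = np.asarray(score, dtype=float)
    score_min, score_max = score.min(), score.max()
    score_diff = score_max - score_min
    best_threshold = score_max - score_diff * 0.25
    worst_threshold = score_min + score_diff * 0.25
    # np.select picks the first matching condition, otherwise the default
    return np.select(
        [score > best_threshold, score < worst_threshold],
        ["best 25%", "worst 25%"],
        default="average",
    )


# made-up scores, only to show the labelling
example = pd.DataFrame({"score": [-100.0, -90.0, -50.0, -10.0, -1.0, 0.0]})
example["score_categorical"] = categorize_scores(example["score"])
print(example)
```

With these sample scores the thresholds are -25 and -75, so three of the six rows fall into "best 25%" even though that is half of the rows.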

If you are already familiar with Hyperactive, you can skip this section and start with "First Deep Learning Optimization run".

An introduction to Hyperactive

This is where the interesting stuff begins. The following code is a very simple example that shows what an optimization run looks like:

In [3]:
# the objective function defines the "model" we want to optimize
def inverted_quadratic_function(parameters):
    # access the parameters from the search space
    score = (parameters["x"]*parameters["x"])
    # Hyperactive always maximizes the value returned by the objective function
    return -score

# the search space defines where to search for the optimal parameters
search_space = {
    "x" : list(np.arange(-10, 10, 0.01))
}

# Hyperactive will run the objective function n_iter times and search for the best parameters
hyper = Hyperactive()
hyper.add_search(inverted_quadratic_function, search_space, n_iter=100)
hyper.run()
Results: 'inverted_quadratic_function'  
   Best score: -0.02250000000006299  
   Best parameter:
      'x' : -0.15000000000020997  
   Evaluation time   : 0.0027713775634765625 sec    [43.42 %]
   Optimization time : 0.0036106109619140625 sec    [56.58 %]
   Iteration time    : 0.006381988525390625 sec    [15669.1 iter/sec]
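
To make the idea of an optimization run more concrete, here is a minimal sketch of a plain random search over the same kind of search space. It is an illustrative stand-in written in pure Python, not Hyperactive's implementation, and the name `random_search` is an assumption made for this example.

```python
import random


def inverted_quadratic_function(parameters):
    # same objective as above: the maximum is at x = 0
    return -(parameters["x"] * parameters["x"])


def random_search(objective_function, search_space, n_iter, seed=0):
    """Evaluate n_iter random parameter sets and keep the best one.

    Illustrative stand-in, not Hyperactive's implementation.
    """
    rng = random.Random(seed)
    best_score, best_parameters = float("-inf"), None
    for _ in range(n_iter):
        # draw one value per dimension of the search space
        parameters = {
            name: rng.choice(values) for name, values in search_space.items()
        }
        score = objective_function(parameters)
        if score > best_score:
            best_score, best_parameters = score, parameters
    return best_score, best_parameters


# x ranges over [-10, 10) in steps of 0.01, built from integers to avoid float drift
search_space = {"x": [i / 100 for i in range(-1000, 1000)]}
best_score, best_parameters = random_search(
    inverted_quadratic_function, search_space, n_iter=100
)
print(best_score, best_parameters)
```

The loop mirrors the three ingredients of the example: an objective function that returns a score to maximize, a search space of candidate values, and a fixed number of iterations.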

After an optimization run, Hyperactive can return the collected search data. The "results" method returns it as a pandas dataframe in which each row contains the parameter set, the score and the timing information of one iteration. You can use this dataframe to plot the search data or to explore it yourself.

In [4]:
# access the results as a pandas dataframe
search_data = hyper.results(inverted_quadratic_function)
search_data
         x  eval_time  iter_time     score
0     3.59   0.000202   0.000647  -12.8881
1     6.61   0.000040   0.000056  -43.6921
2    -6.01   0.000128   0.000140  -36.1201
3    -2.02   0.000020   0.000311   -4.0804
4     1.97   0.000041   0.000461   -3.8809
...    ...        ...        ...       ...
95    1.30   0.000017   0.000028   -1.6900
96   -2.93   0.000017   0.000028   -8.5849
97    1.06   0.000017   0.000028   -1.1236
98    1.35   0.000017   0.000029   -1.8225
99   -5.01   0.000017   0.000028  -25.1001

100 rows × 4 columns
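
As a small sketch of what such data exploration could look like, the following block rebuilds the first five rows printed above as a stand-in dataframe and applies a few standard pandas operations to it. Only the column names are taken from the output; everything else is plain pandas and does not depend on Hyperactive.

```python
import pandas as pd

# stand-in for the search data, built from the first five rows printed above
search_data = pd.DataFrame(
    {
        "x": [3.59, 6.61, -6.01, -2.02, 1.97],
        "eval_time": [0.000202, 0.000040, 0.000128, 0.000020, 0.000041],
        "iter_time": [0.000647, 0.000056, 0.000140, 0.000311, 0.000461],
        "score": [-12.8881, -43.6921, -36.1201, -4.0804, -3.8809],
    }
)

# row with the highest score
best_row = search_data.loc[search_data["score"].idxmax()]

# all rows sorted from best to worst
ranked = search_data.sort_values("score", ascending=False)

# best score found up to and including each iteration
search_data["best_so_far"] = search_data["score"].cummax()

print(best_row)
print(ranked.head(3))
```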

The example already shows a lot of the features of Hyperactive:

- objective functions can be anything you want:
    - a mathematical function
    - a machine-/deep-learning model (sklearn, keras, pytorch, ...)
    - a simulation environment
    - you could even access code from another language
- the search space dictionary can have as many dimensions as you want
- via "add_search" you can run multiple different optimizations in parallel
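
The first two points can be sketched with a two-dimensional search space in which one dimension holds Python functions. The objective `damped_wave` and both dimensions are made up for illustration, and a plain dictionary stands in for the parameters that an optimizer would pass to the objective function.

```python
import math


def damped_wave(parameters):
    # parameters is indexed by the names used in the search space
    x = parameters["x"]
    wave = parameters["wave_function"]  # a Python function from the search space
    return wave(x) - 0.1 * x * x


search_space = {
    "x": [i / 10 for i in range(-50, 51)],
    "wave_function": [math.sin, math.cos],
}

# number of distinct parameter sets in this search space
n_positions = math.prod(len(values) for values in search_space.values())

# evaluate the objective by hand for one parameter set
score = damped_wave({"x": 0.0, "wave_function": math.cos})
print(n_positions, score)
```

Each additional key in the dictionary adds one dimension, so the number of possible parameter sets grows multiplicatively with the length of each list.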
In [5]:
fig = px.scatter(search_data, x="x", y="score")