#!/usr/bin/env python
# coding: utf-8
# # Running Tune experiments with Optuna
#
# In this tutorial we introduce Optuna, while running a simple Ray Tune experiment. Tune's Search Algorithms integrate with Optuna and, as a result, allow you to seamlessly scale up an Optuna optimization process without sacrificing performance.
#
# Similar to Ray Tune, Optuna is an automatic hyperparameter optimization software framework, particularly designed for machine learning. It features an imperative ("how" over "what" emphasis), define-by-run style user API. With Optuna, a user has the ability to dynamically construct the search spaces for the hyperparameters. Optuna falls in the domain of "derivative-free optimization" and "black-box optimization".
#
# In this example we minimize a simple objective to briefly demonstrate how to use Optuna with Ray Tune via `OptunaSearch`, including examples of conditional search spaces (stringing together relationships between hyperparameters) and multi-objective optimization (measuring trade-offs among all important metrics). It's useful to keep in mind that despite the emphasis on machine learning experiments, Ray Tune optimizes any implicit or explicit objective. Here we assume the `optuna>=3.0.0` library is installed. To learn more, please refer to the [Optuna website](https://optuna.org/).
#
# Please note that sophisticated schedulers, such as `AsyncHyperBandScheduler`, may not work correctly with multi-objective optimization, since they typically expect a scalar score to compare fitness among trials.
#
# ## Prerequisites
# In[1]:
# !pip install "ray[tune]"
get_ipython().system('pip install -q "optuna>=3.0.0"')
# Next, import the necessary libraries:
# In[2]:
import time
from typing import Dict, Optional, Any
import ray
from ray import tune
from ray.tune.search import ConcurrencyLimiter
from ray.tune.search.optuna import OptunaSearch
# In[3]:
ray.init(configure_logging=False) # initialize Ray
# Let's start by defining a simple evaluation function.
# An explicit math formula is queried here for demonstration, yet in practice this is typically a black-box function, e.g. the performance results after training an ML model.
# We artificially sleep for a bit (`0.1` seconds) to simulate a long-running ML experiment.
# This setup assumes that we're running multiple `step`s of an experiment while tuning three hyperparameters,
# namely `width`, `height`, and `activation`.
# In[4]:
def evaluate(step, width, height, activation):
    time.sleep(0.1)
    activation_boost = 10 if activation == "relu" else 0
    return (0.1 + width * step / 100) ** (-1) + height * 0.1 + activation_boost
# Next, our `objective` function to be optimized takes a Tune `config`, evaluates the `score` of your experiment in a training loop,
# and uses `tune.report` to report the `score` back to Tune.
# In[5]:
def objective(config):
    for step in range(config["steps"]):
        score = evaluate(step, config["width"], config["height"], config["activation"])
        tune.report({"iterations": step, "mean_loss": score})
# Next we define a search space. The critical assumption is that the optimal hyperparameters live within this space. Yet, if the space is very large, then those hyperparameters may be difficult to find in a short amount of time.
#
# The simplest case is a search space with independent dimensions. In this case, a config dictionary will suffice.
# In[6]:
search_space = {
    "steps": 100,
    "width": tune.uniform(0, 20),
    "height": tune.uniform(-100, 100),
    "activation": tune.choice(["relu", "tanh"]),
}
# Here we define the Optuna search algorithm:
# In[7]:
algo = OptunaSearch()
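# As an optional aside (a minimal sketch, not used in the rest of this tutorial):
# `OptunaSearch` also accepts a custom Optuna sampler, e.g. a seeded TPE sampler
# if you want reproducible suggestions.
import optuna

seeded_algo = OptunaSearch(sampler=optuna.samplers.TPESampler(seed=42))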
# We also constrain the number of concurrent trials to `4` with a `ConcurrencyLimiter`.
# In[8]:
algo = ConcurrencyLimiter(algo, max_concurrent=4)
# The number of samples is the number of hyperparameter combinations that will be tried out. This Tune run is set to `1000` samples
# (you can decrease this if it takes too long on your machine).
# In[9]:
num_samples = 1000
# In[10]:
# We override here for our smoke tests.
num_samples = 10
# Finally, we run the experiment to `"min"`imize the "mean_loss" of the `objective` by searching `search_space` via `algo`, `num_samples` times. The previous sentence fully characterizes the search problem we aim to solve. With this in mind, notice how efficient it is to execute `tuner.fit()`.
# In[11]:
tuner = tune.Tuner(
    objective,
    tune_config=tune.TuneConfig(
        metric="mean_loss",
        mode="min",
        search_alg=algo,
        num_samples=num_samples,
    ),
    param_space=search_space,
)
results = tuner.fit()
# And now we have the hyperparameters found to minimize the mean loss.
# In[12]:
print("Best hyperparameters found were: ", results.get_best_result().config)
# ## Providing an initial set of hyperparameters
#
# While defining the search algorithm, we may choose to provide an initial set of hyperparameters that we believe are especially promising or informative, and
# pass this information as a helpful starting point for the `OptunaSearch` object.
# In[13]:
initial_params = [
    {"width": 1, "height": 2, "activation": "relu"},
    {"width": 4, "height": 2, "activation": "relu"},
]
# Now the `search_alg` built using `OptunaSearch` takes `points_to_evaluate`.
# In[14]:
searcher = OptunaSearch(points_to_evaluate=initial_params)
algo = ConcurrencyLimiter(searcher, max_concurrent=4)
# And run the experiment with initial hyperparameter evaluations:
# In[15]:
tuner = tune.Tuner(
    objective,
    tune_config=tune.TuneConfig(
        metric="mean_loss",
        mode="min",
        search_alg=algo,
        num_samples=num_samples,
    ),
    param_space=search_space,
)
results = tuner.fit()
# We take another look at the optimal hyperparameters.
# In[16]:
print("Best hyperparameters found were: ", results.get_best_result().config)
# ## Conditional search spaces
#
# Sometimes we may want to build a more complicated search space that has conditional dependencies on other hyperparameters. In this case, we pass a define-by-run function as the `space` argument to `OptunaSearch`.
# In[17]:
def define_by_run_func(trial) -> Optional[Dict[str, Any]]:
    """Define-by-run function to construct a conditional search space.

    Ensure no actual computation takes place here. That should go into
    the trainable passed to ``Tuner()`` (in this example, that's
    ``objective``).

    For more information, see
    https://optuna.readthedocs.io/en/stable/tutorial/10_key_features/002_configurations.html

    Args:
        trial: Optuna Trial object

    Returns:
        Dict containing constant parameters or None
    """
    activation = trial.suggest_categorical("activation", ["relu", "tanh"])
    # Define-by-run allows for conditional search spaces.
    if activation == "relu":
        trial.suggest_float("width", 0, 20)
        trial.suggest_float("height", -100, 100)
    else:
        trial.suggest_float("width", -1, 21)
        trial.suggest_float("height", -101, 101)
    # Return all constants in a dictionary.
    return {"steps": 100}
# As before, we create the `search_alg` from `OptunaSearch` and `ConcurrencyLimiter`; this time we define the scope of the search via the `space` argument and provide no initial points. We must also specify the metric and mode when using `space`.
# In[18]:
searcher = OptunaSearch(space=define_by_run_func, metric="mean_loss", mode="min")
algo = ConcurrencyLimiter(searcher, max_concurrent=4)
# Running the experiment with a define-by-run search space:
# In[19]:
tuner = tune.Tuner(
    objective,
    tune_config=tune.TuneConfig(
        search_alg=algo,
        num_samples=num_samples,
    ),
)
results = tuner.fit()
# We take a look again at the optimal hyperparameters.
# In[20]:
print("Best hyperparameters for loss found were: ", results.get_best_result("mean_loss", "min").config)
# ## Multi-objective optimization
#
# Finally, let's take a look at the multi-objective case. This permits us to optimize multiple metrics at once, and organize our results based on the different objectives.
# In[21]:
def multi_objective(config):
    # Hyperparameters
    width, height = config["width"], config["height"]
    for step in range(config["steps"]):
        # Iterative training function - can be any arbitrary training procedure
        intermediate_score = evaluate(step, width, height, config["activation"])
        # Feed the scores back to Tune.
        tune.report({
            "iterations": step, "loss": intermediate_score, "gain": intermediate_score * width
        })
# We define the `OptunaSearch` object this time with metric and mode as list arguments.
# In[22]:
searcher = OptunaSearch(metric=["loss", "gain"], mode=["min", "max"])
algo = ConcurrencyLimiter(searcher, max_concurrent=4)
tuner = tune.Tuner(
    multi_objective,
    tune_config=tune.TuneConfig(
        search_alg=algo,
        num_samples=num_samples,
    ),
    param_space=search_space,
)
results = tuner.fit()
# Now there are two hyperparameter sets for the two objectives.
# In[23]:
print("Best hyperparameters for loss found were: ", results.get_best_result("loss", "min").config)
print("Best hyperparameters for gain found were: ", results.get_best_result("gain", "max").config)
# We can mix and match the use of initial hyperparameter evaluations, conditional search spaces via define-by-run functions, and multi-objective tasks. This is also true of scheduler usage, with the exception of multi-objective optimization: schedulers typically rely on a single scalar score rather than the two scores we use here (loss and gain). A sketch of combining a scheduler with `OptunaSearch` follows.
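# As a minimal, optional sketch (single-objective only, and not run as part of this
# tutorial), a trial scheduler such as `AsyncHyperBandScheduler` could be combined
# with `OptunaSearch` like this:
from ray.tune.schedulers import AsyncHyperBandScheduler

scheduled_tuner = tune.Tuner(
    objective,
    tune_config=tune.TuneConfig(
        metric="mean_loss",
        mode="min",
        search_alg=ConcurrencyLimiter(OptunaSearch(), max_concurrent=4),
        scheduler=AsyncHyperBandScheduler(),  # early-stops underperforming trials
        num_samples=num_samples,
    ),
    param_space=search_space,
)
# scheduled_tuner.fit()  # uncomment to actually launch the scheduled run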
# In[24]:
ray.shutdown()