(tune-comet-ref)=
Comet is a tool to manage and optimize the entire ML lifecycle, from experiment tracking, model optimization and dataset versioning to model production monitoring.
{image}
:align: center
:alt: Comet
:height: 120px
:target: https://www.comet.ml/site/
{contents}
:backlinks: none
:local: true
To illustrate logging your trial results to Comet, we'll define a simple training function that simulates a loss metric:
import numpy as np
from ray import train, tune


def train_function(config):
    for i in range(30):
        loss = config["mean"] + config["sd"] * np.random.randn()
        train.report({"loss": loss})
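To get a feel for the values this function reports, the loss draw can be reproduced outside of Tune. The sketch below is a standalone approximation: it uses Python's standard `random.gauss` in place of `np.random.randn`, and `simulate_losses` is a hypothetical helper that is not part of Ray.

```python
import random
import statistics


def simulate_losses(config, num_iterations=30, seed=0):
    """Draw the same kind of noisy loss the training function reports.

    Hypothetical helper for illustration only; it does not call Ray.
    """
    rng = random.Random(seed)
    # loss = mean + sd * z, where z is a standard normal draw
    return [
        config["mean"] + config["sd"] * rng.gauss(0.0, 1.0)
        for _ in range(num_iterations)
    ]


losses = simulate_losses({"mean": 2.0, "sd": 0.5})
print(len(losses))
print(round(statistics.mean(losses), 2))
```

The reported values scatter around `mean` with a spread controlled by `sd`, which is why trials with a smaller `mean` tend to end with a lower loss.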
Now, given that you provide your Comet API key and your project name like so:
api_key = "YOUR_COMET_API_KEY"
project_name = "YOUR_COMET_PROJECT_NAME"
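Hardcoding credentials in a notebook makes them easy to leak. One alternative is to read them from environment variables. The sketch below assumes the variable names `COMET_API_KEY` and `COMET_PROJECT_NAME`; both names and the `load_comet_credentials` helper are assumptions of this example, not part of the original notebook.

```python
import os


def load_comet_credentials(env=None):
    """Return (api_key, project_name) read from environment variables.

    Hypothetical helper; the variable names are assumptions of this sketch.
    """
    env = os.environ if env is None else env
    api_key = env.get("COMET_API_KEY")
    if not api_key:
        raise RuntimeError("Set COMET_API_KEY before running the example.")
    # Fall back to the placeholder project name if none is configured.
    project_name = env.get("COMET_PROJECT_NAME", "YOUR_COMET_PROJECT_NAME")
    return api_key, project_name


# Example with an explicit mapping instead of the real environment.
api_key, project_name = load_comet_credentials(
    {
        "COMET_API_KEY": "YOUR_COMET_API_KEY",
        "COMET_PROJECT_NAME": "YOUR_COMET_PROJECT_NAME",
    }
)
```

Failing early on a missing key gives a clearer error than letting the run start and discovering later that nothing was logged.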
# This cell is hidden from the rendered notebook. It sets a mock logger
# process class on the callback and uses placeholder credentials for testing.
from unittest.mock import MagicMock
from ray.air.integrations.comet import CometLoggerCallback
CometLoggerCallback._logger_process_cls = MagicMock
api_key = "abc"
project_name = "test"
You can add a Comet logger by specifying the callbacks argument in your RunConfig() accordingly:
from ray.air.integrations.comet import CometLoggerCallback

tuner = tune.Tuner(
    train_function,
    tune_config=tune.TuneConfig(
        metric="loss",
        mode="min",
    ),
    run_config=train.RunConfig(
        callbacks=[
            CometLoggerCallback(
                api_key=api_key, project_name=project_name, tags=["comet_example"]
            )
        ],
    ),
    param_space={"mean": tune.grid_search([1, 2, 3]), "sd": tune.uniform(0.2, 0.8)},
)
results = tuner.fit()

print(results.get_best_result().config)
2022-07-22 15:41:21,477 INFO services.py:1483 -- View the Ray dashboard at http://127.0.0.1:8267
/Users/kai/coding/ray/python/ray/tune/trainable/function_trainable.py:643: DeprecationWarning: `checkpoint_dir` in `func(config, checkpoint_dir)` is being deprecated. To save and load checkpoint in trainable functions, please use the `ray.air.session` API:
from ray.air import session

def train(config):
    # ...
    session.report({"metric": metric}, checkpoint=checkpoint)
For more information please see https://docs.ray.io/en/master/ray-air/key-concepts.html#session
DeprecationWarning,
Trial name | status | loc | mean | sd | iter | total time (s) | loss |
---|---|---|---|---|---|---|---|
train_function_5bf98_00000 | TERMINATED | 127.0.0.1:48140 | 1 | 0.405758 | 30 | 2.11758 | 1.02341 |
train_function_5bf98_00001 | TERMINATED | 127.0.0.1:48147 | 2 | 0.647335 | 30 | 0.0770731 | 1.53993 |
train_function_5bf98_00002 | TERMINATED | 127.0.0.1:48151 | 3 | 0.256568 | 30 | 0.0728431 | 3.0393 |
2022-07-22 15:41:24,693 INFO plugin_schema_manager.py:52 -- Loading the default runtime env schemas: ['/Users/kai/coding/ray/python/ray/_private/runtime_env/../../runtime_env/schemas/working_dir_schema.json', '/Users/kai/coding/ray/python/ray/_private/runtime_env/../../runtime_env/schemas/pip_schema.json'].
COMET WARNING: As you are running in a Jupyter environment, you will need to call `experiment.end()` when finished to ensure all metrics and code are logged before exiting.
COMET ERROR: The given API key abc is invalid, please check it against the dashboard. Your experiment would not be logged
For more details, please refer to: https://www.comet.ml/docs/python-sdk/warnings-errors/
COMET WARNING: As you are running in a Jupyter environment, you will need to call `experiment.end()` when finished to ensure all metrics and code are logged before exiting.
COMET ERROR: The given API key abc is invalid, please check it against the dashboard. Your experiment would not be logged
For more details, please refer to: https://www.comet.ml/docs/python-sdk/warnings-errors/
COMET WARNING: As you are running in a Jupyter environment, you will need to call `experiment.end()` when finished to ensure all metrics and code are logged before exiting.
COMET ERROR: The given API key abc is invalid, please check it against the dashboard. Your experiment would not be logged
For more details, please refer to: https://www.comet.ml/docs/python-sdk/warnings-errors/
Result for train_function_5bf98_00000:
  date: 2022-07-22_15-41-27
  done: false
  experiment_id: c94e6cdedd4540e4b40e4a34fbbeb850
  hostname: Kais-MacBook-Pro.local
  iterations_since_restore: 1
  loss: 1.1009860426725162
  node_ip: 127.0.0.1
  pid: 48140
  time_since_restore: 0.000125885009765625
  time_this_iter_s: 0.000125885009765625
  time_total_s: 0.000125885009765625
  timestamp: 1658500887
  timesteps_since_restore: 0
  training_iteration: 1
  trial_id: 5bf98_00000
  warmup_time: 0.0029532909393310547

Result for train_function_5bf98_00000:
  date: 2022-07-22_15-41-29
  done: true
  experiment_id: c94e6cdedd4540e4b40e4a34fbbeb850
  experiment_tag: 0_mean=1,sd=0.4058
  hostname: Kais-MacBook-Pro.local
  iterations_since_restore: 30
  loss: 1.0234101880766688
  node_ip: 127.0.0.1
  pid: 48140
  time_since_restore: 2.1175789833068848
  time_this_iter_s: 0.0022211074829101562
  time_total_s: 2.1175789833068848
  timestamp: 1658500889
  timesteps_since_restore: 0
  training_iteration: 30
  trial_id: 5bf98_00000
  warmup_time: 0.0029532909393310547

Result for train_function_5bf98_00001:
  date: 2022-07-22_15-41-30
  done: false
  experiment_id: ba865bc613d94413a37fe027123ba031
  hostname: Kais-MacBook-Pro.local
  iterations_since_restore: 1
  loss: 2.3754716847171182
  node_ip: 127.0.0.1
  pid: 48147
  time_since_restore: 0.0001590251922607422
  time_this_iter_s: 0.0001590251922607422
  time_total_s: 0.0001590251922607422
  timestamp: 1658500890
  timesteps_since_restore: 0
  training_iteration: 1
  trial_id: 5bf98_00001
  warmup_time: 0.0036537647247314453

Result for train_function_5bf98_00001:
  date: 2022-07-22_15-41-30
  done: true
  experiment_id: ba865bc613d94413a37fe027123ba031
  experiment_tag: 1_mean=2,sd=0.6473
  hostname: Kais-MacBook-Pro.local
  iterations_since_restore: 30
  loss: 1.5399275480220707
  node_ip: 127.0.0.1
  pid: 48147
  time_since_restore: 0.0770730972290039
  time_this_iter_s: 0.002664804458618164
  time_total_s: 0.0770730972290039
  timestamp: 1658500890
  timesteps_since_restore: 0
  training_iteration: 30
  trial_id: 5bf98_00001
  warmup_time: 0.0036537647247314453

Result for train_function_5bf98_00002:
  date: 2022-07-22_15-41-31
  done: false
  experiment_id: 2efb6f3c4d954bcab1ea4083f138008e
  hostname: Kais-MacBook-Pro.local
  iterations_since_restore: 1
  loss: 3.204653294422825
  node_ip: 127.0.0.1
  pid: 48151
  time_since_restore: 0.00014400482177734375
  time_this_iter_s: 0.00014400482177734375
  time_total_s: 0.00014400482177734375
  timestamp: 1658500891
  timesteps_since_restore: 0
  training_iteration: 1
  trial_id: 5bf98_00002
  warmup_time: 0.0030150413513183594

Result for train_function_5bf98_00002:
  date: 2022-07-22_15-41-31
  done: true
  experiment_id: 2efb6f3c4d954bcab1ea4083f138008e
  experiment_tag: 2_mean=3,sd=0.2566
  hostname: Kais-MacBook-Pro.local
  iterations_since_restore: 30
  loss: 3.0393011150182865
  node_ip: 127.0.0.1
  pid: 48151
  time_since_restore: 0.07284307479858398
  time_this_iter_s: 0.0020139217376708984
  time_total_s: 0.07284307479858398
  timestamp: 1658500891
  timesteps_since_restore: 0
  training_iteration: 30
  trial_id: 5bf98_00002
  warmup_time: 0.0030150413513183594
2022-07-22 15:41:31,290 INFO tune.py:738 -- Total run time: 7.36 seconds (6.72 seconds for the tuning loop).
{'mean': 1, 'sd': 0.40575843135279466}
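The printed config belongs to the trial with the lowest final loss, since the tuner was configured with `metric="loss"` and `mode="min"`. The sketch below reproduces that selection on the final losses from the results table; `pick_best` is a hypothetical stand-in, and it assumes trials are compared by their last reported value.

```python
# Final reported losses per trial, copied from the results table above.
final_results = [
    {"trial": "train_function_5bf98_00000", "mean": 1, "loss": 1.02341},
    {"trial": "train_function_5bf98_00001", "mean": 2, "loss": 1.53993},
    {"trial": "train_function_5bf98_00002", "mean": 3, "loss": 3.0393},
]


def pick_best(results, metric="loss", mode="min"):
    """Hypothetical stand-in: select a trial by its last reported metric."""
    if mode not in ("min", "max"):
        raise ValueError("mode must be 'min' or 'max'")
    chooser = min if mode == "min" else max
    return chooser(results, key=lambda result: result[metric])


best = pick_best(final_results)
print(best["trial"], best["mean"])  # train_function_5bf98_00000 1
```

Because the loss is random noise around `mean`, the trial with `mean=1` wins here, but a single noisy final value is not a robust basis for comparing real models.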
Ray Tune offers an integration with Comet through the CometLoggerCallback, which automatically logs metrics and parameters reported to Tune to the Comet UI.
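Conceptually, a logger callback receives each trial's parameters and every result it reports, then forwards them to an external service. The sketch below illustrates that flow with a hypothetical in-memory `RecordingLogger`. It is not the CometLoggerCallback implementation, it does not use the Ray or Comet APIs, and the method names `on_start` and `on_result` are invented for this example.

```python
from collections import defaultdict


class RecordingLogger:
    """Hypothetical in-memory stand-in for a logger callback."""

    def __init__(self, tags=None):
        self.tags = list(tags or [])
        self.params = {}                  # trial_id -> hyperparameters
        self.metrics = defaultdict(list)  # trial_id -> reported results

    def on_start(self, trial_id, config):
        # Record the hyperparameters once, when the trial starts.
        self.params[trial_id] = dict(config)

    def on_result(self, trial_id, result):
        # Append every reported result so the full history is kept.
        self.metrics[trial_id].append(dict(result))


logger = RecordingLogger(tags=["comet_example"])
logger.on_start("trial_0", {"mean": 1, "sd": 0.4})
for step in range(3):
    logger.on_result("trial_0", {"training_iteration": step + 1, "loss": 1.0 + 0.1 * step})

print(logger.params["trial_0"])
print(len(logger.metrics["trial_0"]))
```

A real integration would send these records to Comet instead of keeping them in memory.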
Click on the following dropdown to see this callback API in detail:
{eval-rst}
.. autoclass:: ray.air.integrations.comet.CometLoggerCallback
    :noindex: