This notebook presents the most basic use of Grid2Op


This notebook will cover some basic raw functionalities at first. It will then show how these raw functionalities are encapsulated in easy-to-use functions.

As we will see later, the recommended way to use these functionalities is to go through the Runner, rather than instantiating the different classes one after the other.

If you use Google Colab, remove the leading # character in the cell below and execute it.

The cell will then look like:

!pip install grid2op[optional]  # for use with google colab (grid2Op is not installed by default)

In [ ]:
# !pip install grid2op[optional]  # for use with google colab (grid2Op is not installed by default)
In [ ]:
import os
import sys
import grid2op
In [ ]:
res = None
try:
    # optional dependency, only used to add a table of contents to the notebook
    from jyquickhelper import add_notebook_menu
    res = add_notebook_menu()
except ModuleNotFoundError:
    print("Impossible to automatically add a menu / table of content to this notebook.\nYou can download \"jyquickhelper\" package with: \n\"pip install jyquickhelper\"")

I) Creating an Environment

I.A) Default settings

We provide a function that will handle the creation of the Environment with default values in a single call of the function.

In this example we will use the rte_case14_redisp environment, in test mode.

To define/create it, we can call:

In [ ]:
env = grid2op.make("rte_case14_redisp", test=True)

NB By setting "test=True" in the above call, we only use the data for 2 different months for our environment. If you remove it, grid2op.make will attempt to download more data. By default, the data corresponding to this environment will be downloaded to your "home" directory, which corresponds to the location returned by this script:

print(f"grid2op dataset will be downloaded in \"{grid2op.get_current_local_dir()}\"")

Other environments can be used and are available through the "make" command. To get a list of the available environments, you can do:

In [ ]:
grid2op.list_available_remote_env()  # this only works if you have an internet connection

It is also possible to list the environments that you have already downloaded (if any).

NB: Downloading is automatic and happens the first time you call make with an environment that has not already been downloaded locally.

In [ ]:
grid2op.list_available_local_env()  # lists the environments already downloaded locally

For more customization of where the environments are located (e.g. changing the directory returned by grid2op.get_current_local_dir()), the documentation provides extra information.

I.B) Custom settings

Using the make function, you can pass additional arguments to customize the environment (this is useful for training):

  • param: The parameters used for the Environment. See grid2op.Parameters.Parameters.
  • backend : The backend to use for the computation. If provided, it must be an instance of the class grid2op.Backend.Backend.
  • action_class: The type of BaseAction that the BaseAgent will be able to perform. If provided, it must be a subclass of grid2op.Action.BaseAction.
  • observation_class: The type of BaseObservation that the BaseAgent will receive. If provided, it must be a subclass of grid2op.Observation.BaseObservation.
  • reward_class: The type of reward signal that the BaseAgent will receive. If provided, it must be a subclass of grid2op.Reward.BaseReward.
  • gamerules_class: The type of "Rules" that the BaseAgent will need to comply with. Rules are here to model some operational constraints. If provided, it must be a subclass of grid2op.Rules.BaseRules.
  • data_feeding_kwargs: A dictionary that is used to build the data_feeding (chronics) objects.
  • chronics_class: The type of chronics that represents the dynamics of the created Environment. Usually they come from different folders.
  • data_feeding: The type of chronics handler you want to use.
  • voltagecontroler_class: The type of grid2op.VoltageControler.VoltageControler to use.
  • chronics_path: The path where to look for the chronics dataset (optional).
  • grid_path: The path where the powergrid is located. If provided, it must be a string and point to a valid file on the hard drive.

For example, to set the number of maximum allowed substation changes per step:

In [ ]:
from grid2op.Parameters import Parameters

custom_params = Parameters()
custom_params.MAX_SUB_CHANGED = 1
env = grid2op.make("rte_case14_redisp", param=custom_params, test=True)

NB The make function is highly customizable. For example, you can change the reward that you are using:

from grid2op.Reward import L2RPNReward
env = grid2op.make(reward_class=L2RPNReward)

We also give the possibility to assess different rewards. This can be done with the following code:

from grid2op.Reward import L2RPNReward, FlatReward
env = grid2op.make(reward_class=L2RPNReward,
                   other_rewards={"other_reward" : FlatReward })

The results for these rewards can be accessed with the info object returned by env.step (info is the 4th object returned by env.step, as you can see below). See the official reward documentation here for more information.
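
As a rough illustration of the pattern described above, the sketch below mimics an environment that computes one main reward plus named auxiliary rewards and reports the auxiliary ones through info. ToyEnv, flat_reward and steps_left_reward are hypothetical stand-ins written for this example (they are not part of grid2op), and the "rewards" key inside info is an assumption about the layout; the reward documentation remains the reference for the exact structure.

```python
# Hypothetical stand-in, NOT the grid2op implementation: it only mimics the
# idea of one main reward plus extra rewards reported through `info`.

def flat_reward(step_index):
    """Toy reward: one point per completed step."""
    return 1.0

def steps_left_reward(step_index, horizon=5):
    """Toy reward: number of steps remaining before the horizon."""
    return float(horizon - step_index)

class ToyEnv:
    def __init__(self, reward_fn, other_rewards=None, horizon=5):
        self.reward_fn = reward_fn
        self.other_rewards = other_rewards or {}
        self.horizon = horizon
        self.step_index = 0

    def step(self, action):
        self.step_index += 1
        reward = self.reward_fn(self.step_index)
        # assumption: auxiliary rewards are gathered under info["rewards"]
        info = {"rewards": {name: fn(self.step_index)
                            for name, fn in self.other_rewards.items()}}
        done = self.step_index >= self.horizon
        return None, reward, done, info  # (obs, reward, done, info)

toy_env = ToyEnv(flat_reward, other_rewards={"other_reward": steps_left_reward})
_, main_reward, done, info = toy_env.step(action=None)
print(main_reward, info["rewards"]["other_reward"])
```

Only the main reward is returned as the second element of step; the extra ones are looked up by the name they were registered under.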

II) Creating an Agent

An Agent is the name given to the "operator" / "bot" / "algorithm" that will perform some modifications of the powergrid when it faces some "observation".

Examples of Agents are provided in the grid2Op/Agent directory of the grid2Op code repository.

A deeper look at the different provided Agents can be found in the 05_StudyYourAgent notebook. We suppose here that we use the simplest Agent, the one that does nothing (DoNothingAgent).

In [ ]:
from grid2op.Agent import DoNothingAgent
my_agent = DoNothingAgent(env.action_space)
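
To make the agent interface more concrete, here is a minimal sketch of what a "do nothing" agent could look like. ToyActionSpace and ToyDoNothingAgent are hypothetical names invented for this illustration; only the constructor argument (an action space) and the act(observation, reward, done) signature follow what this notebook uses. The convention that an empty dictionary stands for "do nothing" is an assumption of the sketch.

```python
class ToyActionSpace:
    """Hypothetical action space: calling it builds an action from a dict."""
    def __call__(self, spec=None):
        return dict(spec or {})  # assumption: an empty dict means "do nothing"

class ToyDoNothingAgent:
    """Toy agent with the same constructor and `act` signature as used in this notebook."""
    def __init__(self, action_space):
        self.action_space = action_space

    def act(self, observation, reward, done=False):
        # ignore the observation and always return the empty action
        return self.action_space({})

toy_agent = ToyDoNothingAgent(ToyActionSpace())
toy_action = toy_agent.act(observation=None, reward=0.0, done=False)
print(toy_action)
```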

III) Assess how the Agent is performing (manually)

The performance of an Agent is assessed with the cumulated reward it receives over time. In this example, the cumulated reward is a FlatReward that simply counts how many time steps the Agent has successfully managed before breaking any rules. For more control over this reward, it is recommended to look at the documentation of the Environment class.

More examples of rewards are also available in the official documentation.

In [ ]:
done = False
time_step = 0
cum_reward = 0.
obs = env.reset()
reward = env.reward_range[0]
max_iter = 10
while not done:
    act = my_agent.act(obs, reward, done)  # choose an action to do, in this case "do nothing"
    obs, reward, done, info = env.step(act)  # implement this action on the powergrid
    cum_reward += reward
    time_step += 1
    if time_step >= max_iter:
        break  # stop after max_iter steps even if the episode is not over

We can now evaluate how well this agent is performing:

In [ ]:
print("This agent managed to survive {} timesteps".format(time_step))
print("Its final cumulated reward is {}".format(cum_reward))
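
The manual loop above can be wrapped in a small helper so that it is easy to reuse. The sketch below is one possible way to do it; CountdownEnv and IdleAgent are hypothetical toy classes that only reproduce the reset / step / act interface used in this notebook, so that the helper can be tried without a powergrid.

```python
class CountdownEnv:
    """Hypothetical environment that ends after `lifetime` steps."""
    def __init__(self, lifetime):
        self.lifetime = lifetime
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t  # the observation is just the step counter here

    def step(self, action):
        self.t += 1
        done = self.t >= self.lifetime
        return self.t, 1.0, done, {}  # flat reward of 1 per step

class IdleAgent:
    """Hypothetical agent that always returns the same placeholder action."""
    def act(self, obs, reward, done=False):
        return None

def run_episode(env, agent, max_iter, initial_reward=0.0):
    """Run one episode; stop on game over or after `max_iter` steps."""
    obs = env.reset()
    reward, done = initial_reward, False
    cum_reward, time_step = 0.0, 0
    while not done and time_step < max_iter:
        act = agent.act(obs, reward, done)
        obs, reward, done, info = env.step(act)
        cum_reward += reward
        time_step += 1
    return time_step, cum_reward

# one episode cut short by max_iter, one ended by the environment itself
capped = run_episode(CountdownEnv(lifetime=50), IdleAgent(), max_iter=10)
ended = run_episode(CountdownEnv(lifetime=3), IdleAgent(), max_iter=10)
print(capped, ended)
```

Putting the max_iter check in the loop condition avoids the explicit break and makes the two stopping criteria visible in one place.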

IV) More convenient ways to assess the performance of an agent

All the steps above have been detailed as a "quick start", to give an example of the main classes of the Grid2Op package. Having to code all of the above can be quite tedious, but offers a lot of flexibility.

Implementing all this before starting to evaluate an agent can be tiring. What we show here is a much shorter way to perform all this. In this section we will exhibit 2 ways:

  • The quickest way, using the grid2op.main API, most suited when basic computations need to be carried out.
  • The recommended way using a Runner. This gives more flexibility than the grid2op.main API but can be harder to configure.

In this section, we assume the same as before:

  • The Agent is the "Do Nothing" agent
  • The Environment is the default Environment
  • PandaPower is used as the backend
  • The chronics come from the files included in this package
  • etc.

IV.A) Using the grid2op.runner API

When only simple assessments need to be performed, the grid2op.main API is perfectly suited. This API can also be accessed with the command line:

python3 -m grid2op.main

We detail here its usage as an API, to assess the performance of a given Agent.

As opposed to building an environment from scratch (see the previous section), this requires much less effort: we don't need to initialize (instantiate) anything. Everything is carried out inside the Runner called by the main function.

We simulate 1 episode here (i.e. we play one scenario until either the agent reaches a game over or the scenario ends), but this method would also work if we wanted to simulate more episodes.

In [ ]:
from grid2op.Runner import Runner
runner = Runner(**env.get_params_for_runner(), agentClass=DoNothingAgent)
res = runner.run(nb_episode=1, max_iter=max_iter)

Running the 2 lines above will:

  • Create a valid environment
  • Create a valid agent
  • Assess how well an agent performs on one episode.
In [ ]:
print("The results are:")
for chron_name, _, cum_reward, nb_time_step, max_ts in res:
    msg_tmp = "\tFor chronics located at {}\n".format(chron_name)
    msg_tmp += "\t\t - cumulative reward: {:.2f}\n".format(cum_reward)
    msg_tmp += "\t\t - number of time steps completed: {:.0f} / {:.0f}".format(nb_time_step, max_ts)
    print(msg_tmp)
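
When several episodes are run, it can be handy to aggregate the per-episode tuples instead of printing them one by one. The helper below is a hypothetical sketch: it assumes each result is a 5-element tuple in the order unpacked in the loop above (name, an ignored field, cumulative reward, completed steps, maximum steps), and the sample values are made up for illustration only.

```python
def summarize(results):
    """Aggregate (name, _, cum_reward, nb_time_step, max_ts) tuples."""
    n = len(results)
    total_reward = sum(r[2] for r in results)
    total_steps = sum(r[3] for r in results)
    total_max = sum(r[4] for r in results)
    return {
        "episodes": n,
        "mean_reward": total_reward / n,
        "completion": total_steps / total_max,  # share of time steps survived
    }

# made-up values for illustration only, not real Runner output
fake_res = [("chronics/0", "0", 10.0, 10, 10),
            ("chronics/1", "1", 4.0, 4, 10)]
summary = summarize(fake_res)
print(summary)
```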

This is particularly suited for evaluating different agents. For example, we can quickly evaluate a second agent. In the example below, we import an agent class, PowerLineSwitch, whose job is to connect and disconnect the powerlines of the power network. This PowerLineSwitch Agent simulates the effect of disconnecting each powerline of the powergrid in turn and takes the best action found, i.e. the one whose simulated effect is the best. Its execution can take a long time, depending on the scenario and the number of powerlines in the grid, so the code below can take a little time to run.

In [ ]:
from grid2op.Agent import PowerLineSwitch
runner = Runner(**env.get_params_for_runner(), agentClass=PowerLineSwitch)
res = runner.run(nb_episode=1, max_iter=max_iter)
print("The results are:")
for chron_name, _, cum_reward, nb_time_step, max_ts in res:
    msg_tmp = "\tFor chronics located at {}\n".format(chron_name)
    msg_tmp += "\t\t - cumulative reward: {:.2f}\n".format(cum_reward)
    msg_tmp += "\t\t - number of time steps completed: {:.0f} / {:.0f}".format(nb_time_step, max_ts)
    print(msg_tmp)
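
The "simulate every candidate and keep the best" strategy described for this agent can be sketched in a few lines. This is not the PowerLineSwitch source code: greedy_action is a hypothetical helper, the simulated rewards are made-up numbers, and evaluating a "do nothing" baseline first is a choice made for this sketch.

```python
def greedy_action(candidates, simulate, do_nothing=None):
    """Return the candidate whose simulated reward is highest.

    `simulate(action)` must return the reward the action is expected
    to yield; `do_nothing` is always evaluated first as a baseline.
    """
    best_action, best_reward = do_nothing, simulate(do_nothing)
    for action in candidates:
        reward = simulate(action)
        if reward > best_reward:
            best_action, best_reward = action, reward
    return best_action, best_reward

# hypothetical simulated rewards: None is "do nothing", 0..3 are the
# indices of the powerline that would be disconnected
toy_rewards = {None: 5.0, 0: 3.5, 1: 6.2, 2: 1.0, 3: 6.0}
best, value = greedy_action(range(4), toy_rewards.get)
print(best, value)
```

The cost grows linearly with the number of candidates, since each one requires a simulation, which is why such an agent is slower on larger grids.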

It is also possible using this API to store the results for a detailed examination of the actions taken by the Agent. Note that writing on the hard drive has an overhead on the computation time.

To do this, only one argument needs to be added to the run call: path_save, which indicates where the outcome of the experiment will be stored. An example can be found below:

In [ ]:
runner = Runner(**env.get_params_for_runner(),
                agentClass=DoNothingAgent)
path_save_expe = os.path.abspath("saved_experiment_donothing")
res = runner.run(nb_episode=1, max_iter=max_iter, path_save=path_save_expe)
print("The results are:")
for chron_name, _, cum_reward, nb_time_step, max_ts in res:
    msg_tmp = "\tFor chronics located at {}\n".format(chron_name)
    msg_tmp += "\t\t - cumulative reward: {:.2f}\n".format(cum_reward)
    msg_tmp += "\t\t - number of time steps completed: {:.0f} / {:.0f}".format(nb_time_step, max_ts)
    print(msg_tmp)
In [ ]:
os.listdir(os.path.join(path_save_expe, "0"))

All the outcomes of the experiment are shown above. For more information, please don't hesitate to read the documentation of Runner.

NB: A lot more information about Actions is provided in the 03_Action notebook. In the last section of the 04_TrainingAnAgent notebook, there is a quick example of how to read / write an action from a saved repository.

Using make and Runner makes it easy to assess the performance of a trained agent. Besides, the Runner has been particularly integrated with other tools and makes it easy to replay and analyse an episode after it is finished. It is the recommended method to use in grid2op for the evaluation.