Objectives
This notebook will cover some basic raw functionalities at first. It will then show how these raw functionalities are encapsulated in easy-to-use functions.
As we will see later, the recommended way to use these functionalities is to go through the Runner, and not through the instantiation of the different classes one after the other.
The first cell will look like:
# !pip install grid2op[optional]  # for use with Google Colab (grid2op is not installed there by default)
import os
import sys
import grid2op
res = None
try:
from jyquickhelper import add_notebook_menu
res = add_notebook_menu()
except ModuleNotFoundError:
print("Impossible to automatically add a menu / table of content to this notebook.\nYou can download \"jyquickhelper\" package with: \n\"pip install jyquickhelper\"")
res
We provide a function that will handle the creation of the Environment with default values in a single call of the function.
In this example we will use the rte_case14_redisp environment, in test mode.
To define/create it, we can call:
env = grid2op.make("rte_case14_redisp", test=True)
NB: By setting "test=True" in the above call, we only use the data for 2 different months for our environment. If you remove it, grid2op.make will attempt to download more data. By default, the data corresponding to this environment will be downloaded to your "home" directory, which corresponds to the location returned by this script:
print(f"grid2op dataset will be downloaded in \"{grid2op.get_current_local_dir()}\"")
Other environments can be used and are available through the "make" command. To get a list of the available environments, you can do:
grid2op.list_available_remote_env() # this only works if you have an internet connection
It is also possible to list the environments that you have already downloaded (if any).
NB: Downloading is automatic and is done the first time you call make with an environment that has not already been downloaded locally.
grid2op.list_available_local_env()
For more customization of where the environments are located (e.g. changing the directory returned by grid2op.get_current_local_dir()), the documentation provides extra information at https://grid2op.readthedocs.io/en/latest/makeenv.html#cache-manipulation
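As a sketch of such cache manipulation, one could redirect the download location as below. The helper grid2op.change_local_dir is described on the "cache manipulation" page linked above, and the directory name used here is purely an assumption for illustration; check both against your installed version.

```python
import os

# Hypothetical directory for the grid2op dataset cache (an assumption
# for illustration; pick any writable location you like).
custom_dir = os.path.join(os.path.expanduser("~"), "my_grid2op_data")
os.makedirs(custom_dir, exist_ok=True)

try:
    import grid2op
    # change_local_dir is documented on the "cache manipulation" page linked above
    grid2op.change_local_dir(custom_dir)
    print("datasets will now be downloaded to:", grid2op.get_current_local_dir())
except ImportError:
    print("grid2op is not installed here, skipping the actual redirection")
```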
Using the make function, you can pass additional arguments to customize the environment (this is useful for training):
- param: The parameters used for the Environment. See grid2op.Parameters.Parameters.
- backend: The backend to use for the computation. If provided, it must be an instance of the class grid2op.Backend.Backend.
- action_class: The type of BaseAction that the BaseAgent will be able to perform. If provided, it must be a subclass of grid2op.BaseAction.BaseAction.
- observation_class: The type of BaseObservation that the BaseAgent will receive. If provided, it must be a subclass of grid2op.BaseObservation.BaseObservation.
- reward_class: The type of reward signal that the BaseAgent will receive. If provided, it must be a subclass of grid2op.BaseReward.BaseReward.
- gamerules_class: The type of "Rules" that the BaseAgent will need to comply with. Rules are here to model some operational constraints. If provided, it must be a subclass of grid2op.RulesChecker.BaseRules.
- data_feeding_kwargs: A dictionary that is used to build the data_feeding (chronics) objects.
- chronics_class: The type of chronics that represents the dynamics of the created Environment. Usually they come from different folders.
- data_feeding: The type of chronics handler you want to use.
- volagecontroler_class: The type of grid2op.VoltageControler.VoltageControler to use.
- chronics_path: The path where to look for the chronics dataset (optional).
- grid_path: The path where the powergrid is located. If provided, it must be a string that points to a valid file on the hard drive.

For example, to set the maximum number of allowed substation changes per step:
from grid2op.Parameters import Parameters
custom_params = Parameters()
custom_params.MAX_SUB_CHANGED = 1
env = grid2op.make("rte_case14_redisp", param=custom_params, test=True)
NB: The make function is highly customizable. For example, you can change the reward that you are using:
from grid2op.Reward import L2RPNReward
env = grid2op.make("rte_case14_redisp", test=True, reward_class=L2RPNReward)
We also give the possibility to assess different rewards. This can be done with the following code:
from grid2op.Reward import L2RPNReward, FlatReward
env = grid2op.make("rte_case14_redisp", test=True,
                   reward_class=L2RPNReward,
                   other_rewards={"other_reward": FlatReward})
The results for these rewards can be accessed with the info object returned by env.step (info is the 4th object returned by env.step, as you can see below). See the official reward documentation for more information.
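Concretely, the extra rewards can be read back as sketched below. In the grid2op versions this notebook targets, each key passed through other_rewards shows up under info["rewards"]; the fallback dictionary is only there so the access pattern can be shown even without a working grid2op installation.

```python
try:
    import grid2op
    from grid2op.Reward import L2RPNReward, FlatReward
    env = grid2op.make("rte_case14_redisp", test=True,
                       reward_class=L2RPNReward,
                       other_rewards={"other_reward": FlatReward})
    obs = env.reset()
    do_nothing = env.action_space({})  # an action that changes nothing
    obs, reward, done, info = env.step(do_nothing)
except Exception:
    # grid2op (or this environment) unavailable: fake the structure of `info`
    # so the access pattern below can still be demonstrated
    reward, info = 0.0, {"rewards": {"other_reward": 0.0}}

# each key declared in `other_rewards` shows up under info["rewards"]
other = info["rewards"]["other_reward"]
print("main reward:", reward, "| other_reward:", other)
```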
An Agent is the name given to the "operator" / "bot" / "algorithm" that will perform some modifications of the powergrid when it faces some "observation".
Examples of Agents are provided in the grid2Op/Agent directory of the grid2Op code repository.
A deeper look at the different provided Agents can be found in the 05_StudyYourAgent notebook. We suppose here that we use the most simple Agent, the one that does nothing (DoNothingAgent).
from grid2op.Agent import DoNothingAgent
my_agent = DoNothingAgent(env.action_space)
The performance of an Agent is assessed with the cumulated reward it receives over time. In this example, the cumulated reward is a FlatReward that simply counts how many time steps the Agent has successfully managed before breaking any rules. For more control over this reward, it is recommended to look at the documentation of the Environment class.
More examples of rewards are also available in the official documentation.
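As a hedged sketch of what a reward class looks like, here is a minimal custom reward. It assumes the BaseReward interface from the documentation above (the exact __call__ signature should be checked against your installed grid2op version); the stand-in class only lets the sketch run when grid2op is absent.

```python
try:
    from grid2op.Reward import BaseReward
except ImportError:
    class BaseReward:  # minimal stand-in so the sketch runs without grid2op
        def __init__(self):
            self.reward_min, self.reward_max = 0.0, 0.0

class SurvivalReward(BaseReward):
    """Gives 1.0 for every step survived, 0.0 on an error (illustration only)."""
    def __init__(self):
        BaseReward.__init__(self)
        self.reward_min = 0.0
        self.reward_max = 1.0

    def __call__(self, action, env, has_error, is_done, is_illegal, is_ambiguous):
        # assumed signature, mirroring grid2op's BaseReward.__call__
        return self.reward_min if has_error else self.reward_max

r = SurvivalReward()
print(r(None, None, False, False, False, False))  # a normal, error-free step
```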
done = False
time_step = int(0)
cum_reward = 0.
obs = env.reset()
reward = env.reward_range[0]
max_iter = 10
while not done:
    act = my_agent.act(obs, reward, done)  # choose an action, in this case "do nothing"
    obs, reward, done, info = env.step(act)  # apply this action to the powergrid
    cum_reward += reward
    time_step += 1
    if time_step >= max_iter:
        break
We can now evaluate how well this agent is performing:
print("This agent managed to survive {} timesteps".format(time_step))
print("Its final cumulative reward is {}".format(cum_reward))
All the steps above have been detailed as a "quick start", to give an example of the main classes of the Grid2Op package. Having to code all of the above can be quite tedious, but offers a lot of flexibility.
Implementing all this before starting to evaluate an agent can be tiring, so here we show a much shorter way to perform it, using the same environment and the same agent as before.
When only simple assessments need to be performed, the grid2op.main API is perfectly suited. This API can also be accessed with the command line:
python3 -m grid2op.main
We detail here its usage as an API, to assess the performance of a given Agent.
As opposed to building an environment from scratch (see the previous section), this requires much less effort: we don't need to initialize (instantiate) anything. Everything is carried out inside the Runner called by the main function.
We simulate 1 episode here (i.e. we play one scenario until either the agent reaches a game over or the scenario ends), but this method would also work for simulating more episodes.
from grid2op.Runner import Runner
runner = Runner(**env.get_params_for_runner(), agentClass=DoNothingAgent)
res = runner.run(nb_episode=1, max_iter=max_iter)
A single call to the 2 lines above runs the whole experiment; we can then display the results:
print("The results are:")
for chron_name, _, cum_reward, nb_time_step, max_ts in res:
    msg_tmp = "\tFor chronics located at {}\n".format(chron_name)
    msg_tmp += "\t\t - cumulative reward: {:.2f}\n".format(cum_reward)
    msg_tmp += "\t\t - number of time steps completed: {:.0f} / {:.0f}".format(nb_time_step, max_ts)
    print(msg_tmp)
This is particularly suited for evaluating different agents. For example, we can quickly evaluate a second agent. In the example below, we import an agent class PowerLineSwitch whose job is to connect and disconnect the power lines in the power network. This PowerLineSwitch agent simulates the effect of disconnecting each powerline in the powergrid, and takes the best action found, i.e. the one whose simulated effect is the best. The execution of the code below can take some time, depending on the scenario and the number of powerlines in the grid.
from grid2op.Agent import PowerLineSwitch
runner = Runner(**env.get_params_for_runner(), agentClass=PowerLineSwitch)
res = runner.run(nb_episode=1, max_iter=max_iter)
print("The results are:")
for chron_name, _, cum_reward, nb_time_step, max_ts in res:
    msg_tmp = "\tFor chronics located at {}\n".format(chron_name)
    msg_tmp += "\t\t - cumulative reward: {:.2f}\n".format(cum_reward)
    msg_tmp += "\t\t - number of time steps completed: {:.0f} / {:.0f}".format(nb_time_step, max_ts)
    print(msg_tmp)
It is also possible using this API to store the results for a detailed examination of the actions taken by the Agent. Note that writing on the hard drive has an overhead on the computation time.
To do this, only a simple argument needs to be added to the call to run (path_save, which indicates where the outcome of the experiment will be stored). An example can be found below:
runner = Runner(**env.get_params_for_runner(),
                agentClass=PowerLineSwitch)
path_save_expe = os.path.abspath("saved_experiment_donothing")
res = runner.run(nb_episode=1, max_iter=max_iter, path_save=path_save_expe)
print("The results are:")
for chron_name, _, cum_reward, nb_time_step, max_ts in res:
    msg_tmp = "\tFor chronics located at {}\n".format(chron_name)
    msg_tmp += "\t\t - cumulative reward: {:.2f}\n".format(cum_reward)
    msg_tmp += "\t\t - number of time steps completed: {:.0f} / {:.0f}".format(nb_time_step, max_ts)
    print(msg_tmp)
os.listdir(os.path.join(path_save_expe, "0"))
All the outcomes of the experiment are shown above. For more information, please don't hesitate to read the documentation of Runner.
NB: A lot more information about Actions is provided in the 03_Action notebook. In the 04_TrainingAnAgent notebook (last section), there is a quick example of how to read / write an action from a saved repository.
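The saved episode can also be reloaded programmatically for inspection. The sketch below assumes the EpisodeData class from grid2op.Episode and its from_disk constructor (check both against your grid2op version), and reuses the "saved_experiment_donothing" folder and episode id "0" from the run above:

```python
import os

# same folder as used with path_save in the Runner call above
path_save_expe = os.path.abspath("saved_experiment_donothing")
try:
    from grid2op.Episode import EpisodeData
    # "0" is the identifier of the first (and only) episode stored above
    episode = EpisodeData.from_disk(path_save_expe, "0")
    print("number of actions stored:", len(episode.actions))
except Exception as exc:
    # grid2op missing, or no experiment saved at that path
    print("could not reload the episode:", exc)
```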
Using make and Runner makes it easy to assess the performance of a trained agent. Besides, the Runner has been particularly integrated with other tools and makes it easy to replay and analyse an episode after it is finished. It is the recommended method to use in grid2op for evaluation.