Prepare Power Flow Data¶

This notebook presents the process of data preparation for building power flow cases. The process is separated into stages (see, DVC config):

"parse" --- extract necessary parameters from the raw data
"transform" --- combine and convert data to use in further steps
"prepare" --- compose the final dataset
"build" --- create power flow cases.

The presented pipeline processes mainly a dataset described in the paper "An Extended IEEE 118-Bus Test System With High Renewable Penetration" (aka "NREL-118") which was downloaded and saved here. Since the NREL-118 dataset is mostly intended for DC OPF tasks, it skips a lot of information that is not necessary to solve these kind of tasks. Thus, to append the data with missing info, the information about the IEEE-118 test system prepared by Illinois Institute of Technology (version of 2004) is also used as the primary source of NREL-118 (see the JEAS-118 dataset files here which were downloaded from the IIT index).

The following sections describe all the stages of data processing and all the decisions made when preparing the final dataset.

In [1]:

import os

from definitions import S_BASE_MVA, F_HZ
from src.data import parse_jeas118_buses
from src.data import parse_jeas118_lines
from src.data import parse_jeas118_loads
from src.data import parse_jeas118_trafos
from src.data import parse_nrel118_buses
from src.data import parse_nrel118_escalators_ts
from src.data import parse_nrel118_gens
from src.data import parse_nrel118_hydros_nondisp_ts
from src.data import parse_nrel118_hydros_ts
from src.data import parse_nrel118_lines
from src.data import parse_nrel118_loads_ts
from src.data import parse_nrel118_outages_ts
from src.data import parse_nrel118_solars_ts
from src.data import parse_nrel118_winds_ts
from src.data import prepare_branches
from src.data import prepare_buses
from src.data import prepare_gens
from src.data import prepare_gens_ts
from src.data import prepare_loads
from src.data import prepare_loads_ts
from src.data import transform_gens_escalated_ts
from src.data import transform_loads
from src.data import transform_outages_ts
from src.data import transform_gens
from src.data import transform_gens_ts
from src.power_flow.builders import PandaPowerFlowBuilder
import pandapower.plotting.plotly as pplotly


PATH_NREL118 = os.path.join("..", "data", "raw", "nrel118")
PATH_JEAS118 = os.path.join("..", "data", "raw", "jeas118")
PATH_MANUAL = os.path.join("..", "data", "raw", "manual")

Parsing, transforming, and preparing raw data¶

Buses¶

To build a power flow case, the following information about buses will be used:

name
region
rated voltage level
if the bus is in service or out of service
if the bus is a slack bus
bus coordinates for plots
min and max limits of the bus voltage

Let's load and parse bus data of the NREL-118 power system:

In [2]:

path_nrel118_buses = os.path.join(PATH_NREL118, "additional-files-mti-118", "Buses.csv")
nrel118_buses = parse_nrel118_buses(raw_data=path_nrel118_buses)
nrel118_buses.head(2)

Out[2]:

	bus_name	region	load_participation_factor
0	bus_001	r1	0.047169
1	bus_002	r1	0.018496

The NREL-118 dataset contains only names and regions of buses (load_participation_factor is for load modelling, see Section "Loads"). To add missing values, it is assumed the following:

all buses are in service
bus "bus_069" is considered to be a slack bus
rated voltage level of buses 8, 9, 10, 26, 30, 38, 63, 64, 65, 68, 81, 116 equals to 345 kV, the rest of buses has the voltage level of 138 kV. This corresponds to the transformer data from the JEAS-118 dataset.

Coordinates of buses were added manually after designing the power system plot.

Bus voltage limits are taken from the JEAS-118 data:

In [3]:

path_jeas_118_buses = os.path.join(PATH_JEAS118, "JEAS_IEEE118.docx")
jeas118_buses = parse_jeas118_buses(path_jeas_118_buses)
jeas118_buses.head(2)

Out[3]:

	bus_name	max_v_pu	min_v_pu
0	bus_001	1.05	0.94
1	bus_002	1.06	0.95

As a temporary assumption, bus voltage limits are set to 0.8 and 1.2 for min_v_pu and max_v_pu correspondingly (see, this script for details) to achieve the convergence of the OPF estimation. Thus, the final bus data look as follows:

In [4]:

path_bus_coordinates = os.path.join(PATH_MANUAL, "bus_coordinates.csv")
buses = prepare_buses(
    parsed_nrel118_buses=nrel118_buses,
    parsed_jeas118_buses=jeas118_buses,
    bus_coordinates=path_bus_coordinates,
)
buses.head(2)

Out[4]:

	bus_name	region	in_service	v_rated_kv	is_slack	min_v_pu	max_v_pu	x_coordinate	y_coordinate
0	bus_001	r1	True	138	False	0.8	1.2	626.0	-324.0
1	bus_002	r1	True	138	False	0.8	1.2	678.0	-324.0

Branches¶

Branches is a common term both for lines and transformers. The following parameters about branches will be used to build power flow cases:

name
start and end buses of the branch
number of parallel branch systems
resistance
reactance
active conductance
in service or out of service
maximum thermal current
transformation ratio if the branch is a transformer

Let's load and parse line data of the NREL-118 power system:

In [5]:

path_nrel118_lines = os.path.join(PATH_NREL118, "additional-files-mti-118", "Lines.csv")
nrel118_lines = parse_nrel118_lines(raw_data=path_nrel118_lines)
nrel118_lines.head(2)

Out[5]:

	branch_number	from_bus	to_bus	max_p_mw	x_pu	r_pu
0	1	bus_001	bus_002	600.0	0.0999	0.0303
1	2	bus_001	bus_003	600.0	0.0424	0.0129

Since the information about active conductance is skipped in the NREL-118 dataset, let's load it from the JEAS-118 dataset:

In [6]:

path_jeas118_lines = os.path.join(PATH_JEAS118, "JEAS_IEEE118.docx")
jeas118_lines = parse_jeas118_lines(raw_data=path_jeas118_lines)
jeas118_lines.head(2)

Out[6]:

	from_bus	to_bus	parallel	b_pu
0	bus_001	bus_002	1	0.02540
1	bus_001	bus_003	1	0.01082

In the NREL-118 dataset, transformers are presented as lines without values of transformation ratio. Therefore, these values will be loaded from JEAS-118 dataset:

In [7]:

path_jeas118_trafos = os.path.join(PATH_JEAS118, "JEAS_IEEE118.docx")
jeas118_trafos = parse_jeas118_trafos(raw_data=path_jeas118_trafos)
jeas118_trafos.head(2)

Out[7]:

	from_bus	to_bus	parallel	trafo_ratio_rel
0	bus_008	bus_005	1	0.985
1	bus_026	bus_025	1	0.960

Thus, the final branch data look as follows:

In [8]:

branches = prepare_branches(
    parsed_nrel118_lines=nrel118_lines,
    parsed_jeas118_lines=jeas118_lines,
    parsed_jeas118_trafos=jeas118_trafos,
    prepared_buses=buses,
)
branches.head(2)

Out[8]:

	branch_name	from_bus	to_bus	parallel	in_service	r_ohm	x_ohm	b_µs	trafo_ratio_rel	max_i_ka
0	branch_001_002_1	bus_001	bus_002	1	True	5.770332	19.024956	133.375341	NaN	2.510219
1	branch_001_003_1	bus_001	bus_003	1	True	2.456676	8.074656	56.815795	NaN	2.510219

It is assumed that branches are always in service.

Loads¶

Here is the list of used load variables:

name
bus where the load is located
active and reactive power of the load
if the load is in service

The information about a part of the regional active load located in each bus is stored in variable load_participation_factor in the bus data of the NREL-118 dataset:

In [9]:

nrel118_buses.head(2)

Out[9]:

	bus_name	region	load_participation_factor
0	bus_001	r1	0.047169
1	bus_002	r1	0.018496

Active load value of regions is stored in the time-series NREL-118 data:

In [10]:

path_nrel118_loads_ts = os.path.join(PATH_NREL118, "Input files", "RT", "Load")
nrel118_loads_ts = parse_nrel118_loads_ts(raw_data=path_nrel118_loads_ts)
nrel118_loads_ts.head(2)

Out[10]:

	datetime	region_name	region_load
0	2024-01-01 00:00:00	r1	5698.083154
1	2024-01-01 00:00:00	r2	1967.417090

To calculate reactive power of loads, let's get the JEAS-118 data:

In [11]:

path_jeas118_loads = os.path.join(PATH_JEAS118, "JEAS_IEEE118.docx")
jeas118_loads = parse_jeas118_loads(raw_data=path_jeas118_loads)
jeas118_loads.head(2)

Out[11]:

	bus_name	p_mw	q_mvar
0	bus_001	54.14	8.66
1	bus_002	21.23	9.55

The JEAS-118 load data will help to estimate the power factor of each load and define its reactive power at each moment of time using time-series data of active demand:

In [12]:

transformed_loads = transform_loads(
    parsed_nrel118_buses=nrel118_buses, parsed_jeas118_loads=jeas118_loads
)
transformed_loads.head(2)

Out[12]:

	load_name	bus_name	region	load_participation_factor	load_power_factor
0	load_001	bus_001	r1	0.047169	0.987447
1	load_002	bus_002	r1	0.018496	0.911978

Thus, it is necessary to prepare two files with the load data. The first file will contain the load power variation over time, the other will contain basic load information (location, etc.). The final load dataset is built with the following assumptions:

Loads are always in service.
Demand of region R1 was reduced by 35% to achieve the convergence of the OPF estimations.

In [13]:

loads = prepare_loads(transformed_loads=transformed_loads)
loads.head(2)

Out[13]:

	load_name	bus_name
0	load_001	bus_001
1	load_002	bus_002

In [14]:

loads_ts = prepare_loads_ts(
    transformed_loads=transformed_loads, parsed_nrel118_loads_ts=nrel118_loads_ts
)
loads_ts.head(2)

Out[14]:

	datetime	load_name	in_service	p_mw	q_mvar
0	2024-01-01	load_001	True	174.701149	27.944439
1	2024-01-01	load_002	True	68.505820	30.816325

Generators¶

To build power flow cases, the following information about generators is used:

name
bus where the generator is located
active power of the generator
if the generator is in service
whether the output of this generator is optimized at the OPF stage
max and min limits of reactive power output
max and min limit of active power output

Let's start from parsing generator data from the NREL-118 dataset:

In [15]:

path_nrel118_gens = os.path.join(
    PATH_NREL118, "additional-files-mti-118", "Generators.csv"
)
nrel118_gens = parse_nrel118_gens(raw_data=path_nrel118_gens)
nrel118_gens.head(2)

Out[15]:

	gen_name	bus_name	opt_category	max_p_mw	min_p_mw
0	biomass_001	bus_012	day_ahead	3.0	0.9
1	biomass_002	bus_012	day_ahead	3.0	0.9

Next, time-series data from the NREL-118 dataset are parsed:

In [16]:

# Hydro gens
path_nrel118_hydros_ts = os.path.join(PATH_NREL118, "Input files", "Hydro")
nrel118_hydros_ts = parse_nrel118_hydros_ts(raw_data=path_nrel118_hydros_ts)
nrel118_hydros_ts.head(2)

Out[16]:

	datetime	gen_name	p_mw
0	2024-01-01 00:00:00	hydro_016	0.17696
1	2024-01-01 00:00:00	hydro_017	0.29862

In [17]:

# Solar gens
path_nrel118_solars_ts = os.path.join(PATH_NREL118, "Input files", "RT", "Solar")
nrel118_solars_ts = parse_nrel118_solars_ts(raw_data=path_nrel118_solars_ts)
nrel118_solars_ts.head(2)

Out[17]:

	datetime	gen_name	p_mw
0	2024-01-01 00:00:00	solar_001	0.0
1	2024-01-01 00:00:00	solar_002	0.0

In [18]:

# Wind gens
path_nrel118_winds_ts = os.path.join(PATH_NREL118, "Input files", "RT", "Wind")
nrel118_winds_ts = parse_nrel118_winds_ts(raw_data=path_nrel118_winds_ts)
nrel118_winds_ts.head(2)

Out[18]:

	datetime	gen_name	p_mw
0	2024-01-01 00:00:00	wind_001	0.458135
1	2024-01-01 00:00:00	wind_002	3.724274

In [19]:

# Non-dispatchable hydro gens
path_nrel118_hydros_nondisp_ts = os.path.join(
    PATH_NREL118,
    "additional-files-mti-118",
    "Hydro_nondipatchable.csv",
)
nrel118_hydros_nondisp_ts = parse_nrel118_hydros_nondisp_ts(
    raw_data=path_nrel118_hydros_nondisp_ts
)
nrel118_hydros_nondisp_ts.head(2)

Out[19]:

	datetime	gen_name	p_mw
0	2024-01-01 00:00:00	hydro_036	0.51
1	2024-01-01 00:00:00	hydro_037	2.23

Escalators used to adjust generation profile to seasons or other time for all the generators, except wind, solar, and hydro:

In [20]:

# Escalators data
path_nrel118_escalators_ts = os.path.join(
    PATH_NREL118, "additional-files-mti-118", "Escalators.csv"
)
nrel118_escalators_ts = parse_nrel118_escalators_ts(raw_data=path_nrel118_escalators_ts)
nrel118_escalators_ts.head(2)

Out[20]:

	datetime	gen_name	escalator_ratio
0	2024-01-01 00:00:00	biomass_001	0.35
1	2024-01-01 00:00:00	biomass_002	0.35

In [21]:

# Outages
path_nrel118_outages_ts = os.path.join(
    PATH_NREL118, "Input files", "Others", "GenOut.csv"
)
nrel118_outages_ts = parse_nrel118_outages_ts(raw_data=path_nrel118_outages_ts)
nrel118_outages_ts.head(2)

Out[21]:

	datetime	gen_name	in_outage
0	2024-01-01 00:00:00	PSH_001	False
1	2024-01-01 00:00:00	PSH_002	False

Next, some intermediate calculations are performed with the following assumptions:

Range of reactive generator output set from -0.3 to 0.7 of the max active output (see this script for details).
Min limit of active output of optimized gens is set to zero (see this script for details) to achieve the convergence of the OPF estimation.
Information about outages of some generators is missed. Therefore, such generators are considered to be always in service (see this script for details).
Unknown generators in the outage data are dropped (see this script for details).
Duplicated generator outages are considered to be typos, as a result of which the state of generator "internal_combustion_gas_001" was attributed to other generators (see this script for details).

In [22]:

transformed_gens = transform_gens(
    parsed_nrel118_gens=nrel118_gens,
    prepared_buses=buses,
)
transformed_outages_ts = transform_outages_ts(
    parsed_nrel118_outages_ts=nrel118_outages_ts
)
transformed_gens_escalated_ts = transform_gens_escalated_ts(
    transformed_gens=transformed_gens,
    parsed_nrel118_escalators_ts=nrel118_escalators_ts,
)
transformed_gens_ts = transform_gens_ts(
    transformed_gens=transformed_gens,
    parsed_nrel118_winds_ts=nrel118_winds_ts,
    parsed_nrel118_solars_ts=nrel118_solars_ts,
    parsed_nrel118_hydros_ts=nrel118_hydros_ts,
    parsed_nrel118_hydros_nondisp_ts=nrel118_hydros_nondisp_ts,
)

Finally, let's concat all datasets to build two files with generation data --- general generation info (location, etc.), time-series data (active output, whether the gen is in service, etc.):

In [23]:

gens_ts = prepare_gens_ts(
    transformed_gens=transformed_gens,
    transformed_gens_ts=transformed_gens_ts,
    transformed_outages_ts=transformed_outages_ts,
    transformed_gens_escalated_ts=transformed_gens_escalated_ts,
)
gens_ts.head(2)

Out[23]:

	datetime	gen_name	in_service	p_mw	max_q_mvar	min_q_mvar	max_p_mw	min_p_mw
0	2024-01-01	biomass_001	True	NaN	0.735	-0.315	1.05	0.0
1	2024-01-01	biomass_002	True	NaN	0.735	-0.315	1.05	0.0

In [24]:

gens = prepare_gens(transformed_gens=transformed_gens, prepared_gens_ts=gens_ts)
gens.head(2)

Out[24]:

	gen_name	bus_name	opt_category	max_p_mw	min_p_mw
0	biomass_001	bus_012	day_ahead	1.1199	0.0
1	biomass_002	bus_012	day_ahead	1.1199	0.0

Building power flow cases¶

After data preparation stages, there are still several parameters that are not defined but are needed to perform the power flow calculation:

set voltage of generators
set voltage of the slack bus

To calculate these parameters, the task of optimal power flow (OPF) is solved using the PandaPower engine. Since electricity prices are not used here for simplicity, the goal of the optimization is to minimize the total power generation.

Thus, the solver tries to tune active / reactive outputs of optimized generators (p_mw and q_mvar) and active / reactive injection of the slack bus (p_mw and q_mvar) with the following constraints:

active output of generators must be in possible limits: min_p_mw <= p_mw <= max_p_mw
reactive output of generators must be in possible limits: min_q_mvar <= q_mvar <= max_q_mvar
bus voltages must be in acceptable ranges: min_v_pu <= v_pu <= max_v_pu (v_pu --- actual bus voltage in per units)
loading of lines and transformers must not exceed 100% (calculated according to max_i_ka of each branch)
slack bus injections are considered unlimited.

Optimized generators can be set in PandaPower with the property controllable. A controllable generator can change active and reactive outputs along with its voltage, while a non-controllable generator can only tune its reactive output in order to keep the set voltage level (see, the docs). More details about the OPF solver used can be found in the PandaPower documentation.

Since generation voltage levels are unknown after the data preparation described above, all generators are considered controllable with the following assumptions:

generators that should be optimized have active power limits (used in the OPF) equal to their rated limits (min_p_mw and max_p_mw)
generators that should not be optimized have active power limits (used in the OPF) equal to their actual output (p_mw and p_mw).

In such a way, during the OPF, all generators can adjust their voltages and reactive power, but the active outputs are only adjusted in those generators that are intended for optimization.

After the OPF, found values of generator outputs and voltages are used to estimate power flows just to ensure the power system is stable.

In [25]:

# Create the builder
builder = PandaPowerFlowBuilder(f_hz=F_HZ, s_base_mva=S_BASE_MVA)

# Load data
builder.load_data(
    buses=buses,
    branches=branches,
    loads=loads,
    loads_ts=loads_ts,
    gens=gens,
    gens_ts=gens_ts,
)

# Build power flow case for the first timestamp
timestamp = builder.timestamps[0]
sample = builder.run(timestamp)

In [26]:

# Show the sample structure
sample

Out[26]:

This pandapower network includes the following parameter tables:
   - bus (118 elements)
   - load (91 elements)
   - gen (321 elements)
   - ext_grid (1 element)
   - line (177 elements)
   - trafo (9 elements)
   - bus_geodata (118 elements)
 and the following results tables:
   - res_bus (118 elements)
   - res_line (177 elements)
   - res_trafo (9 elements)
   - res_ext_grid (1 element)
   - res_load (91 elements)
   - res_gen (321 elements)

If the power flow calculation is successful, the sample contains tables "res_*" with the results of the calculation. The description of result tables is written in the documentation of each element (for example, description of "res_bus").

In [27]:

# Show res_bus table
sample.res_bus.head(5)

Out[27]:

	vm_pu	va_degree	p_mw	q_mvar
0	1.114680	-9.605718	174.701149	27.944439
1	1.148707	-5.777126	68.505820	30.816325
2	1.131708	-8.039705	133.591197	34.269047
3	1.197012	-2.646202	-23.409049	-35.890231
4	1.196556	-2.539347	0.000000	0.000000

Below is a power system plot that displays bus voltage levels and branch loadings. Plots can be customized to display other parameters or some additional annotations.

In [28]:

lines_trace = pplotly.create_line_trace(
    sample,
    cmap="plasma",
    cmin=0,
    cmax=100,
    cpos=1.1,
    width=2.0,
    cbar_title="Line loading, %",
)
trafos_trace = pplotly.create_trafo_trace(
    sample,
    cmap="plasma",
    cmin=0,
    cmax=100,
    width=6.0,
)
buses_trace = pplotly.create_bus_trace(
    sample,
    cmap="RdBu_r",
    cmin=0.9,
    cmax=1.1,
    size=10,
    cbar_title="Bus voltage, pu",
)
traces = buses_trace + lines_trace + trafos_trace
pplotly.draw_traces(traces, showlegend=False, figsize=1.2);

In the same way, the process of building power flow cases is performed for each timestamp presented in the prepared data. The range and frequency of timestamps can be customized via DATE_RANGE from the definitions.