Conventional Power Plants: Power Plants in Europe
This notebook is part of the Data package name here Data Package of Open Power System Data.

1. Script setup

Import of Python modules needed to process the data and creation of required output folders.

In [ ]:
import numpy as np
import pandas as pd
import os
import yaml
import json
import sqlite3
import hashlib
from download_and_process_functions import get_sha_hash
from download_and_process_functions import add_location_and_EIC


# create output folder if they do not exist
os.makedirs(os.path.join('output'), exist_ok=True)

# set data & input directory
data_directory = os.path.join('input','data')
locations_directory = os.path.join('input', 'locations')

2. Data import

2.1 Data sources

Unlike the previous releases of this package, where the data was partially downloaded within this script, all data in the current release is pre-downloaded and provided. The following states all relevant data sources.

In [ ]:
meta_data = """

    BE:
        filename: ProductionParkOverview.xls
        source: https://www.elia.be/en/grid-data/power-generation/generating-facilities#
        source_file: https://griddata.elia.be/eliabecontrols.prod/interface/fdn/download/generatingfacilities/xls
        filetype: xls
        date_of_access: Feb 2020
        manually_assembled: no
        provider: Elia
        institution: TSO
        
    NL:
        filename: export.csv
        source: https://www.tennet.org/english/operational_management/export_data.aspx
        source_file: www.tennet.org/english/operational_management/export_data.aspx?exporttype=installedcapacity&format=csv&quarter=2019-4&submit=3
        filetype: csv
        date_of_access: Feb 2020
        manually_assembled: no
        provider: Tennet
        institution: TSO

    FR:
        filename: Production_Capacities.csv
        source: https://www.services-rte.com/en/view-data-published-by-rte/production-installed-capacity.html
        source_file: NA
        filetype: csv
        date_of_access: Feb 2020
        manually_assembled: no
        provider: RTE
        institution: TSO
        
    PL:
        filename: units_list_2019_11_29_PL.csv
        source: http://gpi.tge.pl/en/wykaz-jednostek
        source_file: http://gpi.tge.pl/en/wykaz-jednostek
        date_of_access: Dec 2019
        manually_assembled: no
        provider: GPI Power Market Data
        institution: Information platform
        
    CZ:
        filename: 21915_2019.pdf
        source: https://www.ceps.cz/cs/priprava-provozu
        source_file: https://www.ceps.cz/cs/priprava-provozu
        date_of_access: Feb 2020
        manually_assembled: no
        provider: Ceps
        institution: TSO
        
    CH:
        filename: 2018  Statistik der Wasserkraftanlagen der Schweiz 31.12.2018.csv
        source: https://www.bfe.admin.ch/bfe/de/home/versorgung/statistik-und-geodaten/geoinformation/geodaten/wasser/statistik-der-wasserkraftanlagen.html
        source_file: https://www.bfe.admin.ch/bfe/de/home/versorgung/statistik-und-geodaten/geoinformation/geodaten/wasser/statistik-der-wasserkraftanlagen.html
        date_of access: Dec 2019
        manually_assempled: no
        provider: Swiss Federal office of energy
        institution: Federal Administration
        
    IT:
        filename: 18.xlsx
        source: http://www2018.terna.it/it-it/sistemaelettrico/transparencyreport/generation/installedgenerationcapacity.aspx
        source_file: http://download.terna.it/terna/0000/0216/18.XLSX
        date_of_access: Feb 2020
        manually_assembled: no
        provider: Terna
        institution: TSO

    FI:
        filename: Energiaviraston voimalaitosrekisteri.csv
        source: https://energiavirasto.fi/toimitusvarmuus
        source_file: https://energiavirasto.fi/toimitusvarmuus
        date_of_access: Feb 2020
        manually_assembled: no
        provider: energiavirasto
        institution: energy agency
        
    ES:
        filename: Registro_16_12_2019.csv 
        source: https://sede.minetur.gob.es/en-US/datosabiertos/catalogo/registro-productores-electrica
        source_file: https://sede.minetur.gob.es/en-US/datosabiertos/catalogo/registro-productores-electrica
        date_of_access: Dec 2019 
        manually_assembled: no 
        provider: 
        institution: 
        
    UK:
        filename: DUKES_5.11_UK.csv
        source: https://www.gov.uk/government/statistics/electricity-chapter-5-digest-of-united-kingdom-energy-statistics-dukes#content
        source_file: https://www.gov.uk/government/statistics/electricity-chapter-5-digest-of-united-kingdom-energy-statistics-dukes#content
        date_of_access: Dec 2019
        manually_assembled: no
        provider: UK gov
        institution: UK statistics

    NO:
        filename_thermal: termiske-kraftverk-i-norge-2019.xlsx
        filename_hydro: Vannkraftverk.csv
        source: https://www.nve.no/
        source_file_thermal: https://www.nve.no/media/8967/termiske-kraftverk-i-norge-2019.xlsx
        source_file_hydro: https://www.nve.no/energiforsyning/kraftproduksjon/vannkraft/vannkraftdatabase/#
        date_of_access: Feb 2020
        manually_assembled: no
        provider: Norwegian Water Resources and Energy Directorate
        institution: Ministry

    SE:
        filename: input_plant-list_SE.csv
        source: https://www.nordpoolgroup.com/
        source_file: NA
        date_of_access: 2014
        manually_assembled: no
        provider: Nordpool Group
        institution: Market operator

    SK:
        filename: input_plant-list_SK.csv
        source: https://www.seas.sk/thermal-power-plants
        source_file: https://www.seas.sk/thermal-power-plants
        date_of_access: Feb 2020
        manually_assembled: yes 
        provider: Slovenské elektrárne
        institution: joint-stock company
                      
    SI:
        filename: input_plant-list_SI.csv
        source: multiples (in document)
        source_file: 
        date_of_access: Dec 2019
        manually_assembled: yes
        provider: multiples (in document)
        institution: Private company
                              
    AT:
        filename_thermal: input_plant-list_AT_thermal.csv
        filename_hydro: input_plant-list_AT_hydro.csv
        source: multiples (in document)
        source_file:
        date_of_access: Feb 2020
        manually_assembled: yes
        provider: multiples (in document)
        institution: multiples (in document)                                   
                                              
    DK:
        filename: input_plant-list_DK.csv 
        source: multiples (in document)
        source_file: 
        date_of_access: Jan 2020
        manually_assembled: yes
        provider: multiples (in document)
        institution: multiples (in document)
        
"""

# Conversion to JSON (if needed)
# meta_data = yaml.load(meta_data, Loader=yaml.BaseLoader)

2.2 Definition of harmonized output scheme

To provide a standardizes set of power plant information among all national data sources, a set of required columns is defined which is subsequently filled with available data. The following columns and their structure are the basis for all national data sources.

Note: If information for specific columns are not available, the data entry is empty. On the other hand, if the national data sources provides other information than required by the scheme, these information are not processed.

In [ ]:
columns_sorted = ['name',
                  'company',
                  'street',
                  'postcode',
                  'city',
                  'country',
                  'capacity',
                  'energy_source',
                  'technology',
                  'chp',
                  'commissioned',
                  'type',
                  'lat',
                  'lon',
                  'eic_code',
                  'additional_info',
                  'comment',
                  'source']

3. Data processing per country

3.1 Belgium BE

3.1.1 Data import

The data is provided by the Belgian transmission network operator ELIA. It encompasses a detailed list of Belgian generation units with comprehensive information on technologies and energy fuels.

In [ ]:
filepath_BE = os.path.join(data_directory, 'BE','ProductionParkOverview.xls')
data_BE = pd.read_excel(filepath_BE,
                       sheet_name='ProductionParkOverview',
                       skiprows=1)

data_BE.head()

3.1.2 Translation and harmonization of columns

The imported data is standardized with respect to the columns as defined in section 2.3. In a first step, existing and output-relevant columns are translated and remaining columns are deleted in a second step. Columns which are not exist in the data set, but required for the output are additionally added in this process.

In [ ]:
# Translate columns
dict_columns_BE = {'ARP': 'company',
                   'Generation plant': 'name',
                   'Plant Type': 'technology',
                   'Technical Nominal Power (MW)': 'capacity',
                   'Remarks': 'comment',
                   'Fuel': 'energy_source',
                   'Country': 'country',
                   'Source': 'source'}


# Apply general template of columns
data_BE = data_BE.rename(columns=dict_columns_BE).reindex(columns=columns_sorted)

# Drop rows without capacity entries, so that row with 
# "Unit connected to Distribution Grid" is dropped
data_BE.dropna(subset=['capacity'], inplace=True)

# Adjust types of entries in all columns
data_BE.capacity = data_BE.capacity.astype(float)

3.1.3 Definition of generation type

The generation type provides information on the 'usage' of the power plants (beside electricity generaiton), thus if the plant is an industrial power plant or provides thermal heat for district heating.

The Belgian data source provides only a general information on the heat supply (here: WKK). Thus, due to these general informaiton, we classify corresponding plants as both, industrial or combined heat power plant, and cannot distringuish both types.

In [ ]:
# Generate entries in column "type" according to technology "WKK"
data_BE.loc[data_BE['technology'] == 'WKK', 'type'] = 'CHP/IPP'
data_BE.loc[data_BE['name'].str.contains('WKK'), 'type'] = 'CHP/IPP'

# Generate entries in column "CHP" according to column "type"
data_BE.loc[(data_BE['type'] == 'CHP') |
            (data_BE['type'] == 'IPP') |
            (data_BE['type'] == 'CHP/IPP'), 'chp'] = 'Yes'

3.1.4 Definition of generation technology types

Overall translation of all technology types mentioned in the column "technology".

In [ ]:
# Translate technologies
dict_technology_BE = {'GT': 'Gas turbine',
                      'BG': np.nan,
                      'CL': 'Steam turbine',
                      'WKK': np.nan,
                      'CCGT': 'Combined cycle',
                      'D': np.nan,
                      'HU': np.nan,
                      'IS': np.nan,
                      'NU': 'Steam turbine',
                      'TJ': 'Gas turbine',
                      'WT': np.nan,
                      ' ': np.nan,
                      'nan': np.nan,
                     }
data_BE["technology"].replace(dict_technology_BE, inplace=True)

# add technology parameter for steam and gas turbines
data_BE.loc[data_BE['name'].str.contains('ST') &
            data_BE['technology'].isna(), 'technology'] = 'Steam turbine'

data_BE.loc[data_BE['name'].str.contains('GT') &
            data_BE['technology'].isna(), 'technology'] = 'Gas turbine'

data_BE.head()

3.1.5 Definition of energy sources

Overall translation of all energy sources types mentioned in the column "energy_sources" and subsequent translation check. Deletion of rows containing "wind" as energy source.

In [ ]:
# Translate energy sources
dict_energysources_BE = {'BIO': 'Biomass and biogas',
                         'BF': 'Other fossil fuels',
                         'CL': 'Lignite',
                         'CP': 'Hard coal',
                         'CG': 'Other fossil fuels',
                         'GO': 'Oil',
                         'LF': 'Oil',
                         'LV': 'Oil',
                         'CP/BF': 'Mixed fossil fuels',
                         'CP/CG': 'Mixed fossil fuels',
                         'FA/BF': 'Mixed fossil fuels',
                         'NG/BF': 'Mixed fossil fuels',
                         'NG': 'Natural gas',
                         'NU': 'Nuclear',
                         'WR': 'Non-renewable waste',
                         'WA': 'Hydro',
                         'WI': 'Wind',
                         'WP': 'Biomass and biogas'}
data_BE["energy_source"].replace(dict_energysources_BE, inplace=True)
data_BE["energy_source"].replace('NaN', np.nan, inplace=True)


# Delete unwanted energy source
data_BE = data_BE[data_BE.energy_source != 'Wind']

3.1.6 Additional information on geographic coordinates and EIC codes

In this section a manually compiled list is used to define the geographic coordinates of indivdual power plants.

In [ ]:
data_BE = add_location_and_EIC('BE', data_BE)

# add source
data_BE["source"] = "https://www.elia.be/en/grid-data/power-generation/generating-facilities#"

data_BE.head()

3.2 The Netherlands NL

3.2.1 Data import and merger

The data is provided by the Dutch transmission network operator TenneT. It encompasses the daily available generation capacity, thus a list of Dutch generation units being operational on a specific day. The data is downloaded for all four quarters in 2018.

Import of quartely data

In [ ]:
filepath_NL = os.path.join(data_directory, 'NL', 'export.csv')
data_NL = pd.read_csv(filepath_NL, encoding='utf-8')

data_NL.head()

3.2.2 Translation and harmonization of columns

The imported data is standardized with respect to the columns as defined in section 2.3. In a first step, existing and output-relevant columns are translated and remaining columns are deleted in a second step. Columns which are not exist in the data set, but required for the output are additionally added in this process.

In [ ]:
# Merge columns "street" and "Number" to one column called "Street"
data_NL['street'] = data_NL[['street', 'Number']].apply(
    lambda x: '{} {}'.format(x[0], x[1]), axis=1)

# Drop columns not needed anymore
colsToDrop = ['Location', 'Date', 'Number']
data_NL = data_NL.drop(colsToDrop, axis=1)

# Rename columns
dict_columns_NL = {'Connected body': 'company',
                   'Entity': 'name',
                   'Fuel': 'energy_source',
                   'Capacity': 'capacity',
                   'zipcode': 'postcode',
                   'place-name': 'city'}
data_NL.rename(columns=dict_columns_NL, inplace=True)

# Adjust types of entries in all columns
data_NL.capacity = data_NL.capacity.astype(float)

3.2.3 Definition of energy sources

Overall translation of all energy sources types mentioned in the column "energy_sources". Generation of entries for the column "technology" according to information given in the column "energy_source" by TenneT.

In [ ]:
# Rename types of energy sources
dict_energysources_NL = {'E01': 'Solar',
                         'E02': 'Wind',
                         'E03': 'Hydro',
                         'E04': 'Biomass and biogas',
                         'E05': 'Hard coal',
                         'E06': 'Natural gas',
                         'E07': 'Oil',
                         'E08': 'Nuclear',
                         'E09': 'Other or unspecified energy sources'}

data_NL["energy_source"].replace(dict_energysources_NL, inplace=True)

# Generate technology entry according to energy source
data_NL.loc[data_NL['energy_source'] == 'Nuclear',
            'technology'] = 'Steam turbine'
data_NL.loc[data_NL['energy_source'] == 'Hard coal',
            'technology'] = 'Steam turbine'

# Delete unwanted energy sources in column "energy_source"
data_NL = data_NL[data_NL.energy_source != 'Solar']
data_NL = data_NL[data_NL.energy_source != 'Wind']

3.2.4 Select daily entry with highest available capacity

We estimate the installed capacity by the highest available daily capacity for each unit.

In [ ]:
# Filter rows by considering "name" and maximum "capacity
data_NL = data_NL.sort_values(
    'capacity', ascending=False).groupby('name', as_index=False).first()

# Apply general template of columns
data_NL = data_NL.reindex(columns=columns_sorted)

3.2.5 Additional information on geographic coordinates and EIC codes

In this section a manually compiled list is used to define the geographic coordinates of indivdual power plants.

In [ ]:
data_NL = add_location_and_EIC('NL', data_NL)

# add source
data_NL["source"] = "https://www.tennet.org/english/operational_management/export_data.aspx"

data_NL.head()

3.3 France FR

3.3.1 Data import

The data is provided by the French transmission network operator RTE. It encompasses a detailed list of French generation units with a capacity of more than 100 MW.

In [ ]:
filepath_FR = os.path.join(data_directory, 'FR', 'Production_Capacities.csv')
data_FR = pd.read_csv(filepath_FR)

data_FR.head()

3.3.2 Translation and harmonization of columns

The imported data is standardized with respect to the columns as defined in section 2.3. In a first step, existing and output-relevant columns are translated and remaining columns are deleted in a second step. Columns which are not exist in the data set, but required for the output are additionally added in this process.

In [ ]:
# Translate columns
dict_columns_FR = {'Type': 'energy_source',
                   'Name': 'name',
                   'Installed capacity (MW)': 'capacity',
                   'Start date of the current version': 'commissioned',
                   'Location': 'country'
                   }
data_FR.rename(columns=dict_columns_FR, inplace=True)

# Apply general template of columns
data_FR = data_FR.reindex(columns=columns_sorted)

# Delete place holder datetime
data_FR["commissioned"].replace('01/01/2000', np.nan, inplace=True)

# Map commissioned year to Timestamp col
data_FR['commissioned_year'] = pd.to_datetime(data_FR['commissioned'], format='%d/%m/%Y')
# Reassing commissioned col with year only
mask = data_FR['commissioned_year'].notna()
data_FR.loc[mask, 'commissioned'] = data_FR.loc[mask].commissioned_year.apply(lambda x: x.year)
# Drop not needed col
data_FR.drop('commissioned_year', axis=1, inplace=True)

# Adjust types of entries in all columns
data_FR.capacity = data_FR.capacity.astype(float)

3.3.4 Definition of energy sources and generation of technology types

Generation of entries for technologies. Overall translation of all energy sources types mentioned in the column "energy_sources" and subsequent translation check.

In [ ]:
# Generate technology entries according to energy sources
data_FR.loc[data_FR['energy_source'] == 'Pumping',
            'technology'] = 'Pumped storage'
data_FR.loc[data_FR['energy_source'] == 'Hydraulic over water / guided through',
            'technology'] = 'Run-of-river'
data_FR.loc[data_FR['energy_source'] == 'Hydraulic lakes',
            'technology'] = 'Reservoir'
data_FR.loc[data_FR['energy_source'] == 'Nuclear',
            'technology'] = 'Steam turbine'
data_FR.loc[data_FR['energy_source'] == 'Hard coal',
            'technology'] = 'Steam turbine'

# Translate types of energy sources
dict_energysources_FR = {'Other': 'Other or unspecified energy sources',
                         'Gas': 'Natural gas',
                         'Pumping': 'Hydro',
                         'Hydraulic over water / guided through': 'Hydro',
                         'Hydraulic lakes': 'Hydro',
                         'Biomass': 'Biomass and biogas'}
data_FR["energy_source"].replace(dict_energysources_FR, inplace=True)


# Delete unwanted energy sources in column "energy_source"
data_FR = data_FR[data_FR.energy_source != 'Wind']
data_FR = data_FR[data_FR.energy_source != 'Solar']
data_FR = data_FR[data_FR.energy_source != 'Marine']

3.3.5 Additional information on geographic coordinates and EIC codes

In this section a manually compiled list is used to define the geographic coordinates of indivdual power plants.

In [ ]:
data_FR = add_location_and_EIC('FR', data_FR)

# add source
data_FR["source"] = "https://www.services-rte.com/en/view-data-published-by-rte/production-installed-capacity.html"

data_FR.head()

3.4 Poland PL

3.4.1 Data import

The data is provided by the Polish Power Exchange GPI. It encompasses a detailed list of large Polish generation units with information on the single power plant blocks.

In [ ]:
filepath_PL = os.path.join(data_directory, 'PL', 'units_list_2019_11_29_PL.csv')
data_PL = pd.read_csv(filepath_PL, sep=';')

data_PL.head()

3.4.2 Translation and harmonization of columns

The imported data is standardized with respect to the columns as defined in section 2.3. In a first step, existing and output-relevant columns are translated and remaining columns are deleted in a second step. Columns which are not exist in the data set, but required for the output are additionally added in this process.

In [ ]:
# Rename columns
dict_columns_PL = {'Generating unit name': 'name',
                   'Comments': 'comment',
                   'Available capacity [MW]': 'capacity',
                   'Basic fuel': 'energy_source',
                   'Country': 'country',
                   'Source': 'source'}
In [ ]:
# Rename first column
data_PL.columns.values[0] = 'company'

# Rename columns
dict_columns_PL = {'Generating unit name': 'name',
                   'Comments': 'comment',
                   'Available capacity [MW]': 'capacity',
                   'Basic fuel': 'energy_source',
                   'Country': 'country',
                   'Source': 'source'}
data_PL = data_PL.rename(columns=dict_columns_PL)

# Fill columns "energy_source" and "company" with the belonging entries
cols = ['energy_source', 'company']
data_PL[cols] = data_PL[cols].ffill()

# Delete empty and therefore unwanted rows by referring to column "Generating unit code"
data_PL = data_PL.dropna(subset=['Generating unit code'])

# Apply general template of columns
data_PL = data_PL.reindex(columns=columns_sorted)

# Adjust types of entries in all columns
data_PL.capacity = data_PL.capacity.astype(float)

3.4.3 Definition of energy sources

Overall translation of all energy sources types mentioned in the column "energy_sources".

In [ ]:
# Rename energy sources types
dict_energysources_PL = {'Brown coal': 'Lignite',
                         'Black coal': 'Hard coal',
                         'Water': 'Hydro',
                         'Natural gas': 'Natural gas',
                        }
data_PL["energy_source"].replace(dict_energysources_PL, inplace=True)

3.4.4 Definition of generation technology types

Generation of entries for the column "technology" according to information given in the column "energy_source".

In [ ]:
# Generate entries in column "technology" according to energy source "hydro"
data_PL.loc[data_PL['energy_source'] == 'Hydro', 'technology'] = 'Pumped storage'

3.4.5 Additional information on further power plants, geographic coordinates and EIC codes

In this section a manually compiled list is used to define the geographic coordinates of indivdual power plants.

In [ ]:
data_PL = add_location_and_EIC('PL', data_PL)

# add source
data_PL["source"] = "http://gpi.tge.pl/en/wykaz-jednostek"

data_PL.head()

3.5 Czech Republic CZ

3.5.1 Data import

The data is provided by the Czech transmission network operator CEPS. It encompasses the daily available capacity reported by the transmission system operator.

In [ ]:
filepath_CZ = os.path.join(data_directory, 'CZ', '21915_2019.csv')
data_CZ = pd.read_csv(filepath_CZ, encoding='utf-8')

data_CZ.head()

3.5.2 Translation and harmonization of columns

The imported data is standardized with respect to the columns as defined in section 2.3. In a first step, existing and output-relevant columns are translated and remaining columns are deleted in a second step. Columns which are not exist in the data set, but required for the output are additionally added in this process.

In [ ]:
# Remove white space from names
data_CZ['název elektrárny'] = data_CZ['název elektrárny'].str.strip()
data_CZ['oznacení bloku'] = data_CZ['oznacení bloku'].str.strip()
data_CZ['Typ'] = data_CZ['Typ'].str.strip()
data_CZ['Palivo'] = data_CZ['Palivo'].str.strip()

# Insert dummy G1 where plant block is NA
data_CZ.loc[data_CZ['oznacení bloku'].isna(), 'oznacení bloku'] = 'G1'

# Merge name and block to one column called "name"
data_CZ['name'] = data_CZ['název elektrárny'].map(str) + ' ' + data_CZ['oznacení bloku']

# Rename columns
dict_columns_CZ = {'Typ': 'technology',
                   'Palivo': 'energy_source',
                   'výkon instalovaný (MW)': 'capacity'
                   }
data_CZ.rename(columns=dict_columns_CZ, inplace=True)

# Apply general template of columns
data_CZ = data_CZ.reindex(columns=columns_sorted)

# Adjust types of entries in all columns
data_CZ.capacity = data_CZ.capacity.astype(float)

3.5.3 Definition of generation technology types

Overall translation of all technology types mentioned in the column "technology".

In [ ]:
# Translate energy source
dict_energy_source_CZ = {'VODA': 'Hydro',
                        'PLYN': 'Natural gas',
                        'OLEJ': 'Oil',
                        'URAN': 'Nuclear',
                        'HU': 'Lignite',
                        'CU': 'Hard coal',
                        'BIO': 'Bioenergy'}
data_CZ["energy_source"].replace(dict_energy_source_CZ, inplace=True)

# Translate technologies
dict_technologies_CZ = {'PE': 'Steam turbine',
                        'PPE': 'Combined cycle',
                        'PSE': 'Combined cycle',
                        'JE': 'Steam turbine',
                        'VE': np.nan,
                        'PVE': 'Pumped storage'}
data_CZ["technology"].replace(dict_technologies_CZ, inplace=True)

3.5.4 Additional information on further power plants, geographic coordinates and EIC codes

In this section a manually compiled list is used to define the geographic coordinates of indivdual power plants.

In [ ]:
data_CZ = add_location_and_EIC('CZ', data_CZ)

# add source
data_CZ["source"] = "https://www.ceps.cz/cs/priprava-provozu"
    
data_CZ.head()

3.6 Switzerland CH

3.6.1 Data import

The data is provided by the Swiss Ministry of Energy BFE. It encompasses a detailed list of Swiss hydro generation units with comprehensive information on technical specifications. The list of nuclear generators is manually assembled and provided separately.

In [ ]:
filepath_CH_hydro = os.path.join(data_directory, 'CH', '2018  Statistik der Wasserkraftanlagen der Schweiz 31.12.2018.csv')
data_CH = pd.read_csv(filepath_CH_hydro, error_bad_lines=False, sep=';',decimal=',')
In [ ]:
filepath_CH_nuclear = os.path.join(data_directory, 'CH', 'input_plant-list_CH_nuclear.csv')
data_nuclear_CH = pd.read_csv(filepath_CH_nuclear, encoding='utf-8', header=0, index_col=None)

3.6.2 Processing of Hydro generator list

In this section, the imported generator list of hydro generators is standardized.

Consolidation of columns

In [ ]:
# Merge columns "ZE-Erste Inbetriebnahme" and "ZE-Letzte Inbetriebnahme" to one column called "Commissioned"
data_CH['commissioned'] = data_CH[
    ['ZE-Erste-Inbetriebnahme', 'ZE-Letzte-Inbetriebnahme']].apply(
        lambda x: max(x[0], x[1]), axis=1)

# Merge columns "Bemerkung (1) - (10)" to one column "Comment"
data_CH['comment'] = data_CH[['Bemerkung (1)',
                              'Bemerkung (2)',
                              'Bemerkung (3)',
                              'Bemerkung (4)',
                              'Bemerkung (5)',
                              'Bemerkung (6)',
                              'Bemerkung (7)',
                              'Bemerkung (8)',
                              'Bemerkung (9)',
                              'Bemerkung (10)']].apply(
                                  lambda x:
                                      '{}; {}; {}; {}; {}; {}; {}; {}; {}; {}'.format(
                                          x[0],
                                          x[1],
                                          x[2],
                                          x[3],
                                          x[4],
                                          x[5],
                                          x[6],
                                          x[7],
                                          x[8],
                                          x[9]), axis=1)

data_CH['comment'] = data_CH['comment'].str.replace('nan; ', '')
data_CH['comment'] = data_CH['comment'].str.replace('nan', '')

Translation and harmonization of columns

The imported data is standardized with respect to the columns as defined in section 2.3. In a first step, existing and output-relevant columns are translated and remaining columns are deleted in a second step. Columns which are not exist in the data set, but required for the output are additionally added in this process.

In [ ]:
# Translate columns
dict_columns_CH = {'WKA-Name': 'name',
                   'ZE-Standort': 'city',
                   'WKA-Typ': 'technology',
                   'ZE-Status': 'availability',
                   'Inst. Turbinenleistung': 'capacity'}
data_CH.rename(columns=dict_columns_CH, inplace=True)

# Adjust type of entries in column "Capacity"
data_CH.capacity = data_CH.capacity.astype(float)

# Adjust availabilities
dict_availabilities_CH = {'im Normalbetrieb': '1',
                          'im Bau': '0',
                          'im Umbau': '0',
                          'stillgelegt': '0'}
data_CH["availability"].replace(dict_availabilities_CH, inplace=True)

# List only operating plants
data_CH = data_CH[data_CH.availability != '0']

# Apply general template of columns
data_CH = data_CH.reindex(columns=columns_sorted)

Definition of generation technology types

In [ ]:
# Set energy source to "hydro"
data_CH['energy_source'] = 'Hydro'

# Adjust technologies
dict_technologies_CH = {'L': 'Run-of-river',
                        'S': 'Reservoir',
                        'P': 'Pumped storage with natural inflow',
                        'U': 'Pumped storage'}
data_CH["technology"].replace(dict_technologies_CH, inplace=True)

3.6.3 Merge hydro and nuclear power plant data

In [ ]:
# add source for hydro
data_CH["source"] = "https://www.bfe.admin.ch/bfe/de/home/versorgung/statistik-und-geodaten/geoinformation/geodaten/wasser/statistik-der-wasserkraftanlagen.html"

# Concat dataframes
data_CH = pd.concat([data_CH, data_nuclear_CH], ignore_index=True, sort=False)

data_CH.head()

3.6.4 Additional information on geographic coordinates and EIC codes

In this section a manually compiled list is used to define the geographic coordinates of indivdual hydro power plants.

In [ ]:
data_CH = add_location_and_EIC('CH', data_CH)

data_CH.head()

3.7 Italy IT

3.7.1 Data import

The data is provided by the Italian transmission network operator TERNA. It encompasses a detailed list of Italian generation units of more than 100 MW.

In [ ]:
filepath_IT = os.path.join(data_directory, 'IT', '18.XLSX')
data_IT = pd.read_excel(filepath_IT, sheet_name='UPR PmaxOver 100MW') 

data_IT.head()

3.7.2 Translation and harmonization of columns

The imported data is standardized with respect to the columns as defined in section 2.3. In a first step, existing and output-relevant columns are translated and remaining columns are deleted in a second step. Columns which are not exist in the data set, but required for the output are additionally added in this process.

In [ ]:
# Translate columns
dict_columns_IT = {'Descrizione Impianto': 'name',
                   'TIPOLOGIA': 'energy_source',
                   'Comune': 'city',
                   'PMAX [MW]': 'capacity',
                   'Country': 'country',
                   'Source': 'source',
                   'Zona': 'additional_info'}
data_IT.rename(columns=dict_columns_IT, inplace=True)

# Apply general template of columns
data_IT = data_IT.reindex(columns=columns_sorted)

# Consider of geographical information in column "additional_info"
data_IT['additional_info'] = data_IT[['additional_info']].apply(
    lambda x: 'Zone: {}'.format(x[0]), axis=1)

# Adjust types of entries in all columns
data_IT.capacity = data_IT.capacity.astype(float)

3.7.3 Definition of energy sources

Overall translation of all energy source types mentioned in the column "energy_sources". Deletion of rows containing "wind" and "geothermal_power"as energy source.

In [ ]:
# Translate types of energy sources
dict_energysources_IT = {'GEOTERMICO': 'Geothermal',
                         'TERMOELETTRICO': 'Mixed fossil fuels',
                         'IDROELETTRICO': 'Hydro',
                         'EOLICO': 'Wind'}
data_IT["energy_source"].replace(dict_energysources_IT, inplace=True)

# Delete unwanted energy sources in column "energy_source"
data_IT = data_IT[data_IT.energy_source != 'Wind']
data_IT = data_IT[data_IT.energy_source != 'Geothermal']

3.7.4 Additional information on geographic coordinates and EIC codes

In this section a manually compiled list is used to define the geographic coordinates of indivdual power plants.

In [ ]:
data_IT = add_location_and_EIC('IT', data_IT)

# add source
data_IT["source"] = "http://www2018.terna.it/it-it/sistemaelettrico/transparencyreport/generation/installedgenerationcapacity.aspx"

data_IT.head()

3.8 Finland FI

3.8.1 Data import

The data is provided by the Finnish Energy Authority. It encompasses a detailed list of Finnish generation units of at least one megavolt ampere [1 MVA].

In [ ]:
filepath_FI = os.path.join(data_directory, 'FI', 'Energiaviraston voimalaitosrekisteri.csv')
data_FI = pd.read_csv(filepath_FI, sep=';')  

data_FI.head()

3.8.2 Translation and harmonization of columns

The imported data is standardized with respect to the columns as defined in section 2.3. In a first step, existing and output-relevant columns are translated and remaining columns are deleted in a second step. Columns which are not exist in the data set, but required for the output are additionally added in this process.

In [ ]:
# Generate entries in column "CHP"
data_FI.loc[data_FI[
    'Combined Heat and Power Production, Industry,Maximum, Total, MW'] > 0,
    'chp'] = 'Yes'
data_FI.loc[data_FI[
    'Combined Heat and Power Production, District Heating, Total, MW'] > 0,
    'chp'] = 'Yes'

# Rename columns
dict_columns_FI = {'Name': 'name',
                   'Company': 'company',
                   'Type': 'type',
                   'Address': 'street',
                   'Town': 'city',
                   'Postal code': 'postcode',
                   'Maximum, total, MW': 'capacity',
                   'Main fuel': 'energy_source',
                   'Country': 'country',
                   'Source': 'source'}
data_FI.rename(columns=dict_columns_FI, inplace=True)

# Apply general template of columns
data_FI = data_FI.reindex(columns=columns_sorted)

# Adjust types of entries in all columns
data_FI.capacity = data_FI.capacity.astype(float)

3.8.3 Definition of energy sources

Overall translation of all energy sources types mentioned in the column "energy_sources". Generation of entries for the column "energy_scoures" according to information given in the column "type".

In [ ]:
# Rename types of energy sources
dict_energysources_FI = {'Biogas': 'Biomass and biogas',
                         'Black liquor and concentrated liquors': 'Biomass and biogas',
                         'Blast furnace gas': 'Other fossil fuels',
                         'By-products from wood processing industry': 'Biomass and biogas',
                         'Exothermic heat from industry': 'Other or unspecified energy sources',
                         'Forest fuelwood': 'Biomass and biogas',
                         'Gasified waste': 'Non-renewable waste',
                         'Hard coal and anthracite': 'Hard coal',
                         'Heavy distillates': 'Oil',
                         'Industrial wood residues': 'Biomass and biogas',
                         'Light distillates': 'Oil',
                         'Medium heavy distillates': 'Oil',
                         'Mixed fuels': 'Mixed fossil fuels',
                         'Natural gas': 'Natural gas',
                         'Nuclear energy': 'Nuclear',
                         'Other by-products and wastes used as fuel': 'Other fossil fuels',
                         'Other non-specified energy sources': 'Other or unspecified energy sources',
                         'Peat': 'Biomass and biogas',
                         ' ': 'Other or unspecified energy sources',
                         np.nan: 'Other or unspecified energy sources'}
data_FI["energy_source"].replace(dict_energysources_FI, inplace=True)
data_FI["energy_source"].replace('NaN', np.nan, inplace=True)

# Generate entries in column "energy_sources" for hydro and wind stations according to column "type"
data_FI.loc[data_FI['type'] == 'Hydro power', 'energy_source'] = 'Hydro'
data_FI.loc[data_FI['type'] == 'Wind power', 'energy_source'] = 'Wind'

3.8.4 Definition of generation technology types

Generation of entries for the column "technology" according to information given in the column "energy_source". Deletion of rows containing "wind" as energy source.

In [ ]:
# Generate entries in column "technology" according to column "energy_source"
data_FI.loc[data_FI['energy_source'] == 'Nuclear',
            'technology'] = 'Steam turbine'
data_FI.loc[data_FI['energy_source'] == 'Hard coal',
            'technology'] = 'Steam turbine'

# Delete unwanted energy source (wind) in column "energy_source"
data_FI = data_FI[data_FI.energy_source != 'Wind']

3.8.5 Definition of generation type

Overall translation of all types mentioned in the column "type" and subsequent translation check.

In [ ]:
# Rename types
dict_types_FI = {'District heating CHP': 'CHP',
                 'Hydro power': 'NaN',
                 'Industry CHP': 'IPP',
                 'Nuclear energy': 'NaN',
                 'Separate electricity production': 'NaN',
                 'Wind power': 'NaN'}
data_FI["type"].replace(dict_types_FI, inplace=True)
data_FI["type"].replace('NaN', np.nan, inplace=True)

# drop solar generator (redundant)
data_FI = data_FI[data_FI.type != 'Solar']

3.8.6 Additional information on geographic coordinates and EIC codes

In this section a manually compiled list is used to define the geographic coordinates of indivdual power plants.

In [ ]:
data_FI = add_location_and_EIC('FI', data_FI)

# add source
data_FI["source"] = "https://energiavirasto.fi/toimitusvarmuus"

data_FI.head()

3.9 Spain ES

3.9.1 Data import

The data is provided by the Spanish SEDE - Ministry of Industry, Energy and Tourism. It encompasses a detailed list of Spanish generation units with comprehensive information on technologies and energy fuels.

In [ ]:
filepath_ES = os.path.join(data_directory, 'ES', 'Registro_16_12_2019.csv')
data_ES = pd.read_csv(filepath_ES, error_bad_lines=False, sep=';',decimal=',')  

data_ES.head()

3.9.2 Translation and harmonization of columns

Overall adjustment of all columns within the dataframe. Translation, addition, deletion, sorting of columns as well as adjustment of the column entries' types. Adjustment of the entries' units from kW to MW in the columns "Capacity" (corresponding to the net capacity in the original data set).

In [ ]:
# Select value of 'Potencia Instalada en MW' if gross capacity is empty
data_ES['Potencia Bruta Total en MW'] = np.where(data_ES['Potencia Bruta Total en MW'].isnull(),data_ES['Potencia Instalada en MW'],data_ES['Potencia Bruta Total en MW'])

# Rename columns
dict_columns_ES = {'Nombre del Titular de la Unidad de Producción': 'company',
                   'Nombre de la Unidad de Producción': 'name',
                   'Municipio de la Unidad de Producción': 'city',
                   'CPostal del Titular': 'postcode',
                   'Tecnología de la Unidad de Producción': 'technology',
                   'Comment': 'comment',
                   'Potencia Bruta Total en MW': 'capacity',
                   'Tipo de Unidad de Producción': 'energy_source',
                   'Fecha de la puesta en servicio de la Unidad de Producción': 'commissioned',
                   'Country': 'country',
                   'Source': 'source'}
data_ES.rename(columns=dict_columns_ES, inplace=True)

#Fix capacity entries to float
numeric_capacity = []
for cap in data_ES.capacity:
    if isinstance(cap, float):
        numeric_capacity.append(cap)
    else:
        split_cap = cap.split(',')
        if len(split_cap) == 1:
            numeric_capacity.append(int(split_cap[0]))
        elif len(split_cap) == 2:
            numeric_capacity.append(int(split_cap[0]) + float('.' + split_cap[1]))
        elif len(split_cap) == 3:
            numeric_capacity.append(int(split_cap[0])*1000 + int(split_cap[1]) + float('.' + split_cap[2]))
        else:
            numeric_capacity.append(np.nan)
   
data_ES["capacity"] = numeric_capacity  

# Apply general template of columns
data_ES = data_ES.reindex(columns=columns_sorted)

3.9.3 Definition of energy sources

Overall translation of all energy sources types mentioned in the column "energy_sources".

In [ ]:
dict_energysources_ES = {'Biocombustibles liquidos': 'Biomass and biogas',
                         'Biogas': 'Biomass and biogas',
                         'Biogas de digestion': 'Biomass and biogas',
                         'Biogas de vertedero': 'Biomass and biogas',
                         'Biomasa industrial agricola': 'Biomass and biogas',
                         'Biomasa industrial forestal': 'Biomass and biogas',
                         'Biomasa primaria': 'Biomass and biogas',
                         'Calor residual': 'Other or unspecified energy sources',
                         'Carbon': 'Hard coal',
                         'CARBON IMPORTADO': 'Hard coal',
                         'Cultivos energeticos agricolas o forestales': 'Biomass and biogas',
                         'DIESEL': 'Oil',
                         'Energias residuales': 'Non-renewable waste',
                         'Fuel': 'Oil',
                         'FUEL-OIL 0,3': 'Oil',
                         'FUELOLEO': 'Oil',
                         'GAS DE REFINERIA': 'Natural gas',
                         'Gas natural': 'Natural gas',
                         'GAS NATURAL': 'Natural gas',
                         'Gas residual': 'Natural gas',
                         'Gasoleo': 'Oil',
                         'GASOLEO': 'Oil',
                         'HULLA+ANTRACITA': 'Hard coal',
                         'Licores negros': 'Biomass and biogas',
                         'LIGNITO NEGRO': 'Lignite',
                         'LIGNITO PARDO': 'Lignite',
                         'NUCLEAR': 'Nuclear',
                         'Propano': 'Natural gas',
                         'Residuo aprovechamiento forestal o selvicola': 'Other bioenergy and renewable waste',
                         'Residuos': 'Non-renewable waste',
                         'Residuos actividad agricolas o jardineria': 'Other bioenergy and renewable waste',
                         'Residuos industriales': 'Non-renewable waste',
                         'Residuos solidos urbanos': 'Non-renewable waste',
                         'RESIDUOS SOLIDOS URBANOS': 'Non-renewable waste',
                         ' ': 'Other or unspecified energy sources',
                         np.nan: 'Other or unspecified energy sources',
                        'HIDRÁULICA': 'Hydro',
                        'TERMONUCLEAR': 'Nuclear',
                        'TÉRMICA': 'Hard coal',
                        'TÉRMICA CLÁSICA': 'Hard coal'}

data_ES["energy_source"].replace(dict_energysources_ES, inplace=True)
data_ES["energy_source"].replace('NaN', np.nan, inplace=True)

3.9.4 Definition of generation technology types

Overall translation of all technology types mentioned in the column "technology".

In [ ]:
# Translate  technologies
data_ES.loc[data_ES.technology == "COGENERACIÓN", "chp"] = "yes"

data_ES["technology"].replace('COGENERACIÓN','chp', inplace=True)

dict_technologies_ES = {'FLUYENTE': 'Run-of-river',
                        'EMBALSE': 'Reservoir',
                        'BOMBEO MIXTO': 'Pumped storage with natural inflow',
                        'CT CARBÓN': '',
                        'CN PWR': '',
                        'CN BWR': '',
                        'COGENERACIÓN': 'chp',
                        'Turbinas de Vapor de Fuel': '',
                        'CICLO COMBINADO': 'Combined cycle',
                        'Ciclo combinado configuración 2x1': 'Combined cycle',
                        'RESÍDUOS SÓLIDOS URBANOS': '',
                        'Turbinas de vapor de Carbón': 'Steam turbine',
                        'Ciclo combinado configuración 3x1': 'Combined cycle',
                        'Turbinas de gas aeroderivadas': 'Gas turbine',
                        'TURBINA DE GAS Y DE VAPOR': 'Gas turbine',
                        'BOMBEO PURO': 'Pumped storage',
                        'CT FUELOLEO': '',
                        'Grupos Diésel - 4T': '',
                        'MOTORES DIESEL': '',
                        'Turbinas de gas heavy duty': 'Gas turbine',
                        'Grupos Diésel - 2T': '',
                        'BOMBEO+ EOLICA': '',
                        'TURBINA DE GAS': 'Gas turbine',
                        'CT FUEL-GAS': '',
                        }
data_ES.loc[:, "technology"] = data_ES["technology"].replace(dict_technologies_ES)

data_ES.loc[data_ES.technology == "chp", "chp"] = "yes"

3.9.5 Delete unwanted energy sources/power stations with no names and adjust commissioning year

Exclude renewable energy sources, delete power stations with no names and capacities, and adjust the format of commissioning year.

In [ ]:
# Delete unwanted energy source in column "energy_source" & "technology"
data_ES = data_ES[data_ES.energy_source != 'TERMOELÉCTRICA']
data_ES = data_ES[data_ES.energy_source != 'COGENERACIÓN']
data_ES = data_ES[data_ES.energy_source != 'EXPERIMENTAL']
data_ES = data_ES[data_ES.name != '']

# Delete power stations with no name and no capacities
data_ES = data_ES[data_ES.name.notna()]

# Map commissioned year to Timestamp col
data_ES['commissioned_year'] = pd.to_datetime(data_ES['commissioned'], format='%d.%m.%Y %H:%M')
# Reassing commissioned col with year only
mask = data_ES['commissioned_year'].notna()
data_ES.loc[mask, 'commissioned'] = data_ES.loc[mask].commissioned_year.apply(lambda x: x.year)
# Drop not needed col
data_ES.drop('commissioned_year', axis=1, inplace=True)

3.9.6 Additional information on geographic coordinates and EIC codes

In this section a manually compiled list is used to define the geographic coordinates of indivdual power plants.

In [ ]:
data_ES = add_location_and_EIC('ES', data_ES)

# add source
data_ES["source"] = "https://sede.minetur.gob.es/en-US/datosabiertos/catalogo/registro-productores-electrica"

data_ES.head()

3.10 United Kingdom UK

3.10.1 Data import

The data is provided by the British government's Statistical Office. It encompasses a detailed list of British generation units with comprehensive information on technologies and energy fuels.

In [ ]:
filepath_UK = os.path.join(data_directory, 'UK', 'DUKES_5.11_UK.csv')
data_UK = pd.read_csv(filepath_UK, sep=';') 

data_UK.head()

3.10.2 Translation and harmonization of columns

The imported data is standardized with respect to the columns as defined in section 2.3. In a first step, existing and output-relevant columns are translated and remaining columns are deleted in a second step. Columns which are not exist in the data set, but required for the output are additionally added in this process.

In [ ]:
# Rename sixth column
data_UK.columns.values[6] = 'Location'


# Rename columns
dict_columns_UK = {'Company Name': 'company',
                   'Station Name': 'name',
                   'Installed Capacity (MW)': 'capacity',
                   'Country': 'country',
                   'Location': 'location',
                   'Fuel': 'energy_source',
                   'Year of commission or year generation began': 'commissioned',
                   'Source': 'source'}
data_UK.rename(columns=dict_columns_UK, inplace=True)

        
dict_regions_UK = {'East': 'England',
                   'East Midlands': 'England',
                   'London': 'England',
                   'North East': 'England',
                   'North West': 'England',
                   'South East': 'England',
                   'South West': 'England',
                   'West Midlands': 'England',
                   'Yorkshire and the Humber': 'England',
                   'N Ireland': 'Northern Ireland'}
data_UK["location"]=data_UK["location"].replace(dict_regions_UK, inplace=True)

# Merge columns "Country" and "Location" to one column called "Country"
data_UK['additional_info'] = data_UK[['location']].apply(
    lambda x: 'Region: {}'.format(x[0]), axis=1)

# Drop column "Location" after merger
colsToDrop = ['location']
data_UK = data_UK.drop(colsToDrop, axis=1)

# Apply general template of columns
data_UK = data_UK.reindex(columns=columns_sorted)

#Set specific territory to "additional_info"
data_UK['additional_info'] = data_UK['country']

# Solve comma problem in capacity column and convert to float
data_UK.capacity = data_UK.capacity.str.replace(',', '').astype(float)

3.10.3 Definition of generation technology types

Generation of entries for the column "technology" according to information given in the column "energy_source".

In [ ]:
# Generate entries in column "technology" according to column "energy_source"
data_UK.loc[data_UK['energy_source'] == 'Hydro / pumped storage', 
            'technology'] = 'Pumped storage'
data_UK.loc[data_UK['energy_source'] == 'Pumped storage',
            'technology'] = 'Pumped storage'
data_UK.loc[data_UK['energy_source'] == 'Wind',
            'technology'] = 'Onshore'
data_UK.loc[data_UK['energy_source'] == 'Wind (offshore)',
            'technology'] = 'Offshore'
data_UK.loc[data_UK['energy_source'] == 'Nuclear',
            'technology'] = 'Steam turbine'
data_UK.loc[data_UK['energy_source'] == 'CCGT',
            'technology'] = 'Combined cycle'
data_UK.loc[data_UK['energy_source'] == 'OCGT',
            'technology'] = 'Gas turbine'

3.10.4 Definition of energy sources

Overall translation of all energy sources types mentioned in the column "energy_source" and subsequent translation check. Deletion of rows containing "wind" as energy source.

In [ ]:
dict_energysources_UK = {'Biomass': 'Biomass and biogas',
                         'Biomass / gas / waste derived fuel': 'Mixed fossil fuels',
                         'Natural Gas': 'Natural gas',
                         'CCGT': 'Natural gas',
                         'Sour gas': 'Natural gas',
                         'Coal': 'Hard coal',
                         'Coal / biomass': 'Mixed fossil fuels',
                         'Coal / biomass / gas / waste derived fuel': 'Mixed fossil fuels',
                         'Coal / oil': 'Mixed fossil fuels',
                         'Coal/oil': 'Mixed fossil fuels',
                         'Diesel': 'Oil',
                         'Gas': 'Natural gas',
                         'Gas / oil': 'Mixed fossil fuels',
                         'Diesel/gas oil': 'Mixed fossil fuels',
                         'Gas oil': 'Oil',
                         'Gas oil / kerosene': 'Oil',
                         'Hydro': 'Hydro',
                         'Hydro / pumped storage': 'Hydro',
                         'Pumped Storage': 'Hydro',
                         'Light oil': 'Oil',
                         'Meat & bone meal': 'Other bioenergy and renewable waste',
                         'Nuclear': 'Nuclear',
                         'OCGT': 'Natural gas',
                         'Oil': 'Oil',
                         'Light oil ': 'Oil',
                         'Pumped storage': 'Hydro',
                         'Straw': 'Biomass and biogas',
                         'Biomass (wood pellets, sunflower/oat husk pellets)': 'Biomass and biogas',
                         'Biomass (woodchip)': 'Biomass and biogas',
                         'Biomass (litter, woodchip)': 'Biomass and biogas',
                         'Biomass (meat and bone meal)': 'Biomass and biogas',
                         'Biomass (poultry litter, waste wood)': 'Biomass and biogas',
                         'Biomass (straw)': 'Biomass and biogas',
                         'Biomass (recycled wood)': 'Biomass and biogas',
                         'Biomass (poultry litter, woodchip)': 'Biomass and biogas',
                         'Biomass (wood pellets)': 'Biomass and biogas',
                         'Waste (municipal solid waste)': 'Non-renewable waste',
                         'Biomass (recycled wood, virgin wood)': 'Biomass and biogas',
                         'Biomass (virgin wood)': 'Biomass and biogas',
                         'Waste': 'Non-renewable waste',
                         'Waste (anaerobic digestion)': 'Non-renewable waste',
                         'Wind': 'Wind',
                         'Wind (offshore)': 'Wind',
                         'Wind (onshore)': 'Wind',
                         'Solar': 'Solar'}
data_UK["energy_source"].replace(dict_energysources_UK, inplace=True)

# Delete unwanted energy sources
data_UK = data_UK[data_UK.energy_source != 'Wind']
data_UK = data_UK[data_UK.energy_source != 'Solar']

3.10.5 Additional information on geographic coordinates and EIC codes

In this section a manually compiled list is used to define the geographic coordinates of indivdual power plants.

In [ ]:
data_UK = add_location_and_EIC('UK', data_UK)

# add source
data_UK["source"] = "https://www.gov.uk/government/statistics/electricity-chapter-5-digest-of-united-kingdom-energy-statistics-dukes#content"

data_UK.head()

3.11 Norway NO

The data is provided by the Norwegian Water Resources and Energy Directorate. It encompasses a database on the installed capacity of thermal generators as well as on hydro generators.

3.11.1 Data import

In [ ]:
filepath_NO_hydro = os.path.join(data_directory, 'NO', 'Vannkraftverk.csv')
data_NO_hydro = pd.read_csv(filepath_NO_hydro, 
                skiprows=2,
                sep=";",
                decimal=",",
                header=0,
                index_col=False,
                encoding='latin-1')
In [ ]:
filepath_NO_thermal = os.path.join(data_directory, 'NO', 'termiske-kraftverk-i-norge-2019.xlsx')
data_NO_thermal = pd.read_excel(filepath_NO_thermal,
                sheet_name='Ark1')

3.11.2 Hydro generators

Translation and harmonization of columns

The imported data is standardized with respect to the columns as defined in section 2.3. In a first step, existing and output-relevant columns are translated and remaining columns are deleted in a second step. Columns which are not exist in the data set, but required for the output are additionally added in this process.

In [ ]:
# Translate columns
dict_columns_NO_hydro = {'Navn': 'name',
                         'Type': 'technology',
                         'Kommune': 'city',
                         'Kommunenr.': 'postcode',
                         'Maks ytelse [MW]': 'capacity',
                         'Hovedeier': 'company',
                         'Dato for første utnyttelse av fallet': 'commissioned',
                         'Elspotområde': 'additional_info'}
data_NO_hydro.rename(columns=dict_columns_NO_hydro, inplace=True)

# Apply general template of columns
data_NO_hydro = data_NO_hydro.reindex(columns=columns_sorted)

# Fill with general information
data_NO_hydro['country'] = 'NO'
data_NO_hydro['energy_source'] = 'Hydro'
data_NO_hydro['additional_info'] = 'Zone: NO' + data_NO_hydro['additional_info'].astype(str)
# Change comissioning date to year only
data_NO_hydro['commissioned'] = data_NO_hydro['commissioned'].apply(lambda x: x[0:4]).astype(int)

Definition of energy sources

Overall translation of all technology types mentioned in the column "technology".

In [ ]:
# Add comment for plants with pump only (later categorized as pumped storage)
data_NO_hydro.loc[data_NO_hydro['technology'] == 'Pumpe', 'comment'] = 'Pump only'
# Take absolute of negative capacity of plants with pump only
mask = data_NO_hydro['technology'] == 'Pumpe'
data_NO_hydro.loc[mask, 'capacity'] = data_NO_hydro.loc[mask].capacity.apply(lambda x: abs(x))

# Translate technologies
dict_technologies_NO_hydro = {
    'Kraftverk': 'Reservoir',
    'Pumpekraftverk': 'Pumped storage',
    'Pumpe': 'Pumped storage'
    }
data_NO_hydro['technology'].replace(dict_technologies_NO_hydro, inplace=True)

data_NO_hydro.head()

3.11.3 Thermal generators

Translation and harmonization of columns

The imported data is standardized with respect to the columns as defined in section 2.3. In a first step, existing and output-relevant columns are translated and remaining columns are deleted in a second step. Columns which are not exist in the data set, but required for the output are additionally added in this process.

In [ ]:
# Translate columns
dict_columns_NO_thermal = {'Kraftverk': 'name',
                         'Kommentar': 'comment',
                         'Brensel': 'energy_source',
                         'Kommune': 'city',
                         'Kommunenr.': 'postcode',
                         'Installert effekt [MW]': 'capacity',
                         'Idriftsettelsesår': 'commissioned',
                         'Elspotområde': 'additional_info'}
data_NO_thermal.rename(columns=dict_columns_NO_thermal, inplace=True)

# Apply general template of columns
data_NO_thermal = data_NO_thermal.reindex(columns=columns_sorted)

# Fill with general information
data_NO_thermal['country'] = 'NO'
data_NO_thermal['additional_info'] = 'Zone: ' + data_NO_thermal['additional_info'].astype(str)

Definition of energy sources

Overall translation of all technology types mentioned in the column "technology".

In [ ]:
# Translate energy sources
dict_energy_sources_NO_thermal = {
    'Avfallsforbrenning': 'Non-renewable waste',
    'Varmegjenvinning': 'Other fossil fuels',
    'Naturgass': 'Natural gas',
    'Biogass fra avfall': 'Biomass and biogas',
    'Ukjent': np.nan,
    'Bark, returfiberavfall, slam, rivningsvirke og olje ': 'Mixed fossil fuels',
    'Flis fra impregnert tre, avfallsforbrenning': 'Other bioenergy and renewable waste',
    'Biogass': 'Biomass and biogas',
    'CO gass': 'Other fossil fuels'
    }
data_NO_thermal['energy_source'].replace(dict_energy_sources_NO_thermal,
                                         inplace=True)

# Delete unwanted energy sources
data_NO_thermal = data_NO_thermal[data_NO_thermal.energy_source != 'Biomass and biogas']
data_NO_thermal = data_NO_thermal[data_NO_thermal.energy_source != 'Other bioenergy and renewable waste']

data_NO_thermal.head()

3.11.4 Combine hydro and thermal data frames

In [ ]:
# add sources
data_NO_hydro["source"] = "https://www.nve.no/energiforsyning/kraftproduksjon/vannkraft/vannkraftdatabase/#"
data_NO_thermal["source"] = "https://www.nve.no/media/8967/termiske-kraftverk-i-norge-2019.xlsx"

data_NO = pd.concat([data_NO_hydro, data_NO_thermal], ignore_index=True)

data_NO.head()

3.12 Sweden SE

The data is provided by the power exchange Nordpool. It encompasses a detailed list of Swedish generation units with a capacity of more than 100 MW for 2014. Since there was no new data on the Swedish generators found, the list from 2014 is still used in this release.

In [ ]:
filepath_SE = os.path.join(data_directory, 'SE', 'input_plant-list_SE.csv')
data_SE = pd.read_csv(filepath_SE, encoding='utf-8', header=0, index_col=None)

data_SE.head()

3.13 Slovakia SK

The data is provided by the Slovakian utility Slovenské elektrárne a.s. (SEAS). It encompasses a detailed list of Slovak generation units with comprehensive information on technologies and energy fuels.

In [ ]:
filepath_SK = os.path.join(data_directory, 'SK', 'input_plant-list_SK.csv')
data_SK = pd.read_csv(filepath_SK, encoding='utf-8', header=0, index_col=None) 

data_SK.head()

3.14 Slovenia SI

The data is provided by several Slovenian utilities. The respective data links are given in the column "source". This list encompasses Slovenian generation units with comprehensive information on technologies and energy fuels.

In [ ]:
filepath_SI = os.path.join(data_directory, 'SI', 'input_plant-list_SI.csv')
data_SI = pd.read_csv(filepath_SI, encoding='utf-8')

data_SI.head()

3.15 Austria AT

The data for conventional power plants is provided by several Austrian utilities. The respective data links are given in the column "source". The specifications of Austrian hydro power plants, however, solely are based on Verbund AG. The resulting list encompasses Austrian generation units with comprehensive information on technologies and energy fuels.

3.15.1 Data import

In [ ]:
filepath_AT_hydro = os.path.join(data_directory, 'AT', 'input_plant-list_AT_hydro.csv')
data_AT_hydro = pd.read_csv(filepath_AT_hydro, encoding="latin1")
In [ ]:
filepath_AT_thermal = os.path.join(data_directory, 'AT', 'input_plant-list_AT_thermal.csv')
data_AT_thermal = pd.read_csv(filepath_AT_thermal, encoding="latin1")

3.15.2 Translation and harmonization of columns

The imported data is standardized with respect to the columns as defined in section 2.3. In a first step, existing and output-relevant columns are translated and remaining columns are deleted in a second step. Columns which are not exist in the data set, but required for the output are additionally added in this process.

In [ ]:
#Delete MW in capacity column
data_AT_hydro.capacity = data_AT_hydro.capacity.apply(lambda x: x.replace('MW',''))

#Apply general template of columns
data_AT_hydro = data_AT_hydro.reindex(columns=columns_sorted)
data_AT_thermal = data_AT_thermal.reindex(columns=columns_sorted)

3.15.3 Combine hydro and thermal data frames

In [ ]:
data_AT = pd.concat([data_AT_hydro, data_AT_thermal], ignore_index=True)

data_AT.head()

3.16 Denmark DK

The data is assembled using the information of several websites. The sources can be found within the document. It encompasses a detailed list of Danish generation units with comprehensive information on technologies and energy fuels.

3.16.1 Data import

In [ ]:
filepath_DK = os.path.join(data_directory, 'DK', 'input_plant-list_DK.csv')
data_DK = pd.read_csv(filepath_DK, encoding='utf-8', header=0, index_col=None)

3.17.2 Translation and harmonization of columns

All generators, which are not or only partly available, are dropped. The imported data is then standardized with respect to the columns as defined in section 2.3.

In [ ]:
# List only operating plants
data_DK = data_DK[data_DK.availability != '0']
data_DK = data_DK[data_DK.availability != 'partly']

#Drop unwanted columns
data_DK = data_DK.drop('availability', axis=1)

# Apply general template of columns
data_DK=data_DK.reindex(columns=columns_sorted)

data_DK.head()

4. Consolidation of processed country data

In the following, the national datasets are consolidated to a single European dataset. Unfortunately, the Belgian dataset cannot be integrated due to the copyright by the data owner ELIA.

In [ ]:
dataframes = [data_BE,
              data_NL,
              data_FR,
              data_PL,
              data_CZ,
              data_CH,
              data_IT,
              data_FI,
              data_ES,
              data_UK,
              data_NO,
              data_SE,
              data_SK,
              data_SI,
              data_AT,
              data_DK]

data_EU = pd.concat(dataframes, sort=False)

data_EU.head()

4.1 Implementation of energy source levels

In [ ]:
# Import energy source level definition
energy_source_levels = pd.read_csv(
    os.path.join('input', 'energy_source_levels.csv'), index_col=None, header=0)

# Merge energy source levels to data set
data_EU = data_EU.reset_index().merge(
    energy_source_levels,
    how='left',
    left_on='energy_source',
    right_on='energy_source_level_1').drop_duplicates(
        subset=['name',
                'city',
                'country',
                'capacity'], keep='first').set_index('name')

data_EU = data_EU.reset_index().merge(
    energy_source_levels,
    how='left',
    left_on='energy_source',
    right_on='energy_source_level_2').drop_duplicates(
        subset=['name',
                'city',
                'country',
                'capacity'], keep='first').set_index('name')

data_EU = data_EU.reset_index().merge(
    energy_source_levels,
    how='left',
    left_on='energy_source',
    right_on='energy_source_level_3').drop_duplicates(
        subset=['name',
                'city',
                'country',
                'capacity'], keep='first').set_index('name')

# Combine different energy source levels created by merge
data_EU['energy_source_level_1'] = data_EU[
    ['energy_source_level_1',
     'energy_source_level_1_x',
     'energy_source_level_1_y']].fillna('').sum(axis=1)

data_EU['energy_source_level_2'] = data_EU[
    ['energy_source_level_2',
     'energy_source_level_2_y']].fillna('').sum(axis=1)

data_EU['energy_source_level_3'] = data_EU[
    ['energy_source_level_3']].fillna('').sum(axis=1)

# Drop auxiliary columns due to merge
colsToDrop = ['energy_source_level_1_y',
              'energy_source_level_2_y',
              'energy_source_level_3_y',
              'energy_source_level_1_x',
              'energy_source_level_2_x',
              'energy_source_level_3_x']
data_EU = data_EU.drop(colsToDrop, axis=1)

# replace false energy source levels for plants without energy source in original data
index_with_NAN = data_EU[data_EU.energy_source.isna()].index

data_EU.loc[index_with_NAN,['energy_source_level_1', 
                            'energy_source_level_2', 
                            'energy_source_level_3']] = np.NaN

data_EU.loc[index_with_NAN, 'comment'] = 'Energy source not in original data'


data_EU.head()

4.2 Definition of structure and data types

First, we define the ordering of the columns. Secondly, the data types are redefined. At the moment, this has the drawback that empty columns are redefined as float instead of object.

In [ ]:
columns_sorted_output = ['name',
                         'company',
                         'street',
                         'postcode',
                         'city',
                         'country',
                         'capacity',
                         'energy_source',
                         'technology',
                         'chp',
                         'commissioned',
                         'type',
                         'lat',
                         'lon',
                         'eic_code',
                         'energy_source_level_1',
                         'energy_source_level_2',
                         'energy_source_level_3',
                         'additional_info',
                         'comment',
                         'source']

# Set ordering of columns
data_EU = data_EU.reset_index()
data_EU = data_EU.reindex(columns=columns_sorted_output)

# Set data types for columns
data_EU = data_EU.astype(str)
data_EU[['capacity', 'commissioned', 'lat', 'lon']] = data_EU[
    ['capacity', 'commissioned', 'lat', 'lon']].astype(float)

data_EU.replace('nan', np.nan, inplace=True)

# data_EU.dtypes

# Set index
data_EU = data_EU.set_index('name')

data_EU.head()

5. Result export

Obtain the DE list and concatinate. Note, the stand alone DE list contains more information.

In [ ]:
# call DE script if not already executed
# %run ./download_and_process_DE.ipynb
In [ ]:
data_DE = pd.read_csv('output/conventional_power_plants_DE.csv', index_col=0)

data_DE = data_DE.rename(columns={'eic_code_plant': 'eic_code',
                                  'capacity_net_bnetza': 'capacity'})
data_DE["source"] = "BNetzA/UBA"
data_DE["additional_info"] = ""

data_EU = pd.concat([data_EU, data_DE.loc[:, data_EU.columns]])

Write the final list to file.

In [ ]:
output_path = 'output'

data_EU.to_csv(os.path.join(
    output_path, 'conventional_power_plants_EU.csv'),
    encoding='utf-8',
    index_label='name')

data_EU.to_excel(
    os.path.join(output_path, 'conventional_power_plants_EU.xlsx'),
    sheet_name='plants',
    index_label='name')

data_EU.to_sql(
    'conventional_power_plants_EU',
    sqlite3.connect(os.path.join(output_path, 'conventional_power_plants.sqlite')),
    if_exists="replace",
    index_label='name')

End of script.