%run Requirements.ipynb
%run Country_ISO_Codes.ipynb
repo & folder name: GreenhouseData Requirements imported! Imported COUNTRY_DICT & CODE_GENERATOR_ISO3
All of these inventories have been homogenized (as much as possible/recommended) by making sure they all include the following characteristics:
Furthermore, thank you to the following people:
INFO:
print( ">>>>>>> BP <<<<<<<<")
# download file
# link_bp = "https://www.bp.com/content/dam/bp/business-sites/en/global/corporate/xlsx/energy-economics/statistical-review/bp-stats-review-2021-all-data.xlsx"
# resp = requests.get(link_bp)
# output = open('raw_data/BP/bp-stats-review-2021-all-data.xlsx', "wb")
# output.write(resp.content)
# output.close()
#read file
df_bp = pd.read_excel(io="../raw_data/BP/bp-stats-review-2021-all-data.xlsx",sheet_name=4,header=2)
#clean
df_bp = df_bp[df_bp[2020].notnull()]
df_bp = df_bp.rename(columns = {"Million tonnes of carbon dioxide":"name"})
df_bp = df_bp.drop(["2020.1","2009-19","2020.2"], axis=1)#deleting last three columns
df_bp.loc[:,1965:] = df_bp.loc[:,1965:].astype(float)
#add unit
df_bp.insert(1, "unit", "MtCO2")
#reindex
df_bp = df_bp.reset_index().drop("index", axis=1)
# CODE_GENERATOR_ISO3
df_bp.insert(0, "code", np.nan)
CODE_GENERATOR_ISO3(df_bp, show_missing=True,show_duplicates_nan=True)
# -----------------------------------------------------------------------------------------------------
#global?
df_bp_global = df_bp[df_bp.name=="Total World"].iloc[0,:]
df_bp_global.to_csv("../clean_data/global_subsets/BP_global.csv")
# ------------------------------------------------------------------------------------------------------------------
df_bp.to_csv("../clean_data/BP_global.csv", index=None)
df_bp = pd.read_csv("../clean_data/BP.csv")
df_bp
>>>>>>> BP <<<<<<<< PATCH APPLIED >>> 23 Missing codes (NaN) in this df ['Total North America', 'Central America', 'Other Caribbean', 'Other South America', 'Total S. & Cent. America', 'Other Europe', 'Total Europe', 'Other CIS', 'Total CIS', 'Other Middle East', 'Total Middle East', 'Eastern Africa', 'Middle Africa', 'Western Africa', 'Other Northern Africa', 'Other Southern Africa', 'Total Africa', 'Other Asia Pacific', 'Total Asia Pacific', 'Total World', 'of which: OECD', ' Non-OECD', ' European Union #'] >>> 0 DUPLICATES:
code | name |
---|
code | name | unit | 1965 | 1966 | 1967 | 1968 | 1969 | 1970 | 1971 | ... | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | CAN | Canada | MtCO2 | 260.328494 | 271.695135 | 285.543065 | 308.282174 | 320.455802 | 344.911566 | 351.587904 | ... | 554.684502 | 551.136751 | 564.581949 | 571.809431 | 570.155125 | 553.250422 | 565.912760 | 576.210989 | 577.997163 | 517.656723 |
1 | MEX | Mexico | MtCO2 | 62.133329 | 65.055673 | 66.647498 | 72.170911 | 79.106716 | 84.172180 | 89.461650 | ... | 472.973143 | 476.710063 | 483.185138 | 471.180127 | 475.225283 | 480.443636 | 486.075949 | 477.118539 | 459.758515 | 373.215914 |
2 | USA | US | MtCO2 | 3451.891855 | 3639.795004 | 3738.161319 | 3947.343459 | 4117.396247 | 4271.528132 | 4309.068601 | ... | 5348.439351 | 5101.537959 | 5268.307567 | 5277.560073 | 5165.570150 | 5060.806233 | 5003.178942 | 5166.031776 | 5029.389363 | 4457.219800 |
3 | NaN | Total North America | MtCO2 | 3774.353678 | 3976.545812 | 4090.351882 | 4327.796544 | 4516.958764 | 4700.611877 | 4750.118155 | ... | 6376.096995 | 6129.384774 | 6316.074655 | 6320.549632 | 6210.950558 | 6094.500291 | 6055.167650 | 6219.361305 | 6067.145041 | 5348.092438 |
4 | ARG | Argentina | MtCO2 | 82.175120 | 84.526808 | 86.852525 | 89.324243 | 92.183564 | 85.895493 | 90.804629 | ... | 174.763500 | 183.748817 | 189.430178 | 189.544434 | 192.767155 | 191.450785 | 189.685376 | 187.460685 | 175.817913 | 164.133062 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
97 | NaN | Total Asia Pacific | MtCO2 | 1431.313107 | 1549.491516 | 1615.940543 | 1721.616694 | 1958.204612 | 2234.981320 | 2445.576141 | ... | 14869.453290 | 15309.823063 | 15666.971249 | 15850.095276 | 15988.325151 | 16148.466989 | 16501.051245 | 16917.645565 | 17203.330308 | 16812.478080 |
98 | NaN | Total World | MtCO2 | 11189.710673 | 11694.894484 | 12055.547104 | 12701.480490 | 13483.695285 | 14291.709386 | 14762.395588 | ... | 32172.508226 | 32503.997053 | 33071.152064 | 33140.667330 | 33206.145249 | 33361.880092 | 33726.868179 | 34351.098958 | 34356.611795 | 32318.644854 |
99 | NaN | of which: OECD | MtCO2 | 7701.282862 | 8003.175822 | 8270.866419 | 8782.764892 | 9306.691008 | 9791.451164 | 9934.096882 | ... | 12857.597343 | 12667.868636 | 12767.240569 | 12553.922901 | 12473.073827 | 12377.713337 | 12396.274549 | 12494.836120 | 12140.099652 | 10778.102600 |
100 | NaN | Non-OECD | MtCO2 | 3488.427811 | 3691.718662 | 3784.680685 | 3918.715598 | 4177.004277 | 4500.258222 | 4828.298706 | ... | 19314.910883 | 19836.128418 | 20303.911495 | 20586.744430 | 20733.071422 | 20984.166754 | 21330.593631 | 21856.262837 | 22216.512143 | 21540.542254 |
101 | NaN | European Union # | MtCO2 | 2616.385867 | 2663.180253 | 2742.047705 | 2923.833370 | 3148.076867 | 3332.435674 | 3409.412017 | ... | 3300.253085 | 3218.738884 | 3147.046820 | 2982.410543 | 3045.619428 | 3077.175821 | 3115.282824 | 3070.528755 | 2936.705787 | 2550.939074 |
102 rows × 59 columns
INFO:
NOTES:
print( ">>>>>>> CDIAC <<<<<<<<")
#download file
link_cdiac = "http://cdiac.ess-dive.lbl.gov/ftp/ndp030/CSV-FILES/nation.1751_2014.csv"
resp = requests.get(link_cdiac)
output = open('../raw_data/CDIAC/nation.1751_2014.csv',"wb")
output.write(resp.content)
output.close()
#read file
df_cdiac = pd.read_csv("../raw_data/CDIAC/nation.1751_2014.csv")
# #clean
df_cdiac = df_cdiac.drop([0,1,2])
df_cdiac = df_cdiac.rename(columns = {"Nation":"name"})
df_cdiac = df_cdiac.replace(".",np.nan) #remove empty cells marked with . to NaN or 0? NaN
df_cdiac.iloc[:,2:] = df_cdiac.iloc[:,2:].astype(float, errors="raise")
#CONVERT Carbon units to Carbon Dioxide units
convert_C_to_CO2=44/12
df_cdiac.iloc[:,2:] = df_cdiac.iloc[:,2:] * convert_C_to_CO2 / 1000
#convert years to integers?
df_cdiac.Year = df_cdiac.Year.astype(int)
#add unit (and change column name, unit multiplied by 1000)
df_cdiac.insert(2, "unit", "MtCO2")
df_cdiac = df_cdiac.rename(columns={"Total CO2 emissions from fossil-fuels and cement production (thousand metric tons of C)":"Total CO2 emissions from fossil-fuels and cement production (MtCO2)",
"Per capita CO2 emissions (metric tons of carbon)": "Per capita CO2 emissions (tCO2)"})
# reindex
df_cdiac = df_cdiac.reset_index().drop("index", axis=1)
# CODE_GENERATOR_ISO3
df_cdiac.insert(0, "code", np.nan)
CODE_GENERATOR_ISO3(df_cdiac, show_missing=True,show_duplicates_nan=True)
# global
df_cdiac_global = df_cdiac.groupby(["Year","unit"]).sum().reset_index()
df_cdiac_global.to_csv("../clean_data/global_subsets/CDIAC_global.csv")
#------------------------------------------------------------------------------------------------------------------------------
df_cdiac.to_csv("../clean_data/CDIAC.csv", index=None)
df_cdiac = pd.read_csv("../clean_data/CDIAC.csv")
df_cdiac_global
>>>>>>> CDIAC <<<<<<<< >>> RHODESIA-NYASALAND made a list of (['MWI', 'ZWE']) but is now MWI (first item) >>> ST. KITTS-NEVIS-ANGUILLA made a list of (['AIA', 'KNA']) but is now AIA (first item) PATCH APPLIED >>> 37 Missing codes (NaN) in this df ['ANTARCTIC FISHERIES', 'CZECHOSLOVAKIA', 'DEMOCRATIC REPUBLIC OF VIETNAM', 'EAST & WEST PAKISTAN', 'FEDERAL REPUBLIC OF GERMANY', 'FEDERATION OF MALAYA-SINGAPORE', 'FORMER DEMOCRATIC YEMEN', 'FORMER GERMAN DEMOCRATIC REPUBLIC', 'FORMER PANAMA CANAL ZONE', 'FORMER YEMEN', 'FRENCH EQUATORIAL AFRICA', 'FRENCH INDO-CHINA', 'FRENCH WEST AFRICA', 'GUADELOUPE', 'JAPAN (EXCLUDING THE RUYUKU ISLANDS)', 'KUWAITI OIL FIRES', 'LEEWARD ISLANDS', 'MARTINIQUE', 'NETHERLAND ANTILLES', 'NETHERLAND ANTILLES AND ARUBA', 'PACIFIC ISLANDS (PALAU)', 'PENINSULAR MALAYSIA', 'REPUBLIC OF SOUTH VIETNAM', 'REPUBLIC OF SUDAN', 'RHODESIA-NYASALAND', 'RWANDA-URUNDI', 'RYUKYU ISLANDS', 'SABAH', 'SARAWAK', 'ST. KITTS-NEVIS-ANGUILLA', 'SUDAN', 'TANGANYIKA', 'UNITED KOREA', 'USSR', 'YUGOSLAVIA (FORMER SOCIALIST FEDERAL REPUBLIC)', 'YUGOSLAVIA (MONTENEGRO & SERBIA)', 'ZANZIBAR'] >>> 0 DUPLICATES:
code | name |
---|
Year | unit | Total CO2 emissions from fossil-fuels and cement production (MtCO2) | Emissions from solid fuel consumption | Emissions from liquid fuel consumption | Emissions from gas fuel consumption | Emissions from cement production | Emissions from gas flaring | Per capita CO2 emissions (tCO2) | Emissions from bunker fuels (not included in the totals) | |
---|---|---|---|---|---|---|---|---|---|---|
0 | 1751 | MtCO2 | 9.357333 | 9.357333 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
1 | 1752 | MtCO2 | 9.361000 | 9.361000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
2 | 1753 | MtCO2 | 9.361000 | 9.361000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
3 | 1754 | MtCO2 | 9.364667 | 9.364667 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
4 | 1755 | MtCO2 | 9.368333 | 9.368333 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
259 | 2010 | MtCO2 | 31905.100333 | 13772.781000 | 9991.120333 | 6260.899333 | 1634.831000 | 245.476000 | 1.119653 | 1118.076667 |
260 | 2011 | MtCO2 | 33156.196333 | 14752.862667 | 9956.906667 | 6400.636000 | 1810.537667 | 235.227667 | 1.110047 | 1145.705000 |
261 | 2012 | MtCO2 | 33767.770667 | 14974.094667 | 10144.544667 | 6507.746667 | 1904.147667 | 237.233333 | 1.141360 | 1091.170667 |
262 | 2013 | MtCO2 | 33860.959000 | 14777.051667 | 10184.390333 | 6619.715667 | 2029.511000 | 250.261000 | 1.124420 | 1120.885333 |
263 | 2014 | MtCO2 | 34122.733333 | 14778.734667 | 10385.591333 | 6628.457000 | 2082.091000 | 247.859333 | 1.129700 | 1130.917333 |
264 rows × 10 columns
INFO:
print( ">>>>>>> EIA <<<<<<<<")
#read file
df_eia = pd.read_csv("../raw_data/EIA/EIA_raw_processed_all.csv")
#clean
df_eia = df_eia.drop("Unnamed: 0", axis=1)
df_eia = df_eia.rename({'Unnamed: 1': 'name', 'Unnamed: 2': 'fuel'}, axis=1) # new method
df_eia = df_eia.replace("--",np.nan)
df_eia.loc[:,"1949":] = df_eia.loc[:,"1949":].astype(float)
df_eia.loc[:,"1949":].columns = df_eia.loc[:,"1949":].columns.astype(int)
df_eia.fuel = df_eia.fuel.replace({" CO2 emissions (MMtonnes CO2)":"Total CO2 emissions",
" Coal and coke (MMtonnes CO2)":"Coal and coke",
" Consumed natural gas (MMtonnes CO2)": "Consumed natural gas",
" Petroleum and other liquids (MMtonnes CO2)": "Petroleum and other liquids"})
#delete all empty unnecesary rows
for x in range(231):
equation1 = 6*x
equation2 = 6*x+1
df_eia = df_eia.drop([equation1])
df_eia = df_eia.drop([equation2])
# #add unit
df_eia.insert(2, "unit", "MtCO2")
# #reindex
df_eia = df_eia.reset_index().drop("index", axis=1)
# CODE_GENERATOR_ISO3
df_eia.insert(0, "code", np.nan)
CODE_GENERATOR_ISO3(df_eia, show_missing=True,show_duplicates_nan=True)
# #global?
year = 2018
df_eia_global = df_eia[df_eia.name=="World"]
df_eia_global.to_csv("../clean_data/global_subsets/EIA_global.csv")
#------------------------------------------------------------------------------------------------------------------------------
df_eia.to_csv("../clean_data/EIA.csv", index=None)
df_eia = pd.read_csv("../clean_data/EIA.csv")
df_eia_global
>>>>>>> EIA <<<<<<<< PATCH APPLIED >>> 13 Missing codes (NaN) in this df ['World', 'Former Czechoslovakia', 'Former Serbia and Montenegro', 'Former U.S.S.R.', 'Former Yugoslavia', 'Germany, East', 'Germany, West', 'Hawaiian Trade Zone', 'Kosovo', 'Micronesia', 'Netherlands Antilles', 'U.S. Pacific Islands', 'U.S. Territories'] >>> 0 DUPLICATES:
code | name |
---|
code | name | fuel | unit | 1949 | 1950 | 1951 | 1952 | 1953 | 1954 | ... | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | NaN | World | Total CO2 emissions | MtCO2 | 2206.690829 | 2382.046176 | 2526.687327 | 2473.373964 | 2536.892888 | 2422.252560 | ... | 33652.592480 | 34786.404150 | 35607.55121 | 36013.90151 | 36046.193730 | 35855.855040 | 35278.502260 | 36072.334790 | 36903.369940 | NaN |
1 | NaN | World | Coal and coke | MtCO2 | 1117.538985 | 1151.600346 | 1166.858975 | 1052.386511 | 1057.069608 | 904.191799 | ... | 15035.601790 | 15850.888010 | 16325.70577 | 16505.52204 | 16392.746650 | 15844.416120 | 14956.562250 | 15365.386070 | 15783.059180 | 15953.29648 |
2 | NaN | World | Consumed natural gas | MtCO2 | 269.537229 | 312.826758 | 369.725933 | 396.144693 | 414.852766 | 437.259069 | ... | 6323.923968 | 6512.988366 | 6662.05692 | 6760.09430 | 6785.637172 | 6887.367588 | 7027.731893 | 7259.328867 | 7626.683356 | NaN |
3 | NaN | World | Petroleum and other liquids | MtCO2 | 819.614615 | 917.619072 | 990.102419 | 1024.842760 | 1064.970514 | 1080.801692 | ... | 12293.066720 | 12422.527770 | 12619.78852 | 12748.28517 | 12867.809900 | 13124.071330 | 13294.208110 | 13447.619860 | 13493.627400 | NaN |
4 rows × 75 columns
INFO:
NOTES:
print( ">>>>>>> IEA <<<<<<<<")
# read excel file
excel_df_dict = pd.read_excel("../raw_data/IEA/GHGHighlights.XLS",sheet_name=None,header=3)
# Cleaning 1: of the whole excel (wow), making individual dataframes
sheet_list = excel_df_dict.keys()
sheet_list = ['CONTENTS', 'GHG Energy', 'GHG Fugi', 'GHG FC', 'GHG FC - Coal', 'GHG FC - Oil',
'GHG FC - Gas', 'CO2 MARBUNK', 'CO2 AVBUNK', 'SECTOR', 'SECTOREH', 'CO2-TES',
'CO2-GDP', 'CO2-GDP PPP', 'CO2-POP', 'SECTPOP', 'KAYA', 'TIMEEXTENDED', 'GEO COVERAGE']
country_list = excel_df_dict["GHG FC"][20:189]["million tonnes of CO2 eq"].array
sheet_list_24line = ['GHG FC', 'GHG FC - Coal', 'GHG FC - Oil', 'GHG FC - Gas']
sheet_list_22line = ['GHG Energy', 'GHG Fugi','CO2 MARBUNK', 'CO2 AVBUNK', 'SECTOR', 'SECTOREH']
df_list24 = ["total_fuel_combustion", "coal", "oil", "gas"]
df_list22 = ["energy", "fugitive", "marine_bunkers", "int_aviation_bunkers", "sectors_2019", "sectors_alt_2019"]
for i,j in enumerate(sheet_list_24line):
# print(j)
df = excel_df_dict[j]
df = df.drop([0,2,10,12,17,18,19])
df = df.iloc[:-3,:]
df.insert(1,"type", "{}".format(df_list24[i]), allow_duplicates=True)
df.insert(2,"unit","MtCO2eq", allow_duplicates=True)
df = df.replace("..",np.nan)
df = df.replace("x",np.nan)
df.iloc[:,3:] = df.iloc[:,3:].astype(float)
df = df.rename(columns={"million tonnes of CO2 eq":"name"})
df = df[df.name.notna()]
df = df.reset_index().drop("index", axis=1)
df.to_csv("../raw_data/IEA/Cleaning_products/IEA_{}.csv".format(df_list24[i]))
# display(df)
for i,j in enumerate(sheet_list_22line):
# print(j)
df = excel_df_dict[j]
df = df.drop([0,2,10,12,15,16,17,])
df = df.iloc[:-3,:]
df.insert(1,"type", "{}".format(df_list22[i]), allow_duplicates=True)
df.insert(2,"unit","MtCO2eq", allow_duplicates=True)
df = df.replace("..",np.nan)
df = df.replace("x",np.nan)
df.iloc[:,3:] = df.iloc[:,3:].astype(float)
df = df.rename(columns={"kilo tonnes of CO2 eq":"name"}) #for fugitive!
df = df.rename(columns={"million tonnes of CO2":"name"}) #for bunkers
df = df.rename(columns={"million tonnes of CO2 eq":"name"})
df = df[df.name.notna()]
df = df.reset_index().drop("index", axis=1)
df.to_csv("../raw_data/IEA/Cleaning_products/IEA_{}.csv".format(df_list22[i]))
# display(df)
#Cleaning 2: Stitch everything together!
df_iea = pd.concat([
pd.read_csv("../raw_data/IEA/Cleaning_products/IEA_energy.csv"),
pd.read_csv("../raw_data/IEA/Cleaning_products/IEA_total_fuel_combustion.csv"),
pd.read_csv("../raw_data/IEA/Cleaning_products/IEA_coal.csv"),
pd.read_csv("../raw_data/IEA/Cleaning_products/IEA_oil.csv"),
pd.read_csv("../raw_data/IEA/Cleaning_products/IEA_gas.csv"),
pd.read_csv("../raw_data/IEA/Cleaning_products/IEA_marine_bunkers.csv"), #should i keep these last three?
pd.read_csv("../raw_data/IEA/Cleaning_products/IEA_int_aviation_bunkers.csv"), #should i keep these last three?
pd.read_csv("../raw_data/IEA/Cleaning_products/IEA_fugitive.csv"), #should i keep these last three?
])
#Cleaning 3: reset index and fix fugitive (/1000)
df_iea = df_iea.drop("Unnamed: 0", axis=1).reset_index().drop("index", axis=1)
df_iea.loc[df_iea.type=="fugitive", "1971":] = df_iea.loc[df_iea.type=="fugitive", "1971":]/1000
# CODE_GENERATOR_ISO3
df_iea.insert(0, "code", np.nan)
CODE_GENERATOR_ISO3(df_iea, show_missing=True,show_duplicates_nan=True)
#global
df_iea_global = df_iea[df_iea.name=='World']
df_iea_global.to_csv("../clean_data/global_subsets/IEA_global.csv")
#------------------------------------------------------------------------------------------------------------------------------
df_iea.to_csv("../clean_data/IEA.csv", index=None)
df_iea = pd.read_csv("../clean_data/IEA.csv")
df_iea_global
>>>>>>> IEA <<<<<<<< >>> China (incl. Hong Kong, China) made a list of (['CHN', 'HKG']) but is now CHN (first item) PATCH APPLIED >>> 38 Missing codes (NaN) in this df ['World', 'Annex I Parties ', ' Annex II Parties ', ' North America ', ' Europe ', ' Asia Oceania', ' Annex I EIT ', 'Non-Annex I Parties ', 'Annex B Kyoto Parties', 'OECD Total', 'Non-OECD Total', 'OECD Americas', 'Korea', 'OECD Asia Oceania', 'OECD Europe', 'Kosovo', 'Former Soviet Union (if no detail)', 'Former Yugoslavia (if no detail)', 'Non-OECD Europe and Eurasia', 'Other Africa', 'Africa', 'Other Asia', 'Asia (excl. China)', 'China (incl. Hong Kong, China)', 'Other Non-OECD Americas', 'Non-OECD Americas', 'Middle East', 'IEA/Accession/Association', 'European Union - 27', 'G7', 'G8', 'G20', 'Americas', 'Asia', 'Europe', 'Oceania', 'International marine bunkers', 'International aviation bunkers'] >>> 0 DUPLICATES:
code | name |
---|
code | name | type | unit | 1971 | 1972 | 1973 | 1974 | 1975 | 1976 | ... | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | NaN | World | energy | MtCO2eq | 16120.824870 | 16840.204900 | 17770.991000 | 17722.368480 | 17767.906110 | 18761.579770 | ... | 35147.956940 | 35556.986810 | 36178.556740 | 36273.153730 | 36167.304110 | 36182.452960 | 36771.194710 | 37596.786600 | 37629.669290 | NaN |
181 | NaN | World | total_fuel_combustion | MtCO2eq | 14284.288490 | 14967.536480 | 15815.826960 | 15786.664360 | 15852.705180 | 16687.245120 | ... | 32034.068320 | 32384.933210 | 33000.345100 | 33060.846480 | 32962.678230 | 32986.788250 | 33534.777510 | 34248.773210 | 34233.900780 | NaN |
364 | NaN | World | coal | MtCO2eq | 5297.244895 | 5372.974981 | 5573.858828 | 5603.169177 | 5717.346241 | 5947.152890 | ... | 14685.682110 | 14817.326130 | 15167.459170 | 15193.048480 | 14718.631140 | 14474.800500 | 14717.956060 | 15082.576110 | 14891.538660 | NaN |
547 | NaN | World | oil | MtCO2eq | 6764.667370 | 7258.563584 | 7823.021307 | 7705.750431 | 7689.601665 | 8169.091201 | ... | 10732.520300 | 10828.834550 | 11000.550020 | 11038.235710 | 11311.187550 | 11379.850440 | 11523.130170 | 11540.009240 | 11541.869960 | NaN |
730 | NaN | World | gas | MtCO2eq | 2048.463652 | 2158.087347 | 2237.853518 | 2290.357773 | 2253.834552 | 2375.721994 | ... | 6146.725564 | 6263.048097 | 6349.277867 | 6342.486626 | 6447.273823 | 6625.569660 | 6775.139570 | 7098.052606 | 7267.203157 | NaN |
913 | NaN | World | marine_bunkers | MtCO2eq | 354.049019 | 370.588804 | 391.541383 | 365.896176 | 341.212700 | 351.851792 | ... | 662.463463 | 616.365331 | 614.885911 | 635.568373 | 659.841647 | 678.202256 | 701.873875 | 700.428265 | 684.843273 | 0.000000 |
1094 | NaN | World | int_aviation_bunkers | MtCO2eq | 169.300994 | 179.147485 | 186.892094 | 179.793336 | 173.866021 | 174.273795 | ... | 472.112407 | 472.721116 | 480.909125 | 496.627698 | 524.654032 | 548.207782 | 582.230489 | 609.630469 | 617.763240 | 0.000000 |
1275 | NaN | World | fugitive | MtCO2eq | 1836.536378 | 1872.668420 | 1955.164040 | 1935.704120 | 1915.200933 | 2074.334648 | ... | 3113.888620 | 3172.053598 | 3178.211640 | 3212.307248 | 3204.625880 | 3195.664710 | 3236.417200 | 3348.013385 | 3395.768510 | 3201.134588 |
8 rows × 54 columns
INFO:
NOTES:
print( ">>>>>>> GCP <<<<<<<<")
#read data
df_gcp = pd.read_excel("../raw_data/GCP/National_Carbon_Emissions_2021v0.4.xlsx",1, header=11, index_col=0)
df_gcp = df_gcp.T
#clean
df_gcp = df_gcp.reset_index().rename({"index":"name"}, axis=1)
#delete "statistical difference" row
df_gcp = df_gcp.drop(df_gcp.loc[df_gcp.name=="Statistical Difference"].index[0])
# convert from MtC to MtCO2
df_gcp.loc[:,1959:] = df_gcp.loc[:,1959:]*44/12
#add unit
df_gcp.insert(1, "unit", "MtCO2")
# CODE_GENERATOR_ISO3
df_gcp.insert(0, "code", np.nan)
CODE_GENERATOR_ISO3(df_gcp, show_missing=True,show_duplicates_nan=True)
#global
df_gcp_global = df_gcp[df_gcp.name=="World"]
df_gcp_global.to_csv("../clean_data/global_subsets/GCP_global.csv")
#------------------------------------------------------------------------------------------------------------------------------
df_gcp.to_csv("../clean_data/GCP.csv", index=None)
df_gcp = pd.read_csv("../clean_data/GCP.csv")
df_gcp_global
>>>>>>> GCP <<<<<<<< PATCH APPLIED >>> 16 Missing codes (NaN) in this df ['Kosovo', 'KP Annex B', 'Non KP Annex B', 'OECD', 'Non-OECD', 'EU27', 'Africa', 'Asia', 'Central America', 'Europe', 'Middle East', 'North America', 'Oceania', 'South America', 'Bunkers', 'World'] >>> 0 DUPLICATES:
code | name |
---|
code | name | unit | 1959 | 1960 | 1961 | 1962 | 1963 | 1964 | 1965 | ... | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
234 | NaN | World | MtCO2 | 8862.634223 | 9393.7655 | 9424.316685 | 9757.152775 | 10279.000031 | 10836.738839 | 11325.469625 | ... | 34493.537255 | 34999.529349 | 35308.704501 | 35560.304992 | 35522.240696 | 35478.26167 | 35951.884512 | 36672.81092 | 36729.215078 | 34832.591912 |
1 rows × 65 columns
INFO:
NOTES:
print( ">>>>>>> GCP (consumption) <<<<<<<<")
#read data
df_gcp_cons = pd.read_excel("../raw_data/GCP/National_Carbon_Emissions_2021v0.4.xlsx",2, header=8, index_col=0)
df_gcp_cons = df_gcp_cons.T
#clean
df_gcp_cons = df_gcp_cons.reset_index().rename({"index":"name"}, axis=1)
#delete "statistical difference" row
df_gcp_cons = df_gcp_cons.drop(df_gcp_cons.loc[df_gcp_cons.name=="Statistical Difference"].index[0])
# convert from MtC to MtCO2
df_gcp_cons.loc[:,1959:] = df_gcp_cons.loc[:,1959:]*44/12
#add unit
df_gcp_cons.insert(1, "unit", "MtCO2")
# CODE_GENERATOR_ISO3
df_gcp_cons.insert(0, "code", np.nan)
CODE_GENERATOR_ISO3(df_gcp_cons, show_missing=True,show_duplicates_nan=True)
#-----------------------------------------------------------------------------------------------------------------------------
#global
df_gcp_cons_global = df_gcp_cons[df_gcp_cons.name=="World"]
df_gcp_cons_global.to_csv("../clean_data/global_subsets/GCP_consumption_global.csv")
#------------------------------------------------------------------------------------------------------------------------------
df_gcp_cons.to_csv("../clean_data/GCP_consumption.csv", index=None)
df_gcp_cons = pd.read_csv("../clean_data/GCP_consumption.csv")
df_gcp_cons
>>>>>>> GCP (consumption) <<<<<<<< PATCH APPLIED >>> 16 Missing codes (NaN) in this df ['Kosovo', 'KP Annex B', 'Non KP Annex B', 'OECD', 'Non-OECD', 'EU27', 'Africa', 'Asia', 'Central America', 'Europe', 'Middle East', 'North America', 'Oceania', 'South America', 'Bunkers', 'World'] >>> 0 DUPLICATES:
code | name |
---|
code | name | unit | 1959 | 1960 | 1961 | 1962 | 1963 | 1964 | 1965 | ... | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | AFG | Afghanistan | MtCO2 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
1 | ALB | Albania | MtCO2 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 6.317225 | 5.995611 | 5.997769 | 6.506155 | 5.645737 | 5.505851 | 6.162990 | 5.810145 | 5.855507 | NaN |
2 | DZA | Algeria | MtCO2 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
3 | AND | Andorra | MtCO2 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
4 | AGO | Angola | MtCO2 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
229 | NaN | North America | MtCO2 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 7027.401592 | 6865.431285 | 6951.050600 | 6955.268498 | 6794.297883 | 6637.812920 | 6605.988370 | 6788.294674 | 6661.377993 | NaN |
230 | NaN | Oceania | MtCO2 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 431.361483 | 450.070196 | 426.921526 | 436.005669 | 436.373495 | 427.209621 | 427.051713 | 418.082814 | 415.848778 | NaN |
231 | NaN | South America | MtCO2 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 1159.746666 | 1235.436210 | 1295.095385 | 1319.141711 | 1289.884130 | 1240.998543 | 1239.521487 | 1174.705417 | 1140.566653 | NaN |
232 | NaN | Bunkers | MtCO2 | 213.524667 | 238.292845 | 263.440589 | 273.141804 | 285.944088 | 313.257358 | 323.585427 | ... | 1112.771299 | 1102.538967 | 1118.908835 | 1129.513497 | 1172.553899 | 1186.190229 | 1232.700679 | 1266.183769 | 1258.682474 | 1004.963396 |
233 | NaN | World | MtCO2 | 8862.634223 | 9393.765500 | 9424.316685 | 9757.152775 | 10279.000031 | 10836.738839 | 11325.469625 | ... | 34493.537255 | 34999.529349 | 35308.704501 | 35560.304992 | 35522.240696 | 35478.261670 | 35951.884512 | 36672.810920 | 36729.215078 | 34832.591912 |
234 rows × 65 columns
INFO:
print( ">>>>>>> EPA <<<<<<")
# read file
df_epa = pd.read_excel("../raw_data/EPA/proj-data-annex_Sept2019.xlsx", 0)
#clean
df_epa = df_epa.rename(columns={"country":"name"})
df_epa = df_epa.drop("unit",axis=1)
df_epa.insert(1,"unit","MtCO2eq")
df_epa.value.sum()
# CODE_GENERATOR_ISO3
df_epa.insert(0, "code", np.nan)
CODE_GENERATOR_ISO3(df_epa, show_missing=True,show_duplicates_nan=True)
# #global - should I add it to the total dataframe?
df_epa_global = df_epa.groupby(["year", "unit","sector","source","subsource","gas"],dropna=False).sum().reset_index()
df_epa_global.insert(0, "name","World")
df_epa_global.insert(0, "code",np.nan)
df_epa_global.to_csv("../clean_data/global_subsets/EPA_global.csv")
print("\nGlobal non-CO2 emissions (2015): " + str(df_epa_global.groupby("year").sum().loc[2015,"value"]) + " MtCO2eq")
#------------------------------------------------------------------------------------------------------------------------------
df_epa.to_csv("../clean_data/EPA.csv", index=None)
df_epa = pd.read_csv("../clean_data/EPA.csv")
df_epa
>>>>>>> EPA <<<<<< PATCH APPLIED >>> 1 Missing codes (NaN) in this df ['Kosovo'] >>> 0 DUPLICATES:
code | name |
---|
Global non-CO2 emissions (2015): 12009.955309238629 MtCO2eq
code | name | unit | sector | source | subsource | gas | year | value | |
---|---|---|---|---|---|---|---|---|---|
0 | AFG | Afghanistan | MtCO2eq | Agriculture | AgSoils | NaN | N2O | 1990 | 4.433600 |
1 | AFG | Afghanistan | MtCO2eq | Agriculture | AgSoils | NaN | N2O | 1991 | 4.411430 |
2 | AFG | Afghanistan | MtCO2eq | Agriculture | AgSoils | NaN | N2O | 1992 | 4.389261 |
3 | AFG | Afghanistan | MtCO2eq | Agriculture | AgSoils | NaN | N2O | 1993 | 4.367091 |
4 | AFG | Afghanistan | MtCO2eq | Agriculture | AgSoils | NaN | N2O | 1994 | 4.344921 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
618535 | ZWE | Zimbabwe | MtCO2eq | Waste | Wastewater | Urban | N2O | 2048 | 0.122433 |
618536 | ZWE | Zimbabwe | MtCO2eq | Waste | Wastewater | Rural | N2O | 2049 | 0.165373 |
618537 | ZWE | Zimbabwe | MtCO2eq | Waste | Wastewater | Urban | N2O | 2049 | 0.125733 |
618538 | ZWE | Zimbabwe | MtCO2eq | Waste | Wastewater | Rural | N2O | 2050 | 0.166076 |
618539 | ZWE | Zimbabwe | MtCO2eq | Waste | Wastewater | Urban | N2O | 2050 | 0.129034 |
618540 rows × 9 columns
INFO:
print( ">>>>>>> FAOSTAT <<<<<<<<")
df_fao = pd.read_csv("../raw_data/FAO/Emissions_Totals_E_All_Data_NOFLAG.csv", encoding='latin-1')
#clean
df_fao.columns = df_fao.columns.str.replace("Y","")
df_fao = df_fao.rename(columns={"Area":"name"})
# some info
# print("Items ", df.groupby("Item").count().index)
# print("Elements ",df.groupby("Element").count().index)
# print("Source ",df.groupby("Source").count().index)
# print("Area ",df.groupby("Area").count().index)
# CODE_GENERATOR_ISO3
df_fao.insert(0, "code", np.nan)
CODE_GENERATOR_ISO3(df_fao, show_missing=True,show_duplicates_nan=True)
#global
df_fao_global = df_fao[df_fao.name=="World"]
df_fao_global.to_csv("../clean_data/global_subsets/FAO_global.csv")
#------------------------------------------------------------------------------------------------------------------------------
df_fao.to_csv("../clean_data/FAO.csv", index=None)
df_fao = pd.read_csv("../clean_data/FAO.csv", low_memory=False)
df_fao_global
>>>>>>> FAOSTAT <<<<<<<< >>> Australia and New Zealand made a list of (['AUS', 'NZL']) but is now AUS (first item) PATCH APPLIED >>> 49 Missing codes (NaN) in this df ['Belgium-Luxembourg', 'Channel Islands', 'China, mainland', 'Czechoslovakia', 'Ethiopia PDR', 'French Guyana', 'Netherlands Antilles (former)', 'Pacific Islands Trust Territory', 'Serbia and Montenegro', 'Sudan (former)', 'USSR', 'Yugoslav SFR', 'World', 'Africa', 'Eastern Africa', 'Middle Africa', 'Northern Africa', 'Southern Africa', 'Western Africa', 'Americas', 'Northern America', 'Central America', 'Caribbean', 'South America', 'Asia', 'Central Asia', 'Eastern Asia', 'Southern Asia', 'South-eastern Asia', 'Western Asia', 'Europe', 'Eastern Europe', 'Northern Europe', 'Southern Europe', 'Western Europe', 'Oceania', 'Australia and New Zealand', 'Melanesia', 'Micronesia', 'Polynesia', 'European Union (27)', 'Least Developed Countries', 'Land Locked Developing Countries', 'Small Island Developing States', 'Low Income Food Deficit Countries', 'Net Food Importing Developing Countries', 'Annex I countries', 'Non-Annex I countries', 'OECD'] >>> 0 DUPLICATES:
code | name |
---|
code | Area Code | name | Item Code | Item | Element Code | Element | Source Code | Source | Unit | ... | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 | 2030 | 2050 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
31181 | NaN | 5000 | World | 5058 | Enteric Fermentation | 7225 | Emissions (CH4) | 3050 | FAO TIER 1 | kilotonnes | ... | 9.636930e+04 | 9.694660e+04 | 9.780942e+04 | 9.885354e+04 | 9.929557e+04 | 9.991874e+04 | 1.008342e+05 | NaN | 1.099799e+05 | 1.221324e+05 |
31182 | NaN | 5000 | World | 5058 | Enteric Fermentation | 7225 | Emissions (CH4) | 3051 | UNFCCC | kilotonnes | ... | 4.157057e+04 | 3.733662e+04 | 5.237421e+04 | 2.679629e+04 | 2.561591e+04 | 2.401546e+04 | 2.350277e+04 | NaN | NaN | NaN |
31183 | NaN | 5000 | World | 5058 | Enteric Fermentation | 724413 | Emissions (CO2eq) from CH4 (AR5) | 3050 | FAO TIER 1 | kilotonnes | ... | 2.698340e+06 | 2.714505e+06 | 2.738664e+06 | 2.767899e+06 | 2.780276e+06 | 2.797725e+06 | 2.823357e+06 | NaN | 3.079436e+06 | 3.419707e+06 |
31184 | NaN | 5000 | World | 5058 | Enteric Fermentation | 724413 | Emissions (CO2eq) from CH4 (AR5) | 3051 | UNFCCC | kilotonnes | ... | 1.162406e+06 | 1.043867e+06 | 1.460253e+06 | 7.263198e+05 | 6.854012e+05 | 6.724328e+05 | 6.580774e+05 | NaN | NaN | NaN |
31185 | NaN | 5000 | World | 5058 | Enteric Fermentation | 723113 | Emissions (CO2eq) (AR5) | 3050 | FAO TIER 1 | kilotonnes | ... | 2.698340e+06 | 2.714505e+06 | 2.738664e+06 | 2.767899e+06 | 2.780276e+06 | 2.797725e+06 | 2.823357e+06 | NaN | 3.079436e+06 | 3.419707e+06 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
31360 | NaN | 5000 | World | 6516 | Land Use change | 7230 | Emissions (N2O) | 3050 | FAO TIER 1 | kilotonnes | ... | 1.765170e+02 | 1.813805e+02 | 2.143186e+02 | 2.101779e+02 | 2.088701e+02 | 1.639089e+02 | 1.970951e+02 | NaN | NaN | NaN |
31361 | NaN | 5000 | World | 6516 | Land Use change | 7273 | Emissions (CO2) | 3050 | FAO TIER 1 | kilotonnes | ... | 3.451477e+06 | 3.712828e+06 | 3.736617e+06 | 2.999283e+06 | 2.964426e+06 | 3.107372e+06 | 3.275819e+06 | NaN | NaN | NaN |
31362 | NaN | 5000 | World | 6516 | Land Use change | 724413 | Emissions (CO2eq) from CH4 (AR5) | 3050 | FAO TIER 1 | kilotonnes | ... | 1.057169e+05 | 1.973947e+05 | 2.157555e+05 | 8.317803e+04 | 7.074291e+04 | 1.063424e+05 | 1.746944e+05 | NaN | NaN | NaN |
31363 | NaN | 5000 | World | 6516 | Land Use change | 724313 | Emissions (CO2eq) from N2O (AR5) | 3050 | FAO TIER 1 | kilotonnes | ... | 4.677701e+04 | 4.806582e+04 | 5.679442e+04 | 5.569714e+04 | 5.535058e+04 | 4.343587e+04 | 5.223020e+04 | NaN | NaN | NaN |
31364 | NaN | 5000 | World | 6516 | Land Use change | 723113 | Emissions (CO2eq) (AR5) | 3050 | FAO TIER 1 | kilotonnes | ... | 3.603971e+06 | 3.958288e+06 | 4.009167e+06 | 3.138158e+06 | 3.090520e+06 | 3.257151e+06 | 3.502744e+06 | NaN | NaN | NaN |
184 rows × 72 columns
INFO:
print( ">>>>>>> CAIT <<<<<<<<")
df_cait = pd.read_csv("../raw_data/CAIT/CW_HistoricalEmissions_CAIT.csv")
#clean
df_cait = df_cait.rename(columns={"Country":"code"})
# COUNTRY_DICT NOT NECESSARY HERE, INVERSE (creating names out of ISO codes)
df_cait.insert(1, "name", np.nan)
COUNTRY_NAME_GENERATOR_FROM_ISO3(df_cait, show_missing = True, show_duplicates_nan = True)
#Global (included in dataset!)
df_cait_global = df_cait[df_cait.code=="WORLD"]
df_cait_global.to_csv("../clean_data/global_subsets/CAIT_global.csv")
#------------------------------------------------------------------------------------------------------------------------------
df_cait.to_csv("../clean_data/CAIT.csv", index=None)
df_cait = pd.read_csv("../clean_data/CAIT.csv")
df_cait
>>>>>>> CAIT <<<<<<<< MISSING names: ['EUU', 'WORLD'] >>> 0 DUPLICATES:
code | name | Source | Sector | Gas | 1990 | 1991 | 1992 | 1993 | 1994 | ... | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | AFG | Afghanistan | CAIT | Total excluding LUCF | All GHG | 15.182846 | 15.102011 | 13.634687 | 13.463994 | 13.271733 | ... | 36.955463 | 44.906165 | 58.651861 | 66.749281 | 74.796108 | 84.619232 | 93.728621 | 95.372841 | 97.300113 | 98.920758 |
1 | AFG | Afghanistan | CAIT | Total including LUCF | All GHG | 12.794043 | 12.713208 | 11.245884 | 11.075191 | 10.882930 | ... | 37.077364 | 45.028066 | 58.405642 | 66.503062 | 74.549889 | 84.373013 | 93.482402 | 95.527498 | 97.454770 | 99.075415 |
2 | AFG | Afghanistan | CAIT | Energy | All GHG | 5.829497 | 5.334624 | 3.760858 | 3.422760 | 3.102594 | ... | 20.140819 | 26.056618 | 39.529423 | 47.592229 | 55.527034 | 64.675839 | 74.741644 | 75.932911 | 77.718177 | 79.580444 |
3 | AFG | Afghanistan | CAIT | Industrial Processes | All GHG | 0.051879 | 0.054500 | 0.060111 | 0.062722 | 0.065343 | ... | 0.222971 | 0.248895 | 0.313896 | 0.378967 | 0.449909 | 0.534630 | 0.592081 | 0.758807 | 0.911544 | 1.064280 |
4 | AFG | Afghanistan | CAIT | Agriculture | All GHG | 8.072853 | 8.396465 | 8.409491 | 8.486480 | 8.523959 | ... | 13.857023 | 15.788381 | 15.902260 | 15.777792 | 15.724861 | 16.220447 | 15.112568 | 15.315738 | 15.221951 | 14.744536 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
9077 | ZWE | Zimbabwe | CAIT | Other Fuel Combustion | N2O | 0.407812 | 0.417100 | 0.426387 | 0.435675 | 0.444962 | ... | 0.876841 | 0.901256 | 1.023247 | 1.145238 | 1.267229 | 1.389220 | 1.511210 | 1.536870 | 1.562529 | 1.588188 |
9078 | ZWE | Zimbabwe | CAIT | Fugitive Emissions | N2O | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | ... | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
9079 | ZWE | Zimbabwe | CAIT | Total excluding LUCF | F-Gas | 0.067604 | 0.064739 | 0.061875 | 0.059011 | 0.056146 | ... | 0.300719 | 0.323994 | 0.379388 | 0.434782 | 0.490176 | 0.545570 | 0.600965 | 0.665562 | 0.730160 | 0.794758 |
9080 | ZWE | Zimbabwe | CAIT | Total including LUCF | F-Gas | 0.067604 | 0.064739 | 0.061875 | 0.059011 | 0.056146 | ... | 0.300719 | 0.323994 | 0.379388 | 0.434782 | 0.490176 | 0.545570 | 0.600965 | 0.665562 | 0.730160 | 0.794758 |
9081 | ZWE | Zimbabwe | CAIT | Industrial Processes | F-Gas | 0.067604 | 0.064739 | 0.061875 | 0.059011 | 0.056146 | ... | 0.300719 | 0.323994 | 0.379388 | 0.434782 | 0.490176 | 0.545570 | 0.600965 | 0.665562 | 0.730160 | 0.794758 |
9082 rows × 34 columns
INFO:
NOTES:
print( ">>>>>>> EDGAR <<<<<<<<")
"""
IMPORTANT: EDGAR website separates fossil CO2 emissions and biomass (organic) emissions. In their main methodology paper they exclude biomass emissions
(and other LULUCF emission), so we will do the same here too. To reiterate - biomass (organic) emissions are OMITTED here, but has its own dataset ("df_co2_org").
"""
#separate co2-organic (biomass) dataset (excluded from main dataset)
df_co2_org = pd.read_excel("../raw_data/" + "EDGAR/v60_CO2_org_short-cycle_C_1970_2018.xls", 0, header=9)
df_co2_org.insert(7,"gas","CO2 (bio)")
#read MAIN data
df_co2_fossil = pd.read_excel("../raw_data/" + "EDGAR/v60_CO2_excl_short-cycle_org_C_1970_2018.xls", 0, header=9)
df_co2_fossil.insert(7,"gas","CO2 (fossil)")
df_ch4 = pd.read_excel("../raw_data/" + "EDGAR/v60_CH4_1970_2018.xls", 0, header=9)
df_ch4.insert(7,"gas","CH4")
df_n2o = pd.read_excel("../raw_data/" + "EDGAR/v60_N2O_1970_2018.xls", 0, header=9)
df_n2o.insert(7,"gas","N2O")
#convert into one dataframe
df_edgar = pd.concat([df_ch4, df_co2_fossil, df_n2o]).reset_index().drop("index", axis=1)
#clean
df_edgar = df_edgar.rename(columns={"Name":"name"})
df_edgar = df_edgar.rename(columns={"Country_code_A3":"code"})
df_edgar.columns = df_edgar.columns.str.replace("Y_","")
#df_edgar.ipcc_code_2006_for_standard_report = df_edgar.ipcc_code_2006_for_standard_report.str.replace(".","")
# IMPORTANT: CONVERT TO CO2equivalents!
GWP_100_ipcc_ar6 = {
"CO2 (fossil)" : 1,
"CO2 (bio)" : 1,
"CH4" : 27.9,
"N2O" : 273,
}
for i in range(len(df_edgar)):
df_edgar.loc[i,"1970":] = df_edgar.loc[i,"1970":] * GWP_100_ipcc_ar6[df_edgar.loc[i,"gas"]]
# divide by 1000 (converting gigagrams to Mt)
df_edgar.loc[:,"1970":] = df_edgar.loc[:,"1970":] / 1000
#add unit column
df_edgar.insert(8,"unit","MtCO2")
#separate co2-organic (biomass) dataset
df_co2_org = pd.read_excel("../raw_data/" + "EDGAR/v60_CO2_org_short-cycle_C_1970_2018.xls", 0, header=9)
df_co2_org.insert(7,"gas","CO2 (bio)")
#no need to make codes nor names!
# woooo! (except if I want to make codes for country groups such as "IPCC_annex" or "C_group_IM24_sh")
#global! (not inlcluded in main)
df_edgar_global = df_edgar.groupby(["IPCC_annex","ipcc_code_2006_for_standard_report", "ipcc_code_2006_for_standard_report_name","fossil_bio", "gas", "unit"]).sum().reset_index()
df_edgar_global.to_csv("../clean_data/global_subsets/EDGAR_global.csv")
print("NO DUPLICATES HERE!")
#------------------------------------------------------------------------------------------------------------------------------
df_edgar.to_csv("../clean_data/EDGAR.csv", index=None)
df_edgar = pd.read_csv("../clean_data/EDGAR.csv")
df_edgar_global
>>>>>>> EDGAR <<<<<<<< NO DUPLICATES HERE!
IPCC_annex | ipcc_code_2006_for_standard_report | ipcc_code_2006_for_standard_report_name | fossil_bio | gas | unit | 1970 | 1971 | 1972 | 1973 | ... | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
3 | Annex_I | 1.A.1.a | Main Activity Electricity and Heat Production | fossil | CO2 (fossil) | MtCO2 | 3213.815550 | 3315.826787 | 3539.201951 | 3814.658085 | ... | 5434.038774 | 5697.005749 | 5702.852710 | 5609.772496 | 5489.523630 | 5299.182358 | 5077.945208 | 4929.190635 | 4813.571128 | 4806.399160 |
8 | Annex_I | 1.A.1.bc | Petroleum Refining - Manufacture of Solid Fuel... | fossil | CO2 (fossil) | MtCO2 | 569.719736 | 570.625564 | 595.947283 | 658.840504 | ... | 681.302764 | 702.701907 | 663.974103 | 668.893197 | 674.105611 | 675.014980 | 687.200447 | 678.450499 | 675.199487 | 676.541685 |
13 | Annex_I | 1.A.2 | Manufacturing Industries and Construction | fossil | CO2 (fossil) | MtCO2 | 2972.892721 | 2706.383463 | 2762.210279 | 2907.952202 | ... | 1413.887302 | 1539.465763 | 1524.550049 | 1549.817713 | 1528.040393 | 1536.131137 | 1510.605271 | 1516.043312 | 1531.083444 | 1561.854007 |
16 | Annex_I | 1.A.3.a | Civil Aviation | fossil | CO2 (fossil) | MtCO2 | 158.235781 | 163.988506 | 168.030012 | 172.142863 | ... | 208.858129 | 211.842706 | 216.147437 | 217.164181 | 221.185512 | 224.288113 | 230.979320 | 239.397098 | 250.956904 | 254.104536 |
21 | Annex_I | 1.A.3.b_noRES | Road Transportation no resuspension | fossil | CO2 (fossil) | MtCO2 | 1412.217678 | 1489.615814 | 1599.277709 | 1704.497199 | ... | 2931.487375 | 2987.419348 | 2914.213790 | 2856.974915 | 2902.279772 | 2909.505739 | 2983.026499 | 3010.393941 | 3032.812185 | 3042.232143 |
26 | Annex_I | 1.A.3.c | Railways | fossil | CO2 (fossil) | MtCO2 | 150.252780 | 142.100079 | 145.504659 | 147.813581 | ... | 48.905404 | 51.225435 | 61.246852 | 56.406584 | 57.184112 | 57.149435 | 59.446145 | 57.974854 | 58.674508 | 59.333129 |
31 | Annex_I | 1.A.3.d | Water-borne Navigation | fossil | CO2 (fossil) | MtCO2 | 78.824287 | 83.041373 | 80.029488 | 85.261555 | ... | 59.801974 | 56.407708 | 66.789075 | 65.497970 | 62.884001 | 59.939346 | 61.688319 | 61.607988 | 61.152662 | 61.139501 |
36 | Annex_I | 1.A.3.e | Other Transportation | fossil | CO2 (fossil) | MtCO2 | 76.675159 | 77.660443 | 79.308104 | 76.992672 | ... | 122.692584 | 139.784968 | 136.271811 | 128.427155 | 134.693367 | 125.875656 | 124.428583 | 131.610487 | 133.308594 | 142.276621 |
41 | Annex_I | 1.A.4 | Other Sectors | fossil | CO2 (fossil) | MtCO2 | 2202.055872 | 2206.685635 | 2285.567351 | 2305.021305 | ... | 1799.485279 | 1832.718520 | 1730.958026 | 1656.426855 | 1740.985484 | 1674.202368 | 1667.363322 | 1669.706138 | 1698.416411 | 1743.825979 |
46 | Annex_I | 1.A.5 | Non-Specified | fossil | CO2 (fossil) | MtCO2 | 204.808594 | 213.331921 | 225.844137 | 239.613884 | ... | 22.101318 | 22.339847 | 28.709985 | 30.122639 | 30.325389 | 29.962261 | 29.948848 | 29.505735 | 28.014742 | 27.777712 |
51 | Annex_I | 1.B.1 | Solid Fuels | fossil | CO2 (fossil) | MtCO2 | 278.961408 | 253.984905 | 256.808946 | 272.801035 | ... | 93.244887 | 104.790282 | 103.309120 | 109.526988 | 94.738056 | 90.175374 | 120.173820 | 100.622461 | 108.655472 | 107.413885 |
54 | Annex_I | 1.B.2 | Oil and Natural Gas | fossil | CO2 (fossil) | MtCO2 | 44.639152 | 45.309202 | 44.960362 | 49.613265 | ... | 113.484430 | 94.738898 | 102.790347 | 77.567246 | 73.188266 | 77.043578 | 79.850279 | 78.059969 | 74.873738 | 86.814592 |
56 | Annex_I | 2.A.1 | Cement production | fossil | CO2 (fossil) | MtCO2 | 218.694788 | 226.591632 | 246.536747 | 252.978564 | ... | 197.423490 | 203.470836 | 208.228500 | 208.341352 | 210.026684 | 215.595958 | 210.448760 | 211.044965 | 216.719631 | 217.435656 |
57 | Annex_I | 2.A.2 | Lime production | fossil | CO2 (fossil) | MtCO2 | 69.836505 | 70.366975 | 73.739580 | 77.143060 | ... | 55.619768 | 63.420261 | 62.790133 | 62.221542 | 60.837754 | 61.043851 | 58.408170 | 57.107250 | 58.403250 | 59.041500 |
58 | Annex_I | 2.A.3 | Glass Production | fossil | CO2 (fossil) | MtCO2 | 5.997217 | 6.049004 | 6.100467 | 6.151279 | ... | 7.296280 | 8.398117 | 8.520551 | 8.262182 | 8.709437 | 8.075784 | 7.755317 | 7.640533 | 7.552652 | 7.487444 |
59 | Annex_I | 2.A.4 | Other Process Uses of Carbonates | fossil | CO2 (fossil) | MtCO2 | 79.831092 | 80.004196 | 80.353056 | 80.537018 | ... | 54.686823 | 60.498810 | 61.561995 | 56.832102 | 55.661802 | 56.312477 | 54.243322 | 53.547126 | 52.899752 | 52.042818 |
61 | Annex_I | 2.B | Chemical Industry | fossil | CO2 (fossil) | MtCO2 | 121.914103 | 120.777157 | 130.596123 | 138.419563 | ... | 191.530653 | 206.524810 | 212.735650 | 207.968714 | 209.909273 | 209.372942 | 217.631730 | 219.316223 | 227.640213 | 229.836230 |
64 | Annex_I | 2.C | Metal Industry | fossil | CO2 (fossil) | MtCO2 | 168.226839 | 159.212107 | 171.170031 | 180.915257 | ... | 74.735119 | 86.198840 | 88.845487 | 87.322426 | 85.795342 | 84.873300 | 82.447748 | 81.257518 | 82.014755 | 82.247114 |
65 | Annex_I | 2.D | Non-Energy Products from Fuels and Solvent Use | fossil | CO2 (fossil) | MtCO2 | 217.887817 | 216.958287 | 217.087427 | 217.293264 | ... | 128.569987 | 129.521926 | 125.700205 | 124.439956 | 124.462660 | 125.046265 | 124.982794 | 124.508358 | 127.110742 | 127.784554 |
72 | Annex_I | 3.C.2 | Liming | fossil | CO2 (fossil) | MtCO2 | 30.197873 | 30.487865 | 30.905281 | 31.407229 | ... | 15.530726 | 16.439274 | 15.425627 | 17.540868 | 15.752995 | 16.445840 | 15.856211 | 15.156252 | 15.384944 | 15.956536 |
73 | Annex_I | 3.C.3 | Urea application | fossil | CO2 (fossil) | MtCO2 | 3.245495 | 3.415580 | 3.711844 | 5.050244 | ... | 10.260019 | 11.643962 | 12.124360 | 13.023224 | 13.312837 | 13.791025 | 14.473040 | 15.471824 | 15.593926 | 14.694288 |
84 | Annex_I | 4.C | Incineration and Open Burning of Waste | fossil | CO2 (fossil) | MtCO2 | 7.101811 | 7.255954 | 7.414454 | 7.575319 | ... | 10.808026 | 11.215204 | 11.074501 | 10.202975 | 11.735014 | 10.183195 | 10.936423 | 10.902948 | 10.632786 | 10.520028 |
90 | Annex_I | 5.B | Other | fossil | CO2 (fossil) | MtCO2 | 4.919200 | 4.919200 | 4.919200 | 4.919200 | ... | 4.919200 | 4.919200 | 4.919200 | 4.919200 | 4.919200 | 4.919200 | 4.919200 | 4.919200 | 4.919200 | 4.919200 |
93 | Int. Aviation | 1.A.3.a | Civil Aviation | fossil | CO2 (fossil) | MtCO2 | 168.816953 | 168.816953 | 178.760913 | 186.622555 | ... | 433.421749 | 457.471950 | 474.450089 | 478.193813 | 486.338265 | 502.588404 | 530.376870 | 553.786633 | 584.863307 | 609.402253 |
99 | Int. Shipping | 1.A.3.d | Water-borne Navigation | fossil | CO2 (fossil) | MtCO2 | 353.856240 | 353.856240 | 370.877710 | 391.732931 | ... | 616.141890 | 662.889335 | 663.506321 | 616.930412 | 614.897420 | 636.150763 | 660.114846 | 678.676733 | 697.135045 | 703.568764 |
106 | Non-Annex_I | 1.A.1.a | Main Activity Electricity and Heat Production | fossil | CO2 (fossil) | MtCO2 | 481.972015 | 481.971973 | 522.113082 | 573.778259 | ... | 6330.379121 | 6851.251319 | 7436.644571 | 7808.554064 | 8155.119229 | 8305.291302 | 8350.858858 | 8502.831447 | 8813.023022 | 9098.673398 |
111 | Non-Annex_I | 1.A.1.bc | Petroleum Refining - Manufacture of Solid Fuel... | fossil | CO2 (fossil) | MtCO2 | 110.283691 | 110.283691 | 119.690529 | 137.044896 | ... | 838.271028 | 919.961733 | 952.375893 | 964.366690 | 993.399335 | 973.670401 | 943.527349 | 888.204006 | 886.144087 | 878.262493 |
116 | Non-Annex_I | 1.A.2 | Manufacturing Industries and Construction | fossil | CO2 (fossil) | MtCO2 | 906.198487 | 906.198487 | 949.031416 | 998.046671 | ... | 4188.614163 | 4564.528872 | 4771.264240 | 4789.410568 | 4837.997749 | 4873.564068 | 4853.064410 | 4719.917569 | 4699.469276 | 4849.533218 |
119 | Non-Annex_I | 1.A.3.a | Civil Aviation | fossil | CO2 (fossil) | MtCO2 | 8.122051 | 8.122051 | 8.630163 | 10.112890 | ... | 68.594142 | 79.635974 | 79.198116 | 84.227215 | 91.221945 | 98.732773 | 110.371913 | 119.859817 | 132.652504 | 134.203412 |
124 | Non-Annex_I | 1.A.3.b_noRES | Road Transportation no resuspension | fossil | CO2 (fossil) | MtCO2 | 280.141830 | 280.142421 | 305.634183 | 336.764357 | ... | 2083.287645 | 2217.565662 | 2352.713089 | 2517.077426 | 2627.879028 | 2707.375148 | 2787.817842 | 2862.162732 | 2934.617171 | 2968.618736 |
129 | Non-Annex_I | 1.A.3.c | Railways | fossil | CO2 (fossil) | MtCO2 | 77.796687 | 77.796687 | 76.707723 | 74.696596 | ... | 33.830839 | 34.582529 | 34.286817 | 34.325872 | 33.802275 | 32.907673 | 31.214421 | 31.017499 | 31.046222 | 31.665284 |
134 | Non-Annex_I | 1.A.3.d | Water-borne Navigation | fossil | CO2 (fossil) | MtCO2 | 10.115653 | 10.115653 | 10.454572 | 10.746488 | ... | 75.657565 | 79.343131 | 83.150809 | 92.082735 | 99.828432 | 104.772500 | 94.283541 | 97.276527 | 109.123498 | 111.957314 |
139 | Non-Annex_I | 1.A.3.e | Other Transportation | fossil | CO2 (fossil) | MtCO2 | 20.960383 | 20.960383 | 23.212701 | 25.303250 | ... | 26.178816 | 19.083983 | 22.723168 | 22.630016 | 23.769235 | 33.792333 | 37.956614 | 25.573964 | 23.294117 | 23.846574 |
144 | Non-Annex_I | 1.A.4 | Other Sectors | fossil | CO2 (fossil) | MtCO2 | 390.792380 | 390.799156 | 411.673107 | 434.372386 | ... | 1262.638512 | 1287.145972 | 1347.520260 | 1376.403126 | 1401.535357 | 1422.586527 | 1449.760662 | 1474.621951 | 1512.691057 | 1561.795249 |
149 | Non-Annex_I | 1.A.5 | Non-Specified | fossil | CO2 (fossil) | MtCO2 | 129.137998 | 129.137998 | 137.105276 | 143.001223 | ... | 196.662156 | 189.123397 | 168.643437 | 192.375466 | 193.834117 | 192.246426 | 187.286185 | 175.630912 | 171.881567 | 175.947018 |
154 | Non-Annex_I | 1.B.1 | Solid Fuels | fossil | CO2 (fossil) | MtCO2 | 36.158074 | 36.137505 | 38.751911 | 41.732591 | ... | 197.596966 | 181.416237 | 193.019849 | 216.019351 | 229.400934 | 271.948773 | 298.748228 | 307.269490 | 333.390693 | 340.755272 |
157 | Non-Annex_I | 1.B.2 | Oil and Natural Gas | fossil | CO2 (fossil) | MtCO2 | 260.368171 | 299.467864 | 340.348653 | 408.971353 | ... | 195.590476 | 192.452535 | 188.395484 | 219.951241 | 219.878046 | 225.941368 | 226.104882 | 232.502869 | 221.920100 | 219.109310 |
159 | Non-Annex_I | 2.A.1 | Cement production | fossil | CO2 (fossil) | MtCO2 | 75.223060 | 83.256095 | 90.135629 | 95.786610 | ... | 1096.180805 | 1184.484004 | 1269.356520 | 1326.275858 | 1156.431352 | 1191.198661 | 1164.515613 | 1192.483386 | 1216.988889 | 1230.617356 |
160 | Non-Annex_I | 2.A.2 | Lime production | fossil | CO2 (fossil) | MtCO2 | 46.273097 | 47.015480 | 50.974940 | 54.870285 | ... | 162.234437 | 167.418485 | 183.382219 | 198.828190 | 207.340475 | 207.692133 | 222.596250 | 252.635250 | 253.307250 | 261.049500 |
161 | Non-Annex_I | 2.A.3 | Glass Production | fossil | CO2 (fossil) | MtCO2 | 0.254226 | 0.260681 | 0.267101 | 0.273426 | ... | 1.551723 | 1.562937 | 1.577265 | 1.588149 | 1.593403 | 1.599149 | 1.606591 | 1.615320 | 1.624116 | 1.632998 |
162 | Non-Annex_I | 2.A.4 | Other Process Uses of Carbonates | fossil | CO2 (fossil) | MtCO2 | 12.598031 | 12.665230 | 12.798998 | 12.879114 | ... | 37.624196 | 39.870247 | 42.409933 | 55.145332 | 53.800110 | 58.033683 | 61.136953 | 61.640663 | 65.958367 | 69.359377 |
164 | Non-Annex_I | 2.B | Chemical Industry | fossil | CO2 (fossil) | MtCO2 | 20.032803 | 20.655604 | 23.482605 | 25.590035 | ... | 361.402422 | 390.720217 | 418.398872 | 433.482287 | 443.212629 | 455.948287 | 461.393466 | 468.210308 | 474.369416 | 477.794156 |
167 | Non-Annex_I | 2.C | Metal Industry | fossil | CO2 (fossil) | MtCO2 | 73.011816 | 68.118190 | 72.769041 | 75.923171 | ... | 170.951207 | 197.616612 | 215.291040 | 232.315176 | 258.501550 | 285.161552 | 285.330759 | 287.343546 | 300.090507 | 316.478164 |
168 | Non-Annex_I | 2.D | Non-Energy Products from Fuels and Solvent Use | fossil | CO2 (fossil) | MtCO2 | 19.820817 | 20.268858 | 20.950655 | 21.650757 | ... | 67.400468 | 64.009411 | 64.876564 | 66.284667 | 67.713874 | 68.787320 | 68.243710 | 69.882354 | 71.911656 | 73.267610 |
175 | Non-Annex_I | 3.C.2 | Liming | fossil | CO2 (fossil) | MtCO2 | 13.289532 | 14.130215 | 17.388798 | 16.801431 | ... | 41.191744 | 41.439955 | 41.667900 | 42.161204 | 44.504316 | 48.443468 | 43.806236 | 44.372132 | 45.025034 | 45.767186 |
176 | Non-Annex_I | 3.C.3 | Urea application | fossil | CO2 (fossil) | MtCO2 | 5.845461 | 6.482326 | 7.281820 | 7.909864 | ... | 64.165142 | 62.584522 | 66.792198 | 66.974833 | 69.597430 | 66.410853 | 67.597757 | 65.130810 | 64.352208 | 63.198340 |
187 | Non-Annex_I | 4.C | Incineration and Open Burning of Waste | fossil | CO2 (fossil) | MtCO2 | 0.545612 | 0.548953 | 0.569582 | 0.591585 | ... | 4.188982 | 4.515509 | 4.952577 | 5.402767 | 5.805969 | 5.763262 | 5.875250 | 5.986389 | 6.094731 | 6.202563 |
193 | Non-Annex_I | 5.B | Other | fossil | CO2 (fossil) | MtCO2 | 40.810400 | 40.924000 | 41.037500 | 41.151000 | ... | 42.570000 | 42.570000 | 42.570000 | 42.570000 | 42.570000 | 42.570000 | 42.570000 | 42.570000 | 42.570000 | 42.570000 |
48 rows × 55 columns
INFO:
print( ">>>>>>> PRIMAP-hist <<<<<<<<")
"""
Warning! CH4, N2O, NF3 and SF6 have not been converted to CO2eqs yet!
"""
df_primap = pd.read_csv("../raw_data/PRIMAP-hist/Guetschow-et-al-2021-PRIMAP-hist_v2.3.1_20-Sep_2021.csv")
#clean
df_primap = df_primap.rename(columns={"area (ISO3)":"code"})
# COUNTRY_DICT NOT NECESSARY HERE, INVERSE (creating names out of ISO codes)
df_primap.insert(3, "name", np.nan)
COUNTRY_NAME_GENERATOR_FROM_ISO3(df_primap, show_missing = True, show_duplicates_nan = True)
#global
df_primap_global = df_primap[df_primap.code=="EARTH"]
df_primap_global.to_csv("../clean_data/global_subsets/PRIMAP-hist_global.csv")#[df_primap_global["category (IPCC2006_PRIMAP)"]=="M.0.EL"].groupby(["entity","unit","scenario (PRIMAP-hist)"]).sum()["2019"]/1000
print("NO DUPLICATES HERE!")
#------------------------------------------------------------------------------------------------------------------------------
df_primap.to_csv("../clean_data/PRIMAP-hist.csv", index=None)
df_primap = pd.read_csv("../clean_data/PRIMAP-hist.csv")
df_primap_global
>>>>>>> PRIMAP-hist <<<<<<<< MISSING names: ['ANNEXI', 'AOSIS', 'BASIC', 'EARTH', 'EU27BX', 'LDC', 'NONANNEXI', 'UMBRELLA'] >>> 0 DUPLICATES: NO DUPLICATES HERE!
source | scenario (PRIMAP-hist) | code | name | entity | unit | category (IPCC2006_PRIMAP) | 1750 | 1751 | 1752 | ... | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
3799 | PRIMAP-hist_v2.3.1 | HISTCR | EARTH | NaN | CH4 | Gg CH4 / yr | 1 | 5450.0 | 5470.0 | 5480.0 | ... | 110000.0 | 112000.00 | 112000.00 | 109000.00 | 108000.00 | 107000.00 | 107000.00 | 107000.00 | 108000.0 | 111000.00 |
3800 | PRIMAP-hist_v2.3.1 | HISTCR | EARTH | NaN | CH4 | Gg CH4 / yr | 1.A | 5400.0 | 5410.0 | 5430.0 | ... | 11000.0 | 10900.00 | 10900.00 | 10900.00 | 10900.00 | 11000.00 | 10900.00 | 11000.00 | 11100.0 | 10900.00 |
3801 | PRIMAP-hist_v2.3.1 | HISTCR | EARTH | NaN | CH4 | Gg CH4 / yr | 1.B | 53.0 | 53.0 | 53.1 | ... | 98700.0 | 101000.00 | 101000.00 | 98500.00 | 96700.00 | 96500.00 | 96000.00 | 96300.00 | 96900.0 | 99600.00 |
3802 | PRIMAP-hist_v2.3.1 | HISTCR | EARTH | NaN | CH4 | Gg CH4 / yr | 1.B.1 | 53.0 | 53.0 | 53.1 | ... | 37800.0 | 40000.00 | 39100.00 | 37300.00 | 35700.00 | 35100.00 | 33600.00 | 33900.00 | 33500.0 | 36200.00 |
3803 | PRIMAP-hist_v2.3.1 | HISTCR | EARTH | NaN | CH4 | Gg CH4 / yr | 1.B.2 | 0.0 | 0.0 | 0.0 | ... | 60800.0 | 61100.00 | 61600.00 | 61000.00 | 60900.00 | 61300.00 | 62300.00 | 62300.00 | 63300.0 | 63300.00 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
18029 | PRIMAP-hist_v2.3.1 | HISTTP | EARTH | NaN | PFCS (AR4GWP100) | Gg CO2 / yr | M.0.EL | 0.0 | 0.0 | 0.0 | ... | 148000.0 | 115000.00 | 112000.00 | 114000.00 | 113000.00 | 113000.00 | 122000.00 | 121000.00 | 120000.0 | 120000.00 |
18030 | PRIMAP-hist_v2.3.1 | HISTTP | EARTH | NaN | PFCS (SARGWP100) | Gg CO2 / yr | 2 | 0.0 | 0.0 | 0.0 | ... | 126000.0 | 98200.00 | 95700.00 | 97300.00 | 96600.00 | 96300.00 | 104000.00 | 103000.00 | 102000.0 | 102000.00 |
18031 | PRIMAP-hist_v2.3.1 | HISTTP | EARTH | NaN | PFCS (SARGWP100) | Gg CO2 / yr | M.0.EL | 0.0 | 0.0 | 0.0 | ... | 126000.0 | 98200.00 | 95700.00 | 97300.00 | 96600.00 | 96300.00 | 104000.00 | 103000.00 | 102000.0 | 102000.00 |
18032 | PRIMAP-hist_v2.3.1 | HISTTP | EARTH | NaN | SF6 | Gg SF6 / yr | 2 | 0.0 | 0.0 | 0.0 | ... | 5.3 | 5.35 | 7.55 | 7.87 | 6.43 | 6.24 | 6.35 | 6.68 | 7.2 | 7.13 |
18033 | PRIMAP-hist_v2.3.1 | HISTTP | EARTH | NaN | SF6 | Gg SF6 / yr | M.0.EL | 0.0 | 0.0 | 0.0 | ... | 5.3 | 5.35 | 7.55 | 7.87 | 6.43 | 6.24 | 6.35 | 6.68 | 7.2 | 7.13 |
190 rows × 277 columns
INFO:
NOTES:
print( ">>>>>>> Minx et al 2021 <<<<<<<<")
"""
IPCC categorization cannot be integrated into the main dataset (yet). This is because there are various subsectors (and some sectors such as INdustry and Energy) that overlap.
"""
#read file (excel with multiple interesting sheets)
# minx_info = pd.read_excel("raw_data/Minx/essd_ghg_data.xlsx",0) #only info
# minx_metadata = pd.read_excel("raw_data/Minx/essd_ghg_data.xlsx",1) #not very useful
minx_data = pd.read_excel("../raw_data/Minx/essd_ghg_data.xlsx",2) #MAIN DATA
# minx_sector_classification = pd.read_excel("raw_data/Minx/essd_ghg_data.xlsx",3) #includes description and IPCC categories!
# minx_region_classification = pd.read_excel("raw_data/Minx/essd_ghg_data.xlsx",4) #not useful
# minx_100_yr_gwps = pd.read_excel("raw_data/Minx/essd_ghg_data.xlsx",5) #both ar5 and ar6 GWPS for all included gases
# minx_CH4_gwps = pd.read_excel("raw_data/Minx/essd_ghg_data.xlsx",6) # CH4 classification and chapters
#choose dataset
df_minx = minx_data
#clean a little
df_minx = df_minx.rename(columns={"country":"name", "ISO":"code", "value":"value_native"})
# add columns for both ar5 and ar6 gwps
df_minx = df_minx.assign(value_ar5=df_minx.value_native*df_minx.gwp100_ar5)
df_minx = df_minx.assign(value_ar6=df_minx.value_native*df_minx.gwp100_ar6)
#codes and names are amazing
print("NO DUPLICATES HERE!")
#global
df_minx_global = df_minx.groupby(["year","region_ar6_6","region_ar6_10","region_ar6_22","region_ar6_dev","sector_title","subsector_title","gas","gwp100_ar5", "gwp100_ar6"]).sum()
df_minx_global = df_minx_global.reset_index()
df_minx_global.to_csv("../clean_data/global_subsets/Minx2021_global.csv")
#------------------------------------------------------------------------------------------------------------------------------
df_minx.to_csv("../clean_data/Minx2021.csv", index=None)
df_minx = pd.read_csv("../clean_data/Minx2021.csv")
df_minx
>>>>>>> Minx et al 2021 <<<<<<<< NO DUPLICATES HERE!
code | name | region_ar6_6 | region_ar6_10 | region_ar6_22 | region_ar6_dev | year | sector_title | subsector_title | gas | gwp100_ar5 | gwp100_ar6 | value_native | value_ar5 | value_ar6 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | ABW | Aruba | Latin America and Caribbean | Latin America and Caribbean | Caribbean | developing | 1970 | Buildings | Residential | CO2 | 1 | 1.0 | 5.452099e+02 | 5.452099e+02 | 5.452099e+02 |
1 | ABW | Aruba | Latin America and Caribbean | Latin America and Caribbean | Caribbean | developing | 1970 | Buildings | Residential | CH4 | 28 | 27.0 | 1.609412e-01 | 4.506354e+00 | 4.345413e+00 |
2 | ABW | Aruba | Latin America and Caribbean | Latin America and Caribbean | Caribbean | developing | 1970 | Buildings | Residential | N2O | 265 | 273.0 | 4.802031e-03 | 1.272538e+00 | 1.310954e+00 |
3 | ABW | Aruba | Latin America and Caribbean | Latin America and Caribbean | Caribbean | developing | 1970 | Energy systems | Electricity & heat | CO2 | 1 | 1.0 | 3.091397e+04 | 3.091397e+04 | 3.091397e+04 |
4 | ABW | Aruba | Latin America and Caribbean | Latin America and Caribbean | Caribbean | developing | 1970 | Energy systems | Electricity & heat | CH4 | 28 | 27.0 | 4.155721e-01 | 1.163602e+01 | 1.122045e+01 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
568336 | ZWE | Zimbabwe | Africa | Africa | Eastern Africa | developing | 2020 | Industry | Metals | CO2 | 1 | 1.0 | 7.720092e+05 | 7.720092e+05 | 7.720092e+05 |
568337 | ZWE | Zimbabwe | Africa | Africa | Eastern Africa | developing | 2020 | Industry | Other (industry) | CO2 | 1 | 1.0 | 9.523549e+05 | 9.523549e+05 | 9.523549e+05 |
568338 | ZWE | Zimbabwe | Africa | Africa | Eastern Africa | developing | 2020 | Transport | Other (transport) | CO2 | 1 | 1.0 | 3.099450e+05 | 3.099450e+05 | 3.099450e+05 |
568339 | ZWE | Zimbabwe | Africa | Africa | Eastern Africa | developing | 2020 | Transport | Rail | CO2 | 1 | 1.0 | 1.491236e+05 | 1.491236e+05 | 1.491236e+05 |
568340 | ZWE | Zimbabwe | Africa | Africa | Eastern Africa | developing | 2020 | Transport | Road | CO2 | 1 | 1.0 | 2.236932e+06 | 2.236932e+06 | 2.236932e+06 |
568341 rows × 15 columns
df_minx_global
year | region_ar6_6 | region_ar6_10 | region_ar6_22 | region_ar6_dev | sector_title | subsector_title | gas | gwp100_ar5 | gwp100_ar6 | value_native | value_ar5 | value_ar6 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1970 | Africa | Africa | Eastern Africa | developing | AFOLU | Biomass burning (CH4, N2O) | CH4 | 28 | 27.0 | 4.444608e+03 | 1.244490e+05 | 1.200044e+05 |
1 | 1970 | Africa | Africa | Eastern Africa | developing | AFOLU | Biomass burning (CH4, N2O) | N2O | 265 | 273.0 | 1.152306e+02 | 3.053610e+04 | 3.145794e+04 |
2 | 1970 | Africa | Africa | Eastern Africa | developing | AFOLU | Enteric Fermentation (CH4) | CH4 | 28 | 27.0 | 5.980416e+05 | 1.674517e+07 | 1.614712e+07 |
3 | 1970 | Africa | Africa | Eastern Africa | developing | AFOLU | Managed soils and pasture (CO2, N2O) | CO2 | 1 | 1.0 | 4.944938e+04 | 4.944938e+04 | 4.944938e+04 |
4 | 1970 | Africa | Africa | Eastern Africa | developing | AFOLU | Managed soils and pasture (CO2, N2O) | N2O | 265 | 273.0 | 2.585428e+04 | 6.851385e+06 | 7.058220e+06 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
110296 | 2020 | Middle East | Middle East | Middle East | ldc | Industry | Cement | CO2 | 1 | 1.0 | 8.939050e+05 | 8.939050e+05 | 8.939050e+05 |
110297 | 2020 | Middle East | Middle East | Middle East | ldc | Industry | Chemicals | CO2 | 1 | 1.0 | 3.910756e+05 | 3.910756e+05 | 3.910756e+05 |
110298 | 2020 | Middle East | Middle East | Middle East | ldc | Industry | Other (industry) | CO2 | 1 | 1.0 | 9.325430e+05 | 9.325430e+05 | 9.325430e+05 |
110299 | 2020 | Middle East | Middle East | Middle East | ldc | Transport | Other (transport) | CO2 | 1 | 1.0 | 1.180720e+05 | 1.180720e+05 | 1.180720e+05 |
110300 | 2020 | Middle East | Middle East | Middle East | ldc | Transport | Road | CO2 | 1 | 1.0 | 2.415364e+06 | 2.415364e+06 | 2.415364e+06 |
110301 rows × 13 columns
print( ">>>>>>> UNFCCC Annex I <<<<<<<<")
df_unfccc_ai = pd.read_csv("../raw_data/UNFCCC/detailed-data-by-country-annex-one.csv")
#clean
df_unfccc_ai = df_unfccc_ai.rename(columns={"Party":"name"})
df_unfccc_ai = df_unfccc_ai.replace({"CO₂":"CO2",
"N₂O": "N2O",
"CH₄": "CH4",
"SF₆": "SF6",
"NF₃":"NF3"})
#add GWPs
GWP_100_ipcc_ar6 = {
"Aggregate GHGs" : 1,
"CO2" : 1,
"CH4" : 27.9,
"N2O" : 273,
"SF6" : 25200,
"NF3" : 17400,
"HFCs": 1, #unit is already in CO2eq
"PFCs": 1, #unit is already in CO2eq
"Aggregate F-gases": 1,#unit is already in CO2eq
"Unspecified mix of HFCs and PFCs": 1 #unit is already in CO2eq
}
GWP_100_ipcc_ar6_df = pd.DataFrame(GWP_100_ipcc_ar6.keys(), GWP_100_ipcc_ar6.values()).reset_index().set_index(0)
GWP_100_ipcc_ar6_df.index.name = "gas"
GWP_100_ipcc_ar4 = {
"Aggregate GHGs" : 1,
"CO2" : 1,
"CH4" : 25,
"N2O" : 298,
"SF6" : 22800,
"NF3" : 17200,
"HFCs": 1, #unit is already in CO2eq
"PFCs": 1, #unit is already in CO2eq
"Aggregate F-gases": 1,#unit is already in CO2eq
"Unspecified mix of HFCs and PFCs": 1 #unit is already in CO2eq
}
GWP_100_ipcc_ar4_df = pd.DataFrame(GWP_100_ipcc_ar4.keys(), GWP_100_ipcc_ar4.values()).reset_index().set_index(0)
GWP_100_ipcc_ar4_df.index.name = "gas"
df_unfccc_ai.insert(5,"GWP_AR4_conditional",df_unfccc_ai.merge(GWP_100_ipcc_ar4_df, left_on="Gas", right_on="gas", how="left")["index"])
df_unfccc_ai.insert(6,"GWP_AR6_conditional",df_unfccc_ai.merge(GWP_100_ipcc_ar4_df, left_on="Gas", right_on="gas", how="left")["index"])
# optional stuff
# df_unfccc_ai.insert(1, "IPCC_cat", df_unfccc_ai["Category"].str[0])
# CODE_GENERATOR_ISO3
df_unfccc_ai.insert(0, "code", np.nan)
CODE_GENERATOR_ISO3(df_unfccc_ai, show_missing=True,show_duplicates_nan=True)
#fix units and convert
df_unfccc_ai.loc[df_unfccc_ai.Unit=="kt","Base year":] = df_unfccc_ai.loc[df_unfccc_ai.Unit=="kt","Base year":].div(1000).mul(df_unfccc_ai.loc[df_unfccc_ai.Unit=="kt","Unit":]["GWP_AR6_conditional"], axis=0)
df_unfccc_ai.loc[df_unfccc_ai.Unit=="kt CO₂ equivalent","Base year":] = df_unfccc_ai.loc[df_unfccc_ai.Unit=="kt CO₂ equivalent","Base year":].div(1000).mul(df_unfccc_ai.loc[df_unfccc_ai.Unit=="kt CO₂ equivalent","Unit":]["GWP_AR6_conditional"], axis=0)
df_unfccc_ai.loc[df_unfccc_ai.Unit=="t CO₂ equivalent","Base year":] = df_unfccc_ai.loc[df_unfccc_ai.Unit=="t CO₂ equivalent","Base year":].div(1000000).mul(df_unfccc_ai.loc[df_unfccc_ai.Unit=="t CO₂ equivalent","Unit":]["GWP_AR6_conditional"], axis=0)
df_unfccc_ai.loc[df_unfccc_ai.Unit=="t","Base year":] = df_unfccc_ai.loc[df_unfccc_ai.Unit=="t","Base year":].div(1000000).mul(df_unfccc_ai.loc[df_unfccc_ai.Unit=="t","Unit":]["GWP_AR6_conditional"], axis=0)
df_unfccc_ai.loc[:,"Unit"] = "MtCO2eq"
#------------------------------------------------------------------------------------------------------------------------------
df_unfccc_ai.to_csv("../clean_data/UNFCCC_AI.csv", index=None)
df_unfccc_ai = pd.read_csv("../clean_data/UNFCCC_AI.csv")
df_unfccc_ai
>>>>>>> UNFCCC Annex I <<<<<<<< PATCH APPLIED >>> 2 Missing codes (NaN) in this df ['European Union (Convention)', 'European Union (KP)'] >>> 0 DUPLICATES:
code | name |
---|
code | name | Parent Category | Category | Gas | Unit | GWP_AR4_conditional | GWP_AR6_conditional | Base year | 1990 | ... | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | AUS | Australia | 1. Energy | 1.A.1 Energy Industries | Aggregate GHGs | MtCO2eq | 1 | 1 | 143.211000 | 143.211000 | ... | 226.660000 | 220.712000 | 222.471000 | 211.473000 | 205.297000 | 212.042000 | 219.422000 | 218.428000 | 214.601000 | 213.814000 |
1 | AUS | Australia | 1. Energy | 1.A.1 Energy Industries | CH4 | MtCO2eq | 25 | 25 | 0.152966 | 0.152966 | ... | 0.582400 | 0.451877 | 0.545782 | 0.515282 | 0.647907 | 0.727285 | 0.916665 | 0.860737 | 0.952913 | 0.887035 |
2 | AUS | Australia | 1. Energy | 1.A.1 Energy Industries | CO2 | MtCO2eq | 1 | 1 | 142.551000 | 142.551000 | ... | 224.948000 | 219.013000 | 220.698000 | 209.739000 | 203.485000 | 210.371000 | 217.548000 | 216.606000 | 212.752000 | 212.031000 |
3 | AUS | Australia | 1. Energy | 1.A.1 Energy Industries | N2O | MtCO2eq | 298 | 298 | 0.506877 | 0.506877 | ... | 1.128845 | 1.246874 | 1.227379 | 1.219109 | 1.164060 | 0.943790 | 0.958067 | 0.961715 | 0.896387 | 0.896199 |
4 | AUS | Australia | 1. Energy | 1.A.2 Manufacturing Industries and Construction | Aggregate GHGs | MtCO2eq | 1 | 1 | 36.256200 | 36.256200 | ... | 39.742000 | 40.918900 | 42.912900 | 46.012100 | 46.383400 | 42.487300 | 40.968500 | 40.070500 | 40.950900 | 40.792600 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
58316 | USA | United States of America | Totals | Total GHG emissions without LULUCF | NF3 | MtCO2eq | 17200 | 17200 | 0.047922 | 0.047922 | ... | 0.557691 | 0.568919 | 0.572732 | 0.498578 | 0.516564 | 0.566444 | 0.569480 | 0.594546 | 0.616978 | 0.605124 |
58317 | USA | United States of America | Totals | Total GHG emissions without LULUCF | N2O | MtCO2eq | 298 | 298 | 452.656040 | 452.656040 | ... | 454.950640 | 445.575560 | 416.764920 | 463.857860 | 473.989860 | 468.244420 | 450.793540 | 446.272880 | 459.212040 | 457.140940 |
58318 | USA | United States of America | Totals | Total GHG emissions without LULUCF | PFCs | MtCO2eq | 1 | 1 | 24.255700 | 24.255700 | ... | 4.737930 | 7.315630 | 6.404460 | 6.122770 | 5.760310 | 5.204580 | 4.390520 | 4.092780 | 4.699570 | 4.484390 |
58319 | USA | United States of America | Totals | Total GHG emissions without LULUCF | SF6 | MtCO2eq | 22800 | 22800 | 28.845648 | 28.845648 | ... | 7.288636 | 8.207681 | 6.921738 | 6.513755 | 6.557417 | 5.490035 | 6.020796 | 5.873713 | 5.694847 | 5.902920 |
58320 | USA | United States of America | Totals | Total GHG emissions without LULUCF | Unspecified mix of HFCs and PFCs | MtCO2eq | 1 | 1 | 0.227404 | 0.227404 | ... | 9.943280 | 10.644300 | 11.333700 | 12.024100 | 12.686100 | 14.077900 | 15.011000 | 15.964000 | 16.358800 | 16.462600 |
58321 rows × 39 columns
print( ">>>>>>> UNFCCC Non-Annex I <<<<<<<<")
df_unfccc_nai = pd.read_csv("../raw_data/UNFCCC//detailed-data-by-country-non-annex-one.csv")
# clean
df_unfccc_nai = df_unfccc_nai.rename(columns={"Party":"name"})
df_unfccc_nai = df_unfccc_nai.replace({"CO₂":"CO2",
"N₂O": "N2O",
"CH₄": "CH4",
"SF₆": "SF6",
"NF₃":"NF3"})
# IMPOSSIBLE to convert to common units?
# GWP_100_ipcc_ar6 = {
# "Aggregate GHGs" : 1,
# "CO2" : 1,
# "CH4" : 27.9,
# "N2O" : 273,
# "SF6" : 1, #unit is already in CO2eq
# "NF3" : 1, #unit is already in CO2eq
# "HFCs": 1, #unit is already in CO2eq
# "PFCs": 1, #unit is already in CO2eq
# "Aggregate F-gases": 1,#unit is already in CO2eq
# "Unspecified mix of HFCs and PFCs": 1 #unit is already in CO2eq
# }
# GWP_100_ipcc_ar6_df = pd.DataFrame(GWP_100_ipcc_ar6.keys(), GWP_100_ipcc_ar6.values()).reset_index().set_index(0)
# GWP_100_ipcc_ar6_df.index.name = "gas"
# df_unfccc_nai.insert(5,"GWP_AR6_conditional",df_unfccc_nai.merge(GWP_100_ipcc_ar6_df, left_on="Gas", right_on="gas", how="left")["index"])
# df_unfccc_nai.loc[:, "1990":] = df_unfccc_nai.loc[:, "1990":].mul(df_unfccc_nai["GWP_AR6_conditional"], axis=0)
# df_unfccc_nai.loc[:,"Unit"]= "MtCO2eq"
#Convert from Gg to Mt
df_unfccc_nai.loc[:, "1990":] = df_unfccc_nai.loc[:, "1990":].div(1000)
df_unfccc_nai.loc[:,"Unit"]= df_unfccc_nai.loc[:,"Unit"].replace({"Gg CO₂ equivalent":"MtCO2eq",
"Gg": "Mt (substance)"})
# CODE_GENERATOR_ISO3
df_unfccc_nai.insert(0, "code", np.nan)
CODE_GENERATOR_ISO3(df_unfccc_nai, show_missing=True,show_duplicates_nan=True)
#------------------------------------------------------------------------------------------------------------------------------
df_unfccc_nai.to_csv("../clean_data/UNFCCC_NAI.csv", index=None)
df_unfccc_nai = pd.read_csv("../clean_data/UNFCCC_NAI.csv")
df_unfccc_nai
>>>>>>> UNFCCC Non-Annex I <<<<<<<< PATCH APPLIED >>> 0 Missing codes (NaN) in this df [] >>> 0 DUPLICATES:
code | name |
---|
code | name | Parent Category | Category | Gas | Unit | 1990 | 1991 | 1992 | 1993 | ... | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | AFG | Afghanistan | 1. Energy | 1.A Fuel Combustion - Sectoral Approach | Aggregate GHGs | MtCO2eq | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | 10.301 | NaN | NaN | NaN | NaN | NaN |
1 | AFG | Afghanistan | 1. Energy | 1.A Fuel Combustion - Sectoral Approach | CH4 | Mt (substance) | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | 0.002 | NaN | NaN | NaN | NaN | NaN |
2 | AFG | Afghanistan | 1. Energy | 1.A Fuel Combustion - Sectoral Approach | CO2 | Mt (substance) | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | 9.639 | NaN | NaN | NaN | NaN | NaN |
3 | AFG | Afghanistan | 1. Energy | 1.A Fuel Combustion - Sectoral Approach | N2O | Mt (substance) | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | 0.002 | NaN | NaN | NaN | NaN | NaN |
4 | AFG | Afghanistan | 1. Energy | 1.A.1 Energy Industries | Aggregate GHGs | MtCO2eq | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | 5.046 | NaN | NaN | NaN | NaN | NaN |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
25553 | ZWE | Zimbabwe | Totals | Total GHG emissions excluding LULUCF/LUCF | N2O | MtCO2eq | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
25554 | ZWE | Zimbabwe | Totals | Total GHG emissions including LULUCF/LUCF | Aggregate GHGs | MtCO2eq | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
25555 | ZWE | Zimbabwe | Totals | Total GHG emissions including LULUCF/LUCF | CH4 | Mt (substance) | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
25556 | ZWE | Zimbabwe | Totals | Total GHG emissions including LULUCF/LUCF | CO2 | Mt (substance) | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
25557 | ZWE | Zimbabwe | Totals | Total GHG emissions including LULUCF/LUCF | N2O | Mt (substance) | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
25558 rows × 35 columns
print( ">>>>>>> Carbon Monitor <<<<<<")
#download the file
df_carbon_mon = pd.read_csv("https://datas.carbonmonitor.org/API/downloadFullDataset.php?source=carbon_global")
df_carbon_mon.date = pd.to_datetime(df_carbon_mon.date, format= "%d/%m/%Y")
#clean
df_carbon_mon = df_carbon_mon.rename(columns={"country":"name"})
df_carbon_mon.insert(3,"unit","MtCO2")
# CODE_GENERATOR_ISO3
df_carbon_mon.insert(0, "code", np.nan)
CODE_GENERATOR_ISO3(df_carbon_mon, show_missing=True, show_duplicates_nan=True)
# global
df_carbon_mon_global = df_carbon_mon[df_carbon_mon.name=="WORLD"]
df_carbon_mon_global.to_csv("../clean_data/global_subsets/Carbon-Monitor_global.csv")
# #------------------------------------------------------------------------------------------------------------------------------
df_carbon_mon.to_csv("../clean_data/Carbon-Monitor.csv")
df_carbon_mon = pd.read_csv("../clean_data/Carbon-Monitor.csv", index_col=0)
df_carbon_mon
>>>>>>> Carbon Monitor <<<<<< PATCH APPLIED >>> 3 Missing codes (NaN) in this df ['EU27 & UK', 'ROW', 'WORLD'] >>> 0 DUPLICATES:
code | name |
---|
code | name | date | sector | unit | value | timestamp | |
---|---|---|---|---|---|---|---|
0 | BRA | Brazil | 2019-01-01 | Power | MtCO2 | 0.096802 | 1546272000 |
1 | CHN | China | 2019-01-01 | Power | MtCO2 | 14.059700 | 1546272000 |
2 | NaN | EU27 & UK | 2019-01-01 | Power | MtCO2 | 1.872390 | 1546272000 |
3 | FRA | France | 2019-01-01 | Power | MtCO2 | 0.051081 | 1546272000 |
4 | DEU | Germany | 2019-01-01 | Power | MtCO2 | 0.316436 | 1546272000 |
... | ... | ... | ... | ... | ... | ... | ... |
99619 | RUS | Russia | 2022-03-31 | International Aviation | MtCO2 | 0.005468 | 1648656000 |
99620 | ESP | Spain | 2022-03-31 | International Aviation | MtCO2 | 0.050990 | 1648656000 |
99621 | UGA | UK | 2022-03-31 | International Aviation | MtCO2 | 0.071560 | 1648656000 |
99622 | USA | US | 2022-03-31 | International Aviation | MtCO2 | 0.172706 | 1648656000 |
99623 | NaN | WORLD | 2022-03-31 | International Aviation | MtCO2 | 1.126300 | 1648656000 |
99624 rows × 7 columns
df_carbon_majors = pd.read_csv("../clean_data/Carbon-Majors-2018.csv")
df_carbon_majors = df_carbon_majors.set_index("Entity")
df_carbon_majors
MtCO2e | % of global | |
---|---|---|
Entity | ||
Saudi Aramco - Saudi Arabia | 1931 | 4.84% |
Gazprom - Russia | 1527 | 3.83% |
National Iranian Oil Co. - Iran | 1266 | 3.17% |
Coal India - India | 1214 | 3.04% |
PetroChina / China Natl Petroleum | 883 | 2.21% |
Rosneft - Russian Federation | 846 | 2.12% |
Abu Dhabi - United Arab Emirates | 692 | 1.74% |
ExxonMobil - USA | 580 | 1.45% |
Iraq National Oil Co. - Iraq | 566 | 1.42% |
Royal Dutch Shell - The Netherlands | 550 | 1.38% |
BP - UK | 549 | 1.38% |
Kuwait Petroleum - Kuwait | 444 | 1.11% |
Chevron - USA | 443 | 1.11% |
Total SA - France | 404 | 1.01% |
Peabody Energy - USA | 398 | 1.00% |
Sonatrach - Algeria | 398 | 1.00% |
Petrobras - Brazil | 385 | 0.97% |
Pemex - Mexico | 380 | 0.95% |
Glencore - Switzerland | 355 | 0.89% |
Lukoil - Russian Federation | 341 | 0.86% |
Top Twenty | 14153 | 35.49% |
Global (2018) | 39878 | 100.00% |