Visualization of CDIAC-Data on global CO2 Emissions

This notebook visualizes the data on global CO2 emissions from the Carbon Dioxide Information Analysis Center (CDIAC) over the last centuries. It was written mainly to scrutinize the claims made in the position white paper on animal agriculture and climate change by Stanford alumnus Dr. Sailesh Rao. In it he claims that animal agriculture causes at least 87% of all anthropogenic greenhouse gas (GHG) emissions, and he accuses the IPCC reports of being imprecise and full of errors. While 87% is an almost unbelievably high share, he makes some excellent points in his paper. Honoring the Royal Society's motto "Nullius in verba" (Latin for 'Take nobody's word for it'), I wanted to check the presented data myself.

This Python notebook checks in particular his points on the opportunity cost of land use and on reversing climate change by restoring natural habitats in areas that are nowadays mostly used to grow animal feed. I will recreate the data visualized in figures 2.3 and 2.4 of the white paper based on the given sources. The text in this notebook can be followed without reading Dr. Rao's paper. However, I would strongly recommend reading it. I will also briefly describe what the shown code does, as not every reader can be expected to understand the Python programming language.

The following Python code loads the data from the cited sources and does the processing needed to visualize it with the Plotly module. Presenting the code in this Jupyter notebook also serves transparency and traceability.

Preparation and loading of data from web sources

First we import the modules needed for loading, processing and displaying the data. The Pandas module comes in very handy as a "fast, powerful, flexible and easy to use open-source data analysis and [processing] tool". It has almost everything built in that we need. Plotly is a graphing library intended for web-grade data visualization, with a mouse-over readout function.

In [1]:
import pandas as pd
import plotly.io as pio
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import urllib.request  # urlopen lives in the urllib.request submodule
import numpy as np
import ssl
from scipy.interpolate import UnivariateSpline

pd.options.plotting.backend = "plotly"
ssl._create_default_https_context = ssl._create_unverified_context # workaround for ssl errors: disables certificate verification

I'll load the data directly from the CDIAC URLs given below. CDIAC is a climate change data archive of the US government. The data is currently being transitioned to the U.S. Department of Energy's (DOE) Environmental System Science, but for the time being it can still be retrieved from CDIAC.

The URL below is taken from the hyperlink to the "Latest Published Global Estimates" on this page. Inspecting the file header (the first 37 lines of the file, printed below) is useful for examining the structure and the units of the data. Credit is due to the authors cited in the text file.

In [2]:
url_fossils = "https://cdiac.ess-dive.lbl.gov/ftp/ndp030/global.1751_2014.ems"
In [3]:
# print the header, i.e. the first 37 lines of the text file, including some data
fossil_rawfile = urllib.request.urlopen(url_fossils)
txt_lines = fossil_rawfile.readlines()
for line in txt_lines[:37]:
    print(line[:-1].decode('utf-8'))  # drop the trailing newline byte before decoding
***********************************************************
*** Global CO2 Emissions from Fossil-Fuel Burning,      ***
*** Cement Manufacture, and Gas Flaring: 1751-2014      ***
***                                                     ***
*** March 3, 2017                                       ***
***                                                     ***
*** Source:  Tom Boden                                  ***
***          Bob Andres                                 ***
***          Carbon Dioxide Information Analysis Center ***
***          Oak Ridge National Laboratory              ***
***          Oak Ridge, Tennessee 37831-6290            ***
***          USA                                        ***
***                                                     ***
***          Gregg Marland                              ***
***          Research Institute for Environment, Energy ***
***            and Economics                            ***
***          Appalachian State University               ***
***          Boone, North Carolina 28608-2131           ***
***          USA                                        ***
***********************************************************

All emission estimates are expressed in million metric tons of carbon. To
convert these estimates to units of carbon dioxide (CO2), simply multiply
these estimates by 3.667.

Per capita emission estimates are expressed in metric tons of carbon.
Population estimates were not available to permit calculations of global
per capita estimates before 1950.  Please note that annual sums were
tallied before each element (e.g., Gas) was rounded and reported here
so totals may differ slightly from the sum of the elements due to
rounding.

                                                          Cement       Gas         Per
Year     Total         Gas      Liquids      Solids     Production   Flaring     Capita

1751         3           0           0           3           0           0              
1752         3           0           0           3           0           0              

The header of the land use data can be inspected in the same manner. More information on the methods used to assess the data can be found in the original article by Houghton et al. in the Wiley Online Library.

In [4]:
url_landuse = "https://cdiac.ess-dive.lbl.gov/trends/landuse/houghton/Global_land-use_flux-1850_2005.xls"
In [5]:
# print the header, i.e. the first 25 lines of the land use text file, including some data
landuse_rawfile = urllib.request.urlopen("https://cdiac.ess-dive.lbl.gov/trends/landuse/houghton/1850-2005.txt")
txt_lines = landuse_rawfile.readlines()
for line in txt_lines[:25]:
    print(line[:-1].decode('utf-8'))  # drop the trailing newline byte before decoding
Annual Net Flux of Carbon to the Atmosphere from Land-Use Change:  1850-2005
R.A. Houghton

Units = Tg C (1 teragram = 10^12 g)

Positive values represent a release of carbon to the atmosphere, while
negative values represent a net uptake of carbon by the terrestrial 
biosphere.

Please note:  due to rounding, the sum of the regions included in this textfile may not equal the global
 totals listed in the TRENDS document.  Please refer to the Excel spreadsheet for more precise estimates.

 Year   Global    USA    Canada   South+   Europe   N.Africa Tropical  Former  China   South+     Pacific
                                 Central            +Middle   Africa   USSR           Southeast  Developed
                                 America              East                               Asia     Region
 1850   500.6    164.1     5.5     23.5     55.0      4.0     -1.3     58.6    101.8     87.3      2.0
 1851   492.7    165.7     5.4     23.2     55.0      4.0     -1.1     58.6     93.1     86.9      2.0
 1852   548.5    230.7     5.3     22.9     55.0      4.0     -1.0     58.9     83.8     86.9      2.0
 1853   546.8    238.5     5.3     22.6     55.0      4.0     -1.1     59.2     74.2     87.0      2.0
 1854   544.8    246.2     5.3     22.4     54.9      4.0     -1.0     59.6     64.3     87.1      2.0
 1855   542.1    253.6     5.3     22.2     54.9      4.0     -1.1     60.0     54.2     87.1      2.0
 1856   547.7    260.5     5.2     22.0     54.9      4.0     -1.1     60.3     52.6     87.2      2.0
 1857   553.3    267.2     5.2     21.8     54.8      4.0     -1.1     60.7     51.3     87.3      2.0
 1858   558.6    273.6     5.2     21.7     54.8      4.0     -1.4     61.1     50.2     87.4      2.0
 1859     564    279.7     5.2     21.5     54.8      4.0     -1.6     61.5     49.4     87.5      2.0

A teragram ($10^{12}$ grams) and a million metric tons are the same mass, so both files report carbon in the same unit. The terms "positive carbon flux to the atmosphere" and "carbon emissions" both describe the yearly addition of carbon atoms to the atmosphere. Note that we can also simply multiply the data by a factor of 3.667 to convert it to CO2, as stated in the header of the first file: a CO2 molecule has about 3.667 times the mass of a carbon atom.
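Both statements can be checked with a few lines of arithmetic. This is a minimal sketch; the rounded atomic masses (12 for carbon, 16 for oxygen) are my assumption, chosen because they reproduce the factor 3.667 quoted in the CDIAC header.

```python
# Unit check: 1 teragram vs. 1 million metric tons, both expressed in grams
GRAMS_PER_TERAGRAM = 10**12
GRAMS_PER_METRIC_TON = 10**6
grams_per_million_tons = 10**6 * GRAMS_PER_METRIC_TON
print(GRAMS_PER_TERAGRAM == grams_per_million_tons)  # True

# Mass ratio of a CO2 molecule to a carbon atom, using rounded atomic masses
# (assumed values: C = 12, O = 16)
MASS_C = 12
MASS_O = 16
co2_per_carbon = (MASS_C + 2 * MASS_O) / MASS_C
print(round(co2_per_carbon, 3))  # 3.667
```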

Import

The following cells import the data into two pandas dataframes, which we can use as objects for data handling.

In [6]:
# import data in million metric tonnes carbon per Year
# raw string for the regex separator (any run of whitespace); skiprows drops the file header
carbon_fossil_emissions = pd.read_csv(url_fossils, skiprows=33, sep=r'\s+', index_col=0)
In [7]:
# data: 10^12 g is million metric tonnes carbon per year
carbonflux_from_landuse = pd.read_excel(url_landuse, sheet_name='net fluxes', index_col='Year')

Inspection

It is helpful to inspect the imported data frames by calling the .head() or .tail() method, which shows either the beginning or the end of the frame. This is particularly handy for large data sets.

In [8]:
carbon_fossil_emissions.tail()
Out[8]:
Total Gas Liquids Solids Production Flaring Capita
Year
2010 9128 1696 3107 3812 446 67 1.32
2011 9503 1756 3134 4055 494 64 1.36
2012 9673 1783 3200 4106 519 65 1.36
2013 9773 1806 3220 4126 554 68 1.36
2014 9855 1823 3280 4117 568 68 1.36

Skipping the first 33 lines during the pandas import cut off the first line of the two-line column header, so some column names are incomplete. The full names are restored by renaming the columns with a dictionary that maps each old name (key) to its new name (value), like so:

In [9]:
# better readability for labels in plots, also fix broken labels from truncated txt-file
coldict={'Production':'Cement', 'Solids':'Coal', 'Liquids': 'Oil', 'Total': 'Fossil total'}
carbon_fossil_emissions.rename(columns=coldict, inplace=True)
In [11]:
carbon_fossil_emissions.tail()
Out[11]:
Fossil total Gas Oil Coal Cement Flaring Capita
Year
2010 9128 1696 3107 3812 446 67 1.32
2011 9503 1756 3134 4055 494 64 1.36
2012 9673 1783 3200 4106 519 65 1.36
2013 9773 1806 3220 4126 554 68 1.36
2014 9855 1823 3280 4117 568 68 1.36

I took the liberty of renaming 'Solids' to 'Coal' and 'Liquids' to 'Oil', which is what they essentially are.
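To see why the column names come out truncated, the import can be reproduced on a small inline text. This is only a sketch: `sample_text` is a shortened stand-in for the CDIAC file that I typed by hand from the header and the last two data rows shown above, and `skiprows=1` replaces the `skiprows=33` used for the real file.

```python
import io
import pandas as pd

# Shortened stand-in for the CDIAC text file: the column header spans two lines
sample_text = (
    "                                                  Cement       Gas         Per\n"
    "Year     Total         Gas      Liquids      Solids     Production   Flaring     Capita\n"
    "\n"
    "2013      9773        1806        3220        4126         554          68        1.36\n"
    "2014      9855        1823        3280        4117         568          68        1.36\n"
)

# Skipping the first header line leaves only the second line as column names
sample = pd.read_csv(io.StringIO(sample_text), skiprows=1, sep=r"\s+", index_col=0)
print(list(sample.columns))

# Map truncated or generic names (keys) to readable labels (values)
sample = sample.rename(columns={"Production": "Cement", "Solids": "Coal",
                                "Liquids": "Oil", "Total": "Fossil total"})
print(list(sample.columns))
```

The first print shows 'Production' without its 'Cement' prefix, the second shows the renamed labels.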

Let's also inspect the data frame on carbon flux from land use:

In [12]:
carbonflux_from_landuse.head()
Out[12]:
Global USA Canada S+C America Europe Nafrica/Meast Trop.Africa Frmr USSR China S+SE Asia Pac.Dev.Reg
Year
1850 500.6 164.0922 5.5476 23.4757 55.0441 3.9840 -1.3484 58.5571 101.8392 87.3469 2.0458
1851 492.7 165.7256 5.3626 23.1520 55.0156 3.9839 -1.1192 58.5525 93.0766 86.9100 2.0419
1852 548.5 230.6725 5.3380 22.8618 54.9874 3.9837 -1.0033 58.8781 83.8307 86.9385 2.0379
1853 546.8 238.5149 5.3138 22.6017 54.9590 3.9835 -1.0513 59.2207 74.2140 86.9935 2.0338
1854 544.8 246.1846 5.2899 22.3687 54.9304 3.9833 -0.9855 59.5802 64.3129 87.0630 2.0297
In [13]:
carbonflux_from_landuse.tail()
Out[13]:
Global USA Canada S+C America Europe Nafrica/Meast Trop.Africa Frmr USSR China S+SE Asia Pac.Dev.Reg
Year
2001 1385.4 -31.9488 17.6121 643.1904 -18.0804 23.2448 261.6969 20.1075 -12.9153 478.5329 3.9161
2002 1517.7 -31.9488 17.6121 625.5099 -18.0804 23.2448 258.5236 20.1075 -12.9153 631.6960 3.9161
2003 1513.2 -31.9488 17.6121 616.4536 -18.0804 23.2448 225.5212 20.1075 -12.9153 669.2975 3.9161
2004 1534.9 -31.9488 17.6121 609.3525 -18.0804 23.2448 225.7864 20.1075 -12.9153 697.8432 3.9161
2005 1467.3 -31.9488 17.6121 606.4342 -18.0804 23.2448 239.2390 20.1075 -12.9153 619.6937 3.9161

It is noteworthy that at the beginning of this century the US, Europe and China sequestered more carbon through land use than they emitted. This suggests that forests and other natural vegetation there are regrowing faster than they are being cleared. This is of course welcome, but the roughly 13, 18 and 32 million tons of carbon sequestered per year can hardly compensate for the billions of tons of fossil carbon emitted.

On the other hand, we can see that regions like South and Central America, tropical Africa and South and Southeast Asia, which are widely regarded as regions of great biodiversity and sequestration potential, are in fact net emitters. Together, they emitted roughly 1.5 billion tons of carbon per year in the early 2000s. This is due to political reasons and the effects of globalization in agriculture. European and American cattle farmers buy animal feed like soy and oil palm seeds, largely from South America and the other aforementioned regions. Producers there in turn clear natural habitats in order to grow monocultures and meet the demand for animal feed. Research calls this externalized carbon cost, which can be attributed to products like meat and dairy.
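The combined flux of the three regions can be added up directly from the table. The values below are copied from the 2004 and 2005 rows shown above; only the dictionary layout is mine.

```python
# Land-use carbon flux in Tg C per year, copied from the table above
tropical_regions = {
    2004: {"S+C America": 609.3525, "Trop.Africa": 225.7864, "S+SE Asia": 697.8432},
    2005: {"S+C America": 606.4342, "Trop.Africa": 239.2390, "S+SE Asia": 619.6937},
}

totals_tg = {}
for year, fluxes in tropical_regions.items():
    totals_tg[year] = sum(fluxes.values())
    # 1000 Tg C = 1 billion metric tons of carbon
    print(year, round(totals_tg[year] / 1000, 2), "billion tons of carbon")
```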

Processing

The data needs very little processing. To get from tons of carbon to tons of CO2 we only multiply by the factor of 3.667 stated above. Dividing by 1000 converts the entries from million tons to billion tons, which is more convenient, especially when emissions are accumulated over time.

In [14]:
# element-wise scaling: carbon -> CO2 (x 3.667) and million tons -> billion tons (x 1e-3)
co2flux_from_landuse = carbonflux_from_landuse * 3.667 * 1e-3
co2_fossil_emissions = carbon_fossil_emissions * 3.667 * 1e-3
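As a quick check of the conversion, here is a minimal sketch applied to two values taken from the tables above (fossil total for 2014 and global land use flux for 2005). The constant names and the series labels are mine.

```python
import pandas as pd

C_TO_CO2 = 3.667   # mass of CO2 per mass of carbon, as stated in the CDIAC header
MT_TO_GT = 1e-3    # million tons -> billion tons

# Values in million metric tons of carbon per year, copied from the tables above
carbon_mt = pd.Series({"Fossil total 2014": 9855.0, "Land use (global) 2005": 1467.3})

co2_gt = carbon_mt * C_TO_CO2 * MT_TO_GT
print(co2_gt.round(6))
```

The results, about 36.14 and 5.38 billion tons of CO2, agree with the corresponding entries of the combined table further down.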

I'll use a linear trend (a degree-1 spline fitted to the whole series) to extrapolate the missing land use emissions for 2006 to 2014, so that the land use and fossil data cover the same time frame. Admittedly this is not real data, but it's a fair assumption that land use for animal feed increased between 2006 and 2014, when demand for meat was also increasing. Correct numbers would require new research data, which to my knowledge is currently not available.

In [15]:
#linear extrapolation (trend) of co2 flux from land use for 2006 to 2014
co2flux_from_landuse_expl_fnc = UnivariateSpline(co2flux_from_landuse['Global'].index,
                                                 co2flux_from_landuse['Global'].values,
                                                k=1,
                                                ext='extrapolate')
co2flux_from_landuse_expl = co2flux_from_landuse.reindex(np.arange(1850,2015))
co2flux_from_landuse_expl.loc[2006:2014,'Global'] = co2flux_from_landuse_expl_fnc(np.arange(2006,2015))
In [16]:
co2flux_from_landuse_expl.tail()
Out[16]:
Global USA Canada S+C America Europe Nafrica/Meast Trop.Africa Frmr USSR China S+SE Asia Pac.Dev.Reg
Year
2010 5.967910 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2011 5.995793 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2012 6.023675 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2013 6.051558 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2014 6.079441 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
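The pattern used above (extend the index, then fill only the new years with a trend) can be illustrated on a tiny made-up series. Note that this sketch swaps in an ordinary least-squares line from `np.polyfit` instead of the `UnivariateSpline` used in the notebook, and the numbers are synthetic, so it shows the mechanics rather than reproducing the values above.

```python
import numpy as np
import pandas as pd

# Made-up yearly series that lies exactly on a straight line (not CDIAC data)
years = np.arange(2000, 2006)
flux = pd.Series(5.0 + 0.03 * (years - 2000), index=years, name="Global")

# Least-squares straight line; years are shifted to keep the fit well conditioned
slope, intercept = np.polyfit(years - 2000, flux.to_numpy(), deg=1)

# Extend the index; the new years start out as NaN
extended = flux.reindex(np.arange(2000, 2009))

# Fill only the missing years with the extrapolated trend
future = np.arange(2006, 2009)
extended.loc[2006:2008] = intercept + slope * (future - 2000)
print(extended.tail(3).round(3))
```

Because the synthetic data is exactly linear, the filled values simply continue the line (5.18, 5.21, 5.24). With real, noisy data the result depends on the fitting method and on how much of the history is included.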

A distinction between gas and flaring gas is of no particular interest for a comparison of fossil and non-fossil atmospheric carbon sources. I'll just add them up to total gas.

In [17]:
# sum gas and flaring gas emissions to total gas
co2_fossil_emissions['Gas'] = co2_fossil_emissions['Gas'] + co2_fossil_emissions['Flaring']
# drop unneeded columns
co2_fossil_emissions.drop(columns=['Capita', 'Flaring'],inplace=True)

For ease of use, I'll combine the two data frames into one. co2_emissions will hold the global emissions, where each column shows the CO2 emissions of one carbon source in billion metric tons per year. The total global emissions should include land use, so we add the global CO2 flux from land use to the fossil total to obtain the overall total.

In [18]:
# joint dataframe for fossil and landuse data
co2_emissions = co2_fossil_emissions.copy()
co2_emissions['Land use'] = co2flux_from_landuse_expl['Global'] # just the global data is relevant for comparison
co2_emissions['Total'] = co2_fossil_emissions['Fossil total'] + co2flux_from_landuse_expl['Global']
In [19]:
co2_emissions = co2_emissions[['Gas', 'Oil', 'Coal', 'Cement', 'Fossil total', 'Land use', 'Total']]
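When two pandas objects are added, the values are matched by index label (here: the year), not by position. Since the fossil data starts in 1751 and the land use data in 1850, this matters for the 'Total' column. A minimal sketch with made-up numbers:

```python
import pandas as pd

# Made-up numbers purely for illustration (not CDIAC data)
fossil = pd.Series({1849: 0.20, 1850: 0.21, 1851: 0.22}, name="Fossil total")
land_use = pd.Series({1850: 1.80, 1851: 1.85}, name="Land use")

# Addition aligns on the year index; years missing on either side give NaN
total = fossil + land_use
print(total)
```

The year 1849 has no land use value, so its total is NaN rather than the fossil value alone.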

A quick look at the years 2001 to 2014 shows:

In [20]:
co2_emissions.loc[2001:2014]
Out[20]:
Gas Oil Coal Cement Fossil total Land use Total
Year
2001 4.998121 10.443616 8.965815 0.869079 25.276631 5.080262 30.356893
2002 5.100797 10.384944 9.233506 0.924084 25.646998 5.565406 31.212404
2003 5.298815 10.846986 9.882565 1.012092 27.047792 5.548904 32.596696
2004 5.489499 11.158681 10.656302 1.092766 28.393581 5.628478 34.022059
2005 5.665515 11.250356 11.397036 1.173440 29.490014 5.380589 34.870603
2006 5.852532 11.334697 12.075431 1.305452 30.568112 5.856379 36.424491
2007 5.969876 11.261357 12.548474 1.400794 31.180501 5.884262 37.064763
2008 6.230233 11.378701 13.153529 1.422796 32.181592 5.912145 38.093737
2009 6.050550 11.155014 13.164530 1.521805 31.891899 5.940027 37.831926
2010 6.464921 11.393369 13.978604 1.635482 33.472376 5.967910 39.440286
2011 6.673940 11.492378 14.869685 1.811498 34.847501 5.995793 40.843294
2012 6.776616 11.734400 15.056702 1.903173 35.470891 6.023675 41.494566
2013 6.871958 11.807740 15.130042 2.031518 35.837591 6.051558 41.889149
2014 6.934297 12.027760 15.097039 2.082856 36.138285 6.079441 42.217726

Approximately 35 billion tons of total emissions including land use in 2005 is in line with the data we have from the IPCC. Subtracting land use leaves about 30 gigatons (= billion tons), which matches what other sources such as Friedlingstein et al. report. According to the IPCC, fossil emissions amounted to 36 gigatons in 2020, so worldwide emissions are still increasing in spite of the Paris climate agreement.

Looking at the data above again, note that in 2005 the world emitted about twice as much CO2 from coal or from oil as from land use or from gas. This is particularly noteworthy when we look at the cumulated shares later.
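The ratios between the sources in 2005 can be read off the table with a few divisions. The values are copied from the 2005 row above; the dictionary and loop are just one way to lay this out.

```python
# 2005 emissions in billion tons of CO2 per year, copied from the table above
emissions_2005 = {"Gas": 5.665515, "Oil": 11.250356, "Coal": 11.397036, "Land use": 5.380589}

ratios = {}
for big in ("Coal", "Oil"):
    for small in ("Land use", "Gas"):
        ratios[(big, small)] = emissions_2005[big] / emissions_2005[small]
        print(f"{big} / {small}: {ratios[(big, small)]:.2f}")
```

All four ratios come out close to 2.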

Accumulation of CO2 in the atmosphere

Instead of asking how much anthropogenic CO2 we put into the atmosphere each year, we should ask how much of the CO2 from each source ends up staying there, as lingering CO2 is what causes global warming. In other words: which source would be the most effective one to cut down on? Two facts are important. First, roughly 55% of the emitted CO2 gets sequestered by land plants, algae (e.g. diatoms), coral and the microbiome living in the soil each year. That means only 45% of the CO2 released into the atmosphere stays there. This is called the airborne fraction of CO2. Second, non-sequestered CO2 stays in the atmosphere basically forever. Energetically, CO2 is a very stable molecule, which is hardly altered except through the input of external energy. Remember that we already subtracted the 55% that is sequestered by means of the sun's energy and photosynthesis. Also, CO2 has a relatively high mass, so its escape into space from the upper atmosphere is negligible.

Consequently, to obtain an estimate of the anthropogenic CO2 in the atmosphere, we just have to take the running sum of the yearly emissions for each source and multiply it by 0.45.

In [21]:
co2_cumulated_emissions = co2_emissions.cumsum()
co2_atmospheric_remainder = co2_cumulated_emissions * 0.45  # apply the airborne fraction
In [22]:
co2_atmospheric_remainder.loc[2001:2011]
Out[22]:
Gas Oil Coal Cement Fossil total Land use Total
Year
2001 61.486239 167.516627 239.522573 9.483412 477.997300 247.509299 723.437311
2002 63.781598 172.189852 243.677651 9.899250 489.538449 250.013731 737.482893
2003 66.166065 177.070996 248.124805 10.354691 501.709956 252.510738 752.151406
2004 68.636339 182.092402 252.920141 10.846436 514.487067 255.043554 767.461333
2005 71.185821 187.155063 258.048807 11.374484 527.757574 257.464819 783.153104
2006 73.819460 192.255676 263.482751 11.961937 541.513224 260.100189 799.544125
2007 76.505904 197.323287 269.129564 12.592295 555.544449 262.748107 816.223269
2008 79.309509 202.443702 275.048652 13.232553 570.026166 265.408572 833.365450
2009 82.032257 207.463459 280.972691 13.917365 584.377520 268.081585 850.389817
2010 84.941471 212.590475 287.263062 14.653332 599.440090 270.767144 868.137946
2011 87.944744 217.762045 293.954421 15.468506 615.121465 273.465251 886.517428
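A simple consistency check of this table: the year-to-year increase of the atmospheric remainder must equal 0.45 times the emissions of that year. The sketch below uses the 'Total' emissions for 2009 to 2011 copied from the yearly table further up; the variable names are mine.

```python
import pandas as pd

AIRBORNE_FRACTION = 0.45  # share assumed in the text to remain in the atmosphere

# Yearly total emissions in billion tons of CO2, copied from the table further up
yearly_total = pd.Series({2009: 37.831926, 2010: 39.440286, 2011: 40.843294})

# Running sum of the emissions, scaled by the airborne fraction
remainder = yearly_total.cumsum() * AIRBORNE_FRACTION

# Year-to-year increase of the remainder (first entry is NaN by construction)
increase = remainder.diff()
print(increase.round(6))
```

The increases of about 17.75 (2010) and 18.38 (2011) match the differences between consecutive 'Total' entries in the table above.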

The following code plots the data from the data frame object directly, using plotly's syntax to get a better picture. Beginning with the yearly emissions over time with parameterized sources:

In [23]:
fig1 = co2_emissions.plot(labels=dict(index="Year", value="CO2-Emission [Billion Tons per Year]",
                                      variable="Source"),
                          template='none')
fig1.update_xaxes(range=[1750, 2020]);
fig1.update_layout(
    font=dict(size=14),
    legend=dict(
    yanchor="top",
    y=0.95,
    xanchor="left",
    x=0.05,
    title='Source:'
    ),
    template='plotly_dark');
fig1.show()