Estimating the effect of agricultural conservation practices (CP) on reducing nutrient loss¶

We will use the Measured Annual Nutrient loads from AGricultural Environments (MANAGE) data from the USDA, as published by Qian and Harmel, where they used causal inference in order to quantify the effects of water and soil conservation practices in reducing phosphorus loss from agricultural fields.

The data is provided in Rdata format, so we will first have to load it and transform it. We use the rpy2 package, which is not a part of the requirements for causallib and so you will have to install it if reproducing the analysis.

Loading the data¶

In [1]:

import rpy2.robjects as objects
import pandas as pd

In [2]:

url = objects.r['url']('https://onlinelibrary.wiley.com/action/downloadSupplement?doi=10.1111%2F1752-1688.12377&file=jawr12377-sup-0001-manage.RData')
try:
    objects.r['load'](url)
except Exception:  # fallback - if the above doesn't work please download the file directly
    objects.r['load']('jawr12377-sup-0001-manage.rdata')

In [3]:

# normally we would use pandas to transform the R-datafram into a pandas dataframe, 
# from rpy2.robjects import pandas2ri
# pandas2ri.activate() 
# but in many versions of pandas this is broken with categorical data as we have here
# so instead we will use R to save the DataFrame to csv and load it in pandas
objects.r['write.csv'](x=objects.r["LoadApr2007"], file="MANAGE.csv", fileEncoding="UTF-8", **{"row.names": False})
manage = pd.read_csv("MANAGE.csv", index_col='RecordID')
print(manage.shape)
manage.head()

(274, 71)

Out[3]:

	WatershedID	City	State	Location	Date	Watershed.Years	ShortLU	LandUse	Tillage	CP1	...	PP.Form	MaxTPLoad	MinTPLoad	AvgTPLoad	FormTP	FlowIndicator	Comments	Reference	Ref	CPs
RecordID
151	4 Yr. Bahiagrass 1	Tifton, GA	GA	NaN	1952 - 1960	4	Pasture	1st Yr. Bahiagrass, Clover (4 Yr. Rotation of ...	Pasture	NaN	...	NaN	NaN	NaN	NaN	NaN	Surface Runoff	NaN	Thomas, A.W., R.L. Carter, and J.R. Carreker. ...	44	0
152	4 Yr. Bahiagrass 2	Tifton, GA	GA	NaN	1952 - 1960	4	Pasture	2nd Yr. Bahiagrass, Clover (4 Yr. Rotation of ...	Pasture	NaN	...	NaN	NaN	NaN	NaN	NaN	Surface Runoff	NaN	Thomas, A.W., R.L. Carter, and J.R. Carreker. ...	44	0
260	KA-1	Middleton, WI	WI	NaN	2003-2004	2	Alfalfa	Alfalfa	Pasture	NaN	...	NaN	0.12	0.12	0.12	NaN	Surface Runoff	NaN	Cabot, P.E., K.G. Karthikeyan, P.S. Miller, an...	7	0
261	KA-2	Middleton, WI	WI	NaN	2003-2004	2	Alfalfa	Alfalfa	Pasture	NaN	...	NaN	2.21	0.03	1.12	NaN	Surface Runoff	Flow from outside the watershed may have contr...	Cabot, P.E., K.G. Karthikeyan, P.S. Miller, an...	7	0
93	C7	Chickasha, OK	OK	NaN	July 1972 - June 1973	1	Alfalfa	Alfalfa (Dryland)	Pasture	NaN	...	NaN	NaN	NaN	2.48	NaN	Surface Runoff	Estimated precipitation, runoff, and soil loss...	Olness, A., S.J. Smith, E.D. Rhoades, and R.G....	29	0

5 rows × 71 columns

Data pre-processing¶

We will follow the procedure used by Qian and Harmel to preprocess the data, which includes creating a few log-transforms and cleaning the data.

The intervention we are interested in is one of either contour farming, terracing, buffering, or creating a waterway. Together these are referred to as conservation practices, or CP. The outcome of interest is the amount of phosphorus that was measured in runoff water (AvgTPLoad). To see why this is a problems see an explanation from the USGS and from the EPA.

In [4]:

import numpy as np
#manage.fillna(manage.median(axis=0), inplace=True)
manage["AvgTPLoad"].clip(0.002, inplace=True)
manage["AvgTNLoad"].clip(0.02, inplace=True)
manage["F.AppMethod1"].replace("Other", "Surface Applied", inplace=True)
manage["F.AppMethod1"].replace("Other2", "Unknown", inplace=True)

manage["CFarm"] = (manage[["CP1", "CP2", "CP3"]] == "Contour Farming").any(axis=1)
manage["Buffer"] = ((manage[["CP1", "CP2", "CP3"]] == "Filter Strip").any(axis=1)) | \
                   ((manage[["CP1", "CP2", "CP3"]] == "Riparian Buffer").any(axis=1))
manage["Terrace"] = (manage[["CP1", "CP2", "CP3"]] == "Terrace").any(axis=1)
manage["Waterway"] = (manage[["CP1", "CP2", "CP3"]] == "Waterway").any(axis=1)
manage["NumCP"] = manage[["CFarm", "Buffer", "Terrace", "Waterway"]].sum(axis=1)
manage["AnyCP"] = manage[["CFarm", "Buffer", "Terrace", "Waterway"]].any(axis=1)

manage.dropna(axis=1, thresh=50, inplace=True)

In [5]:

%matplotlib inline
import matplotlib.pyplot as plt

def boxplot(df:pd.DataFrame, name:str, by="AnyCP", ax=None):
    """A helper function to plot boxplots in log scale and annotate the median"""
    ax = ax or plt.gca()
    _ = df.boxplot(name, by=by, ax=ax)
    ax.set_yscale('log')
    ax.autoscale()
    ax.grid(False)
    medians = df[name].groupby(df[by]).median().values
    ticks = ax.get_xticklabels()
    for median, tick in zip(medians, ticks):
        ax.text(x=tick.get_position()[0], y=median, s=median, horizontalalignment="center")

In [6]:

fig, axes = plt.subplots(2, 2)
fig.set_size_inches(14,14)
boxplot(manage, "AvgTPLoad", ax=axes[0][0])
boxplot(manage, "AvgTNLoad", ax=axes[0][1])
boxplot(manage, "Avg.N.Applied", ax=axes[1][0])
boxplot(manage, "Avg.P.Applied", ax=axes[1][1])
plt.tight_layout()

Causal inference¶

The box-plots above show both the outcome and several features are correlated with the treatment, which indicates that there is some bias in the data. We will now use causallib to estimate the effect of treatment by correcting for these biases.

As features, we use the amount of phosphorus and nitrogen applied, the total amount of water runoff, the soil volume, the phosphosros application method, and the type of land use (LU).

In [7]:

from causallib.estimation import IPW
from sklearn.linear_model import LogisticRegression

In [8]:

lr = LogisticRegression(solver='lbfgs', max_iter=500)
ipw = IPW(lr)

In [9]:

y = np.log(manage["AvgTPLoad"])

a = manage["AnyCP"]

X = pd.DataFrame({"Runoff": np.log(manage["Avg.Runoff"]),
                  "Soil": np.log(manage["Avg.SoilL"] + 1),
                  "PApplied": np.log(manage["Avg.P.Applied"] + 1),
                  "NApplied": np.log(manage["Avg.N.Applied"] + 1)})
X = pd.concat([X, pd.get_dummies(manage["F.AppMethod1"], prefix="Method"), 
               pd.get_dummies(manage["ShortLU"], prefix="LU"),
               pd.get_dummies(manage["Tillage"], prefix="Tillage")], axis=1)
X.fillna(X.median(axis=0), inplace=True)

y = y.dropna()  # remove the data where there is no outcome
a, X = a[y.index], X.loc[y.index]

print(X.shape)
X.head()

(222, 23)

Out[9]:

	Runoff	Soil	PApplied	NApplied	Method_Incorporated	Method_Injected	Method_Surface Applied	Method_Unknown	LU_Alfalfa	LU_Corn	...	LU_Other	LU_Pasture	LU_Peanuts	LU_Rotation	LU_Sorghum	LU_Soybeans	Tillage_Conservation	Tillage_Conventional	Tillage_No Till	Tillage_Pasture
RecordID
260	0.009950	4.529368	3.288402	4.430817	0	0	1	0	1	0	...	0	0	0	0	0	0	0	0	0	1
261	1.615420	4.726502	3.288402	4.430817	0	0	1	0	1	0	...	0	0	0	0	0	0	0	0	0	1
93	5.010635	5.525453	3.806662	3.713572	0	0	0	1	1	0	...	0	0	0	0	0	0	0	0	0	1
68	3.295837	1.609438	3.157000	4.430817	0	0	0	1	1	0	...	0	0	0	0	0	0	0	0	0	1
259	1.308333	6.669498	3.465736	4.430817	0	0	1	0	0	0	...	1	0	0	0	0	0	0	1	0	0

5 rows × 23 columns

In [10]:

ipw.fit(X, a)  # train the propensity model to apply causal inference 

Out[10]:

IPW(truncate_eps=None, use_stabilized=False,
    learner=LogisticRegression(max_iter=500))

In [11]:

pd.DataFrame({"causal": np.exp(ipw.estimate_population_outcome(X, a, y)),
              "marginal": np.exp(y.groupby(a).mean())})

Out[11]:

	causal	marginal
AnyCP
False	0.695621	0.592249
True	0.515510	0.798039

We can see that the causal inference identifies that using any CP will reduce the phosphorus load, while the marginal observation (just checking what happened in treated vs. untreated areas) shows the opposite trend. How can this be?

Evaluation and adjustments¶

One way to understand what is occurring is to evaluate the performance of the propensity model and check how consistent it is. Causallib provides the tools to do this.

In [12]:

from causallib.evaluation import PropensityEvaluator

In [13]:

evaluator = PropensityEvaluator(ipw)
_ = evaluator.evaluate_simple(X, a, y, plots=["roc_curve", "weight_distribution", "calibration", 
                                          "covariate_balance_love"])
plt.tight_layout()

At first glance it might seem that while the model has good predictive performance in the ROC curve (top left). However, the fact that the weighted ROC curve is far from the diagonal hints that the model is in fact not very good at balancing the data. This can also be seen in the covariate balancing plot (bottom right), where we see that the Pasture variable, which was extremely biased in the data, is still quite biased even after weighting. It would make sense, therefore, to stratify the analysis according to it, or to remove it from the analysis completely.

To check whether it is better to stratify or limit our analysis to a specific strata, let's check the overlap between Pasture and the treatment. We may as well look at all the categorical features and check:

In [14]:

pd.concat({col: pd.crosstab(X[col], a) for col in X.select_dtypes('uint8').columns.values}, axis=0)

Out[14]:

	AnyCP	False	True
	Method_Incorporated
Method_Incorporated	0	148	31
Method_Incorporated	1	32	11
Method_Injected	0	171	36
Method_Injected	1	9	6
Method_Surface Applied	0	139	32
Method_Surface Applied	1	41	10
Method_Unknown	0	82	27
Method_Unknown	1	98	15
LU_Alfalfa	0	175	42
LU_Alfalfa	1	5	0
LU_Corn	0	156	24
LU_Corn	1	24	18
LU_Cotton	0	173	42
LU_Cotton	1	7	0
LU_Fallow	0	179	40
LU_Fallow	1	1	2
LU_Oats.Wheat	0	162	39
LU_Oats.Wheat	1	18	3
LU_Other	0	165	42
LU_Other	1	15	0
LU_Pasture	0	99	41
LU_Pasture	1	81	1
LU_Peanuts	0	176	42
LU_Peanuts	1	4	0
LU_Rotation	0	160	30
LU_Rotation	1	20	12
LU_Sorghum	0	175	38
LU_Sorghum	1	5	4
LU_Soybeans	0	180	40
LU_Soybeans	1	0	2
Tillage_Conservation	0	157	24
Tillage_Conservation	1	23	18
Tillage_Conventional	0	120	26
Tillage_Conventional	1	60	16
Tillage_No Till	0	168	35
Tillage_No Till	1	12	7
Tillage_Pasture	0	95	41
Tillage_Pasture	1	85	1

We see that there is only one case of pasture land that was in fact treated. We should limit the analysis only to non-pasture land.

But it's a bit more than that, since treatment was also never observed for Alfalfa, Cotton, Peanuts, or Other. For Other, the difference is quite large while the others might still be attributed to small number fluctuations. While this could be tested using some sort of statistical test, this is not the focus of this example, and we will simply remove all these samples (and then remove the columns to avoid having all-zero columns).

In [15]:

X = X.query('LU_Pasture == 0 & LU_Other == 0 & LU_Cotton ==0 & LU_Alfalfa == 0 & LU_Peanuts ==0 & LU_Soybeans == 0')
X = X.drop(columns=['LU_Pasture', 'LU_Other', 'LU_Cotton', 'LU_Alfalfa', 'LU_Peanuts', 'LU_Soybeans'])
a, y = a[X.index], y[X.index]
print(f'We are left with {len(y)} samples')
ipw.fit(X, a)
pd.DataFrame({"causal": np.exp(ipw.estimate_population_outcome(X, a, y)),
              "marginal": np.exp(y.groupby(a).mean())})

We are left with 107 samples

Out[15]:

	causal	marginal
False	1.450745	1.436078
True	0.698854	0.826860

In [16]:

evaluator = PropensityEvaluator(ipw)
evaluator.evaluate_simple(X, a, y, plots=["roc_curve", "weight_distribution", "calibration", 
                                          "covariate_balance_love"])
plt.tight_layout()

Performance evaluation is much better now:

all the features are sufficiently balanced
the model is as calibrated as can be expected given the small sample size that we now have
the propensity distribution doesn't exhibit any areas with only one treatment group dominsating
the ROC curves behave well.

Most importantly, after limiting the data the causal analysis agrees with the marginal analysis, and only makes a small adjustment to the causal effect size that is estimated.

Summary¶

In summary, the causal analysis allowed us to focus on where the data allows us to check differences between treatment groups. Indeed, unlike the initial analysis, we now understand that in non-pasture lands the conservation procedures of terracing, buffering, waterway, or contour farming actually improve phosphorus retention in the soil and reduce phosphorus leaking into the watershed when growing crops such as oats, wheat, corn, soybean and sorghum. While we can surmise that this may be true for other crops, we simply cannot answer that from the data.

Acknowledgement¶

We would like to thank Biance Zadrozny, Sr. Manager, Spatiotemporal Modeling IBM Research Brazil, for valuable feedback regarding the validity of the analyis and interpretation of the data and the results.