Your website has performance bugs. You suspect they hurt conversion rates, but you're not sure how badly. Fixing the bugs would be expensive, so you decide to invest in a fix only if encountering a bug causes a relatively large decline in conversions.
It turns out that bugs are more common (i) on mobile devices and (ii) among users who use the site with high "intensity" (e.g., many clicks and scrolls). Note that, all else equal, high-intensity users are both more likely to convert than low-intensity users and more likely to encounter a bug.
An A/B test would give us the most precise estimate of the impact of encountering a bug, but we don't want to force a bad experience on our users. So how do we estimate the impact of encountering a bug on conversion rates?
import pandas as pd
import numpy as np
import statsmodels.formula.api as smf
import matplotlib.pyplot as plt
from statsmodels.stats.weightstats import ttest_ind

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def weighted_mean(df, x):
    return (df[x] * df['wgt']).sum() / df['wgt'].sum()
N = 100000
np.random.seed(42)
df = pd.DataFrame()
# Most users (70%) are on mobile
df['randint_mobile'] = np.random.uniform(0,1,N)
df['mobile'] = np.where(df['randint_mobile']<.7,1,0)
df.drop(columns=['randint_mobile'],inplace=True)
df['intensity'] = np.random.normal(0,1,N)
# Probability of the bad treatment (encountering a bug) is higher for mobile users and for intense site users
df['rand_bug'] = np.random.normal(0,1,N)
df['p_bug'] = sigmoid(0.2*df['rand_bug']+0.4*df['mobile']+0.4*df['intensity']-0.5)
df['bug'] = np.where(df['p_bug']>0.5,1,0)
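One modeling note: the line above thresholds `p_bug` at 0.5, so bug assignment is deterministic given the covariates and the `rand_bug` noise. An alternative (and arguably more natural) simulation treats `p_bug` as a true probability and draws the bug as a Bernoulli trial, so a user with `p_bug = 0.4` still hits the bug 40% of the time. A minimal sketch on toy probabilities:

```python
import numpy as np

rng = np.random.default_rng(0)
p_bug = np.array([0.2, 0.5, 0.9])   # toy stand-ins for the sigmoid output above

# Bernoulli draw: compare each probability against an independent uniform,
# instead of a hard cutoff at 0.5
bug = (rng.uniform(size=p_bug.shape) < p_bug).astype(int)
```

Either choice works for this simulation; the thresholded version just concentrates all of the randomness in `rand_bug`.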
# Conversion probability is influenced by the bug, mobile, and intensity of site use
# Set: running into a bug reduces the probability of converting by 30 percentage points
df['randnum_convert'] = np.random.uniform(0,1,size=N)
df['randnum_convert'] = df['randnum_convert'] + 0.1*df['intensity'] - 0.3*df['bug'] - 0.1*df['mobile'] + 0.2*df['mobile']*df['intensity']
df['convert'] = np.where(df['randnum_convert']>=0.5,1,0)
print(df.groupby('mobile')['bug'].agg('mean').round(3))

mobile
0    0.129
1    0.413
Name: bug, dtype: float64
print(df.groupby(np.where(df['intensity']>df['intensity'].mean(),
      'High Intensity','Low Intensity'))['bug'].agg('mean').round(3))

High Intensity    0.604
Low Intensity     0.053
Name: bug, dtype: float64
print(df.groupby('bug')['convert'].agg('mean').round(3))

bug
0    0.336
1    0.352
Name: convert, dtype: float64
print(df.groupby([np.where(df['intensity']>df['intensity'].mean(),
      'High Intensity','Low Intensity'),'bug'])['convert'].agg('mean').round(3))

                bug
High Intensity  0      0.533
                1      0.379
Low Intensity   0      0.255
                1      0.043
Name: convert, dtype: float64
# A simple comparison of conversion rates by whether users ran into the bug
# is misleading due to the relationship with intensity of site use
formula = 'convert~bug'
model = smf.ols(formula,data=df).fit()
print(model.summary())
                            OLS Regression Results
==============================================================================
Dep. Variable:                convert   R-squared:                       0.000
Model:                            OLS   Adj. R-squared:                  0.000
Method:                 Least Squares   F-statistic:                     23.29
Date:                Sun, 07 Jan 2024   Prob (F-statistic):           1.40e-06
Time:                        18:00:45   Log-Likelihood:                -67276.
No. Observations:              100000   AIC:                         1.346e+05
Df Residuals:                   99998   BIC:                         1.346e+05
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.3365      0.002    183.990      0.000       0.333       0.340
bug            0.0154      0.003      4.826      0.000       0.009       0.022
==============================================================================
Omnibus:                   539009.364   Durbin-Watson:                   1.990
Prob(Omnibus):                  0.000   Jarque-Bera (JB):            17482.937
Skew:                           0.668   Prob(JB):                         0.00
Kurtosis:                       1.447   Cond. No.                         2.41
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
# Add linear controls for mobile and intensity:
# now shows a large negative effect of the bug, but smaller than the true effect
formula = 'convert~bug+mobile+intensity'
model = smf.ols(formula,data=df).fit()
print(model.summary())
                            OLS Regression Results
==============================================================================
Dep. Variable:                convert   R-squared:                       0.130
Model:                            OLS   Adj. R-squared:                  0.130
Method:                 Least Squares   F-statistic:                     4970.
Date:                Sun, 07 Jan 2024   Prob (F-statistic):               0.00
Time:                        18:00:45   Log-Likelihood:                -60338.
No. Observations:              100000   AIC:                         1.207e+05
Df Residuals:                   99996   BIC:                         1.207e+05
Df Model:                           3
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.4899      0.003    187.429      0.000       0.485       0.495
bug           -0.2291      0.004    -55.058      0.000      -0.237      -0.221
mobile        -0.1052      0.003    -32.112      0.000      -0.112      -0.099
intensity      0.2003      0.002    106.497      0.000       0.197       0.204
==============================================================================
Omnibus:                   122772.561   Durbin-Watson:                   1.990
Prob(Omnibus):                   0.000   Jarque-Bera (JB):            10292.067
Skew:                            0.490   Prob(JB):                         0.00
Kurtosis:                        1.771   Cond. No.                         4.35
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
# Correct specification (including the mobile-intensity interaction):
# a larger negative impact from the bug, but still a bit under the true effect
formula = 'convert~bug+mobile+intensity+mobile*intensity'
model = smf.ols(formula,data=df).fit()
print(model.summary())
                            OLS Regression Results
==============================================================================
Dep. Variable:                convert   R-squared:                       0.155
Model:                            OLS   Adj. R-squared:                  0.155
Method:                 Least Squares   F-statistic:                     4573.
Date:                Sun, 07 Jan 2024   Prob (F-statistic):               0.00
Time:                        18:00:45   Log-Likelihood:                -58889.
No. Observations:              100000   AIC:                         1.178e+05
Df Residuals:                   99995   BIC:                         1.178e+05
Df Model:                           4
Covariance Type:            nonrobust
====================================================================================
                       coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------------
Intercept            0.4965      0.003    192.520      0.000       0.491       0.502
bug                 -0.2777      0.004    -66.134      0.000      -0.286      -0.269
mobile              -0.0920      0.003    -28.399      0.000      -0.098      -0.086
intensity            0.0969      0.003     36.420      0.000       0.092       0.102
mobile:intensity     0.1678      0.003     54.232      0.000       0.162       0.174
==============================================================================
Omnibus:                   839182.596   Durbin-Watson:                   1.992
Prob(Omnibus):                   0.000   Jarque-Bera (JB):            10382.523
Skew:                            0.433   Prob(JB):                         0.00
Kurtosis:                        1.681   Cond. No.                         4.49
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
propensity_formula = 'bug~mobile+intensity'
propensity_model = smf.logit(propensity_formula,data=df).fit()
print(propensity_model.summary())
Optimization terminated successfully.
         Current function value: 0.269693
         Iterations 8
                           Logit Regression Results
==============================================================================
Dep. Variable:                    bug   No. Observations:               100000
Model:                          Logit   Df Residuals:                    99997
Method:                           MLE   Df Model:                            2
Date:                Sun, 07 Jan 2024   Pseudo R-squ.:                  0.5737
Time:                        18:00:45   Log-Likelihood:                -26969.
converged:                       True   LL-Null:                       -63260.
Covariance Type:            nonrobust   LLR p-value:                     0.000
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept     -4.5148      0.036   -126.395      0.000      -4.585      -4.445
mobile         3.6253      0.035    104.946      0.000       3.558       3.693
intensity      3.5771      0.025    143.721      0.000       3.528       3.626
==============================================================================
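Before trusting the propensity scores, it's worth verifying common support: both groups should span overlapping ranges of the score, otherwise some users have no comparable counterparts to reweight. A minimal sketch of the check, with `p_hat` and `bug` as hypothetical stand-ins for `propensity_model.predict(df)` and `df['bug']`:

```python
import numpy as np
import pandas as pd

# Toy propensity scores and treatment flags (stand-ins for the real model output)
p_hat = np.array([0.05, 0.20, 0.40, 0.60, 0.80, 0.95])
bug   = np.array([0,    0,    1,    0,    1,    1])

# Common-support check: compare the score range within each treatment group
support = (pd.DataFrame({'p': p_hat, 'bug': bug})
             .groupby('bug')['p']
             .agg(['min', 'max']))
print(support)
```

In practice a histogram of `p_hat` split by `bug` (e.g. with `plt.hist`) shows the same thing visually.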
ipw = df.copy()
ipw['p'] = propensity_model.predict(ipw)
ipw['wgt'] = np.where(ipw['bug']==1,1,ipw['p']/(1-ipw['p']))
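These are ATT-style weights: treated users keep weight 1, and each untreated user gets the odds p/(1-p), which upweights controls who "look like" the treated group. A quick numeric check of the formula on toy values:

```python
import numpy as np

p   = np.array([0.8, 0.5, 0.2])   # toy propensity scores for three untreated users
bug = np.array([0, 0, 0])

# Same weighting rule as in the text: treated -> 1, untreated -> odds p/(1-p)
wgt = np.where(bug == 1, 1.0, p / (1 - p))
# A control with p = 0.8 counts roughly 4x: users like them are common among the treated;
# a control with p = 0.2 counts 0.25x.
```

Note the weights blow up as p approaches 1, which is why the common-support check matters.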
metrics = ['mobile','intensity','convert']
wm = lambda x: np.average(x, weights=ipw.loc[x.index, "wgt"])

unweighted_rows, weighted_rows = [], []
for metric in metrics:
    treated = ipw[ipw['bug']==1][metric]
    untreated = ipw[ipw['bug']==0][metric]
    # unweighted comparison of means
    t, p, degf = ttest_ind(treated, untreated)
    unweighted_rows.append({'metric': metric, 'bug_mean': treated.mean(),
                            'no_bug_mean': untreated.mean(), 'p-value': p})
    # weighted comparison of means using the IPW weights
    wgt_t = ipw[ipw['bug']==1]["wgt"]
    wgt_u = ipw[ipw['bug']==0]["wgt"]
    tw, pw, degfw = ttest_ind(treated, untreated, weights=(wgt_t, wgt_u))
    weighted_rows.append({'metric': metric, 'bug_mean': treated.agg(wm),
                          'no_bug_mean': untreated.agg(wm), 'p-value': pw})
unweighted_stats = pd.DataFrame(unweighted_rows)
weighted_stats = pd.DataFrame(weighted_rows)
unweighted_stats['pct_diff'] = 100*(unweighted_stats['bug_mean']-unweighted_stats['no_bug_mean'])/unweighted_stats['no_bug_mean']
weighted_stats['pct_diff'] = 100*(weighted_stats['bug_mean']-weighted_stats['no_bug_mean'])/weighted_stats['no_bug_mean']
print("Unweighted Differences in Means: \n", unweighted_stats.round(3),'\n')
print("Weighted Differences in Means: \n", weighted_stats.round(3))
Unweighted Differences in Means: 
       metric  bug_mean  no_bug_mean  p-value  pct_diff
0      mobile     0.882        0.613    0.000    43.996
1   intensity     0.919       -0.444    0.000  -306.880
2     convert     0.352        0.336    0.000     4.581 

Weighted Differences in Means: 
       metric  bug_mean  no_bug_mean  p-value  pct_diff
0      mobile     0.882        0.881    0.521     0.190
1   intensity     0.919        0.758    0.000    21.169
2     convert     0.352        0.614    0.000   -42.726
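The weighted means above are one balance check; a scale-free alternative is the standardized mean difference (SMD), where values below roughly 0.1 are conventionally treated as balanced. A sketch on toy data (the real check would pass in `ipw['intensity']` split by `bug`, with `ipw['wgt']` as the control weights):

```python
import numpy as np

def smd(x_t, x_c, w_c=None):
    """Standardized mean difference between treated and (optionally weighted) control."""
    if w_c is None:
        w_c = np.ones_like(x_c)
    m_t = x_t.mean()
    m_c = np.average(x_c, weights=w_c)
    # Pool the unweighted group variances for the denominator (one common convention)
    s = np.sqrt((x_t.var(ddof=1) + x_c.var(ddof=1)) / 2)
    return (m_t - m_c) / s

rng = np.random.default_rng(0)
x_t = rng.normal(1.0, 1.0, 500)   # treated lean high on the covariate
x_c = rng.normal(0.0, 1.0, 500)
w_c = np.exp(x_c)                 # upweight high-covariate controls, mimicking IPW

smd_raw = smd(x_t, x_c)           # large: groups differ before weighting
smd_wtd = smd(x_t, x_c, w_c)      # much smaller after reweighting
```

Like the weighted t-tests, this makes it easy to see that weighting closed most, but not all, of the intensity gap.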
# The IPW-weighted regression captures the negative effect of the bug even without additional controls
ipw_formula = 'convert~bug'
ipw_model = smf.wls(ipw_formula,data=ipw,weights=ipw['wgt']).fit()
print(ipw_model.summary())
                            WLS Regression Results
==============================================================================
Dep. Variable:                convert   R-squared:                       0.069
Model:                            WLS   Adj. R-squared:                  0.069
Method:                 Least Squares   F-statistic:                     7395.
Date:                Sun, 07 Jan 2024   Prob (F-statistic):               0.00
Time:                        18:00:45   Log-Likelihood:            -1.7546e+05
No. Observations:              100000   AIC:                         3.509e+05
Df Residuals:                   99998   BIC:                         3.509e+05
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.6144      0.002    276.803      0.000       0.610       0.619
bug           -0.2625      0.003    -85.991      0.000      -0.269      -0.257
==============================================================================
Omnibus:                    19933.430   Durbin-Watson:                   1.983
Prob(Omnibus):                  0.000   Jarque-Bera (JB):           413130.355
Skew:                           0.415   Prob(JB):                         0.00
Kurtosis:                      12.923   Cond. No.                         2.69
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
# Adding controls moves the estimate close to the true effect while improving model fit
ipw_formula = 'convert~bug+mobile+intensity'
ipw_model = smf.wls(ipw_formula,data=ipw,weights=ipw['wgt']).fit()
print(ipw_model.summary())
                            WLS Regression Results
==============================================================================
Dep. Variable:                convert   R-squared:                       0.184
Model:                            WLS   Adj. R-squared:                  0.184
Method:                 Least Squares   F-statistic:                     7497.
Date:                Sun, 07 Jan 2024   Prob (F-statistic):               0.00
Time:                        18:00:45   Log-Likelihood:            -1.6888e+05
No. Observations:              100000   AIC:                         3.378e+05
Df Residuals:                   99996   BIC:                         3.378e+05
Df Model:                           3
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.2392      0.005     44.226      0.000       0.229       0.250
bug           -0.3080      0.003   -106.793      0.000      -0.314      -0.302
mobile         0.1837      0.005     39.355      0.000       0.175       0.193
intensity      0.2816      0.002    118.557      0.000       0.277       0.286
==============================================================================
Omnibus:                    33619.737   Durbin-Watson:                   2.001
Prob(Omnibus):                  0.000   Jarque-Bera (JB):          1153905.222
Skew:                          -0.968   Prob(JB):                         0.00
Kurtosis:                      19.529   Cond. No.                         8.40
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
# Adding the interaction term leaves the (correct) estimate unchanged while slightly improving fit
ipw_formula = 'convert~bug+mobile+intensity+mobile*intensity'
ipw_model = smf.wls(ipw_formula,data=ipw,weights=ipw['wgt']).fit()
print(ipw_model.summary())
                            WLS Regression Results
==============================================================================
Dep. Variable:                convert   R-squared:                       0.189
Model:                            WLS   Adj. R-squared:                  0.189
Method:                 Least Squares   F-statistic:                     5841.
Date:                Sun, 07 Jan 2024   Prob (F-statistic):               0.00
Time:                        18:00:45   Log-Likelihood:            -1.6853e+05
No. Observations:              100000   AIC:                         3.371e+05
Df Residuals:                   99995   BIC:                         3.371e+05
Df Model:                           4
Covariance Type:            nonrobust
====================================================================================
                       coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------------
Intercept            0.5087      0.011     44.463      0.000       0.486       0.531
bug                 -0.3080      0.003   -107.155      0.000      -0.314      -0.302
mobile              -0.1015      0.012     -8.715      0.000      -0.124      -0.079
intensity            0.0896      0.008     11.829      0.000       0.075       0.104
mobile:intensity     0.2125      0.008     26.704      0.000       0.197       0.228
==============================================================================
Omnibus:                    33355.725   Durbin-Watson:                   2.001
Prob(Omnibus):                  0.000   Jarque-Bera (JB):          1232437.052
Skew:                          -0.933   Prob(JB):                         0.00
Kurtosis:                      20.097   Cond. No.                         25.5
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
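One caveat on these WLS standard errors: they treat the weights as fixed, but we estimated them from the same data via the propensity model. A common remedy is to bootstrap the whole pipeline, refitting the propensity model and the outcome model on each resample. A sketch of the resampling skeleton, where `fit_ipw_effect` is a hypothetical stand-in for the propensity-plus-WLS steps above (here reduced to a weighted difference in means so the sketch runs on its own):

```python
import numpy as np
import pandas as pd

def fit_ipw_effect(d):
    # Stand-in for the full pipeline: in practice, refit the logit propensity
    # model on d, rebuild wgt, run the WLS, and return the 'bug' coefficient.
    w = np.where(d['bug'] == 1, 1.0, d['p'] / (1 - d['p']))
    m1 = np.average(d.loc[d['bug'] == 1, 'convert'])
    m0 = np.average(d.loc[d['bug'] == 0, 'convert'], weights=w[d['bug'] == 0])
    return m1 - m0

rng = np.random.default_rng(0)
toy = pd.DataFrame({'bug':     rng.integers(0, 2, 200),
                    'convert': rng.integers(0, 2, 200),
                    'p':       rng.uniform(0.2, 0.8, 200)})

# Resample rows with replacement and re-estimate the effect each time
boot = [fit_ipw_effect(toy.sample(frac=1, replace=True, random_state=i))
        for i in range(200)]
se = np.std(boot, ddof=1)   # bootstrap standard error of the effect estimate
```

The bootstrap SE will typically be somewhat wider than the WLS table's, since it also reflects uncertainty in the estimated propensity scores.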