Your website has performance bugs. You suspect they hurt conversion rates, but you're not sure how badly. Fixing the bugs would be expensive, so you decide to invest in a fix only if encountering a bug causes a relatively large decline in conversions.
It turns out that bugs are more common (i) on mobile devices and (ii) among users who use the site with high "intensity" (e.g., many clicks and scrolls). Note that, all else equal, high-intensity users are both more likely to convert than low-intensity users and more likely to encounter a bug.
An A/B test would give us the most precise estimate of the impact of encountering a bug, but we don't want to force a bad experience on our users. So how do we estimate the impact of encountering a bug on conversion rates?
import pandas as pd
import numpy as np
import statsmodels.formula.api as smf
import matplotlib.pyplot as plt
from statsmodels.stats.weightstats import ttest_ind

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def weighted_mean(df, x):
    return (df[x] * df['wgt']).sum() / df['wgt'].sum()
N = 100000
np.random.seed(42)
df = pd.DataFrame()
# Most users (70%) are on mobile
df['randint_mobile'] = np.random.uniform(0,1,N)
df['mobile'] = np.where(df['randint_mobile']<.7,1,0)
df.drop(columns=['randint_mobile'],inplace=True)
df['intensity'] = np.random.normal(0,1,N)
# Probability of the bad treatment (encountering a bug) is higher for mobile users and for intense site users
df['rand_bug'] = np.random.normal(0,1,N)
df['p_bug'] = sigmoid(0.2*df['rand_bug']+0.4*df['mobile']+0.4*df['intensity']-0.5)
df['bug'] = np.where(df['p_bug']>0.5,1,0)
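One modeling note: the line above thresholds `p_bug` at 0.5, so bug assignment is deterministic given the covariates and the `rand_bug` noise. An alternative (and arguably more natural) simulation treats `p_bug` as a true probability and draws the bug as a Bernoulli trial, so a user with `p_bug = 0.4` still hits the bug 40% of the time. A minimal sketch on toy probabilities:

```python
import numpy as np

rng = np.random.default_rng(0)
p_bug = np.array([0.2, 0.5, 0.9])   # toy stand-ins for the sigmoid output above

# Bernoulli draw: compare each probability against an independent uniform,
# instead of a hard cutoff at 0.5
bug = (rng.uniform(size=p_bug.shape) < p_bug).astype(int)
```

Either choice works for this simulation; the thresholded version just concentrates all of the randomness in `rand_bug`.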
# Conversion probability is influenced by the bug, mobile, and intensity of site use
# Set: running into a bug reduces the probability of converting by 30 percentage points
df['randnum_convert'] = np.random.uniform(0,1,size=N)
df['randnum_convert'] = df['randnum_convert'] + 0.1*df['intensity'] - 0.3*df['bug'] - 0.1*df['mobile'] + 0.2*df['mobile']*df['intensity']
df['convert'] = np.where(df['randnum_convert']>=0.5,1,0)
print(df.groupby('mobile')['bug'].agg('mean').round(3))

mobile
0    0.129
1    0.413
Name: bug, dtype: float64
print(df.groupby(np.where(df['intensity']>df['intensity'].mean(),
      'High Intensity','Low Intensity'))['bug'].agg('mean').round(3))

High Intensity    0.604
Low Intensity     0.053
Name: bug, dtype: float64
print(df.groupby('bug')['convert'].agg('mean').round(3))

bug
0    0.336
1    0.352
Name: convert, dtype: float64
print(df.groupby([np.where(df['intensity']>df['intensity'].mean(),
      'High Intensity','Low Intensity'),'bug'])['convert'].agg('mean').round(3))

                bug
High Intensity  0      0.533
                1      0.379
Low Intensity   0      0.255
                1      0.043
Name: convert, dtype: float64
# A simple comparison of conversion rates by whether users ran into the bug
# is misleading due to the relationship with intensity of site use
formula = 'convert~bug'
model = smf.ols(formula,data=df).fit()
print(model.summary())
                            OLS Regression Results
==============================================================================
Dep. Variable:                convert   R-squared:                       0.000
Model:                            OLS   Adj. R-squared:                  0.000
Method:                 Least Squares   F-statistic:                     23.29
Date:                Sun, 07 Jan 2024   Prob (F-statistic):           1.40e-06
Time:                        18:00:45   Log-Likelihood:                -67276.
No. Observations:              100000   AIC:                         1.346e+05
Df Residuals:                   99998   BIC:                         1.346e+05
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.3365      0.002    183.990      0.000       0.333       0.340
bug            0.0154      0.003      4.826      0.000       0.009       0.022
==============================================================================
Omnibus:                   539009.364   Durbin-Watson:                   1.990
Prob(Omnibus):                  0.000   Jarque-Bera (JB):            17482.937
Skew:                           0.668   Prob(JB):                         0.00
Kurtosis:                       1.447   Cond. No.                         2.41
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
# Add linear controls for mobile and intensity:
# now shows a large negative effect of the bug, but smaller than the true effect
formula = 'convert~bug+mobile+intensity'
model = smf.ols(formula,data=df).fit()
print(model.summary())
                            OLS Regression Results
==============================================================================
Dep. Variable:                convert   R-squared:                       0.130
Model:                            OLS   Adj. R-squared:                  0.130
Method:                 Least Squares   F-statistic:                     4970.
Date:                Sun, 07 Jan 2024   Prob (F-statistic):               0.00
Time:                        18:00:45   Log-Likelihood:                -60338.
No. Observations:              100000   AIC:                         1.207e+05
Df Residuals:                   99996   BIC:                         1.207e+05
Df Model:                           3
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.4899      0.003    187.429      0.000       0.485       0.495
bug           -0.2291      0.004    -55.058      0.000      -0.237      -0.221
mobile        -0.1052      0.003    -32.112      0.000      -0.112      -0.099
intensity      0.2003      0.002    106.497      0.000       0.197       0.204
==============================================================================
Omnibus:                   122772.561   Durbin-Watson:                   1.990
Prob(Omnibus):                   0.000   Jarque-Bera (JB):            10292.067
Skew:                            0.490   Prob(JB):                         0.00
Kurtosis:                        1.771   Cond. No.                         4.35
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
# Correct specification (including the mobile-intensity interaction):
# a larger negative impact from the bug, but still a bit under the true effect
formula = 'convert~bug+mobile+intensity+mobile*intensity'
model = smf.ols(formula,data=df).fit()
print(model.summary())
                            OLS Regression Results
==============================================================================
Dep. Variable:                convert   R-squared:                       0.155
Model:                            OLS   Adj. R-squared:                  0.155
Method:                 Least Squares   F-statistic:                     4573.
Date:                Sun, 07 Jan 2024   Prob (F-statistic):               0.00
Time:                        18:00:45   Log-Likelihood:                -58889.
No. Observations:              100000   AIC:                         1.178e+05
Df Residuals:                   99995   BIC:                         1.178e+05
Df Model:                           4
Covariance Type:            nonrobust
====================================================================================
                       coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------------
Intercept            0.4965      0.003    192.520      0.000       0.491       0.502
bug                 -0.2777      0.004    -66.134      0.000      -0.286      -0.269
mobile              -0.0920      0.003    -28.399      0.000      -0.098      -0.086
intensity            0.0969      0.003     36.420      0.000       0.092       0.102
mobile:intensity     0.1678      0.003     54.232      0.000       0.162       0.174
==============================================================================
Omnibus:                   839182.596   Durbin-Watson:                   1.992
Prob(Omnibus):                   0.000   Jarque-Bera (JB):            10382.523
Skew:                            0.433   Prob(JB):                         0.00
Kurtosis:                        1.681   Cond. No.                         4.49
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
propensity_formula = 'bug~mobile+intensity'
propensity_model = smf.logit(propensity_formula,data=df).fit()
print(propensity_model.summary())
Optimization terminated successfully.
         Current function value: 0.269693
         Iterations 8
                           Logit Regression Results
==============================================================================
Dep. Variable:                    bug   No. Observations:               100000
Model:                          Logit   Df Residuals:                    99997
Method:                           MLE   Df Model:                            2
Date:                Sun, 07 Jan 2024   Pseudo R-squ.:                  0.5737
Time:                        18:00:45   Log-Likelihood:                -26969.
converged:                       True   LL-Null:                       -63260.
Covariance Type:            nonrobust   LLR p-value:                     0.000
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept     -4.5148      0.036   -126.395      0.000      -4.585      -4.445
mobile         3.6253      0.035    104.946      0.000       3.558       3.693
intensity      3.5771      0.025    143.721      0.000       3.528       3.626
==============================================================================
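Before trusting the propensity scores, it's worth verifying common support: both groups should span overlapping ranges of the score, otherwise some users have no comparable counterparts to reweight. A minimal sketch of the check, with `p_hat` and `bug` as hypothetical stand-ins for `propensity_model.predict(df)` and `df['bug']`:

```python
import numpy as np
import pandas as pd

# Toy propensity scores and treatment flags (stand-ins for the real model output)
p_hat = np.array([0.05, 0.20, 0.40, 0.60, 0.80, 0.95])
bug   = np.array([0,    0,    1,    0,    1,    1])

# Common-support check: compare the score range within each treatment group
support = (pd.DataFrame({'p': p_hat, 'bug': bug})
             .groupby('bug')['p']
             .agg(['min', 'max']))
print(support)
```

In practice a histogram of `p_hat` split by `bug` (e.g. with `plt.hist`) shows the same thing visually.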
ipw = df.copy()
ipw['p'] = propensity_model.predict(ipw)
ipw['wgt'] = np.where(ipw['bug']==1,1,ipw['p']/(1-ipw['p']))
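These are ATT-style weights: treated users keep weight 1, and each untreated user gets the odds p/(1-p), which upweights controls who "look like" the treated group. A quick numeric check of the formula on toy values:

```python
import numpy as np

p   = np.array([0.8, 0.5, 0.2])   # toy propensity scores for three untreated users
bug = np.array([0, 0, 0])

# Same weighting rule as in the text: treated -> 1, untreated -> odds p/(1-p)
wgt = np.where(bug == 1, 1.0, p / (1 - p))
# A control with p = 0.8 counts roughly 4x: users like them are common among the treated;
# a control with p = 0.2 counts 0.25x.
```

Note the weights blow up as p approaches 1, which is why the common-support check matters.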
metrics = ['mobile','intensity','convert']
wm = lambda x: np.average(x, weights=ipw.loc[x.index, "wgt"])

unweighted_rows, weighted_rows = [], []
for metric in metrics:
    treated = ipw[ipw['bug']==1][metric]
    untreated = ipw[ipw['bug']==0][metric]
    # unweighted comparison of means
    t, p, degf = ttest_ind(treated, untreated)
    unweighted_rows.append({'metric': metric, 'bug_mean': treated.mean(),
                            'no_bug_mean': untreated.mean(), 'p-value': p})
    # weighted comparison of means using the IPW weights
    wgt_t = ipw[ipw['bug']==1]["wgt"]
    wgt_u = ipw[ipw['bug']==0]["wgt"]
    tw, pw, degfw = ttest_ind(treated, untreated, weights=(wgt_t, wgt_u))
    weighted_rows.append({'metric': metric, 'bug_mean': treated.agg(wm),
                          'no_bug_mean': untreated.agg(wm), 'p-value': pw})
unweighted_stats = pd.DataFrame(unweighted_rows)
weighted_stats = pd.DataFrame(weighted_rows)
unweighted_stats['pct_diff'] = 100*(unweighted_stats['bug_mean']-unweighted_stats['no_bug_mean'])/unweighted_stats['no_bug_mean']
weighted_stats['pct_diff'] = 100*(weighted_stats['bug_mean']-weighted_stats['no_bug_mean'])/weighted_stats['no_bug_mean']
print("Unweighted Differences in Means: \n", unweighted_stats.round(3),'\n')
print("Weighted Differences in Means: \n", weighted_stats.round(3))
Unweighted Differences in Means: 
       metric  bug_mean  no_bug_mean  p-value  pct_diff
0      mobile     0.882        0.613    0.000    43.996
1   intensity     0.919       -0.444    0.000  -306.880
2     convert     0.352        0.336    0.000     4.581 

Weighted Differences in Means: 
       metric  bug_mean  no_bug_mean  p-value  pct_diff
0      mobile     0.882        0.881    0.521     0.190
1   intensity     0.919        0.758    0.000    21.169
2     convert     0.352        0.614    0.000   -42.726
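The weighted means above are one balance check; a scale-free alternative is the standardized mean difference (SMD), where values below roughly 0.1 are conventionally treated as balanced. A sketch on toy data (the real check would pass in `ipw['intensity']` split by `bug`, with `ipw['wgt']` as the control weights):

```python
import numpy as np

def smd(x_t, x_c, w_c=None):
    """Standardized mean difference between treated and (optionally weighted) control."""
    if w_c is None:
        w_c = np.ones_like(x_c)
    m_t = x_t.mean()
    m_c = np.average(x_c, weights=w_c)
    # Pool the unweighted group variances for the denominator (one common convention)
    s = np.sqrt((x_t.var(ddof=1) + x_c.var(ddof=1)) / 2)
    return (m_t - m_c) / s

rng = np.random.default_rng(0)
x_t = rng.normal(1.0, 1.0, 500)   # treated lean high on the covariate
x_c = rng.normal(0.0, 1.0, 500)
w_c = np.exp(x_c)                 # upweight high-covariate controls, mimicking IPW

smd_raw = smd(x_t, x_c)           # large: groups differ before weighting
smd_wtd = smd(x_t, x_c, w_c)      # much smaller after reweighting
```

Like the weighted t-tests, this makes it easy to see that weighting closed most, but not all, of the intensity gap.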
# The IPW-weighted regression captures the negative effect of the bug even without additional controls
ipw_formula = 'convert~bug'
ipw_model = smf.wls(ipw_formula,data=ipw,weights=ipw['wgt']).fit()
print(ipw_model.summary())
                            WLS Regression Results
==============================================================================
Dep. Variable:                convert   R-squared:                       0.069
Model:                            WLS   Adj. R-squared:                  0.069
Method:                 Least Squares   F-statistic:                     7395.
Date:                Sun, 07 Jan 2024   Prob (F-statistic):               0.00
Time:                        18:00:45   Log-Likelihood:            -1.7546e+05
No. Observations:              100000   AIC:                         3.509e+05
Df Residuals:                   99998   BIC:                         3.509e+05
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.6144      0.002    276.803      0.000       0.610       0.619
bug           -0.2625      0.003    -85.991      0.000      -0.269      -0.257
==============================================================================
Omnibus:                    19933.430   Durbin-Watson:                   1.983
Prob(Omnibus):                  0.000   Jarque-Bera (JB):           413130.355
Skew:                           0.415   Prob(JB):                         0.00
Kurtosis:                      12.923   Cond. No.                         2.69
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
# Adding controls moves the estimate close to the true effect while improving model fit
ipw_formula = 'convert~bug+mobile+intensity'
ipw_model = smf.wls(ipw_formula,data=ipw,weights=ipw['wgt']).fit()
print(ipw_model.summary())
                            WLS Regression Results
==============================================================================
Dep. Variable:                convert   R-squared:                       0.184
Model:                            WLS   Adj. R-squared:                  0.184
Method:                 Least Squares   F-statistic:                     7497.
Date:                Sun, 07 Jan 2024   Prob (F-statistic):               0.00
Time:                        18:00:45   Log-Likelihood:            -1.6888e+05
No. Observations:              100000   AIC:                         3.378e+05
Df Residuals:                   99996   BIC:                         3.378e+05
Df Model:                           3
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.2392      0.005     44.226      0.000       0.229       0.250
bug           -0.3080      0.003   -106.793      0.000      -0.314      -0.302
mobile         0.1837      0.005     39.355      0.000       0.175       0.193
intensity      0.2816      0.002    118.557      0.000       0.277       0.286
==============================================================================
Omnibus:                    33619.737   Durbin-Watson:                   2.001
Prob(Omnibus):                  0.000   Jarque-Bera (JB):          1153905.222
Skew:                          -0.968   Prob(JB):                         0.00
Kurtosis:                      19.529   Cond. No.                         8.40
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
# Adding the interaction term leaves the (correct) estimate unchanged while slightly improving fit
ipw_formula = 'convert~bug+mobile+intensity+mobile*intensity'
ipw_model = smf.wls(ipw_formula,data=ipw,weights=ipw['wgt']).fit()
print(ipw_model.summary())
                            WLS Regression Results
==============================================================================
Dep. Variable:                convert   R-squared:                       0.189
Model:                            WLS   Adj. R-squared:                  0.189
Method:                 Least Squares   F-statistic:                     5841.
Date:                Sun, 07 Jan 2024   Prob (F-statistic):               0.00
Time:                        18:00:45   Log-Likelihood:            -1.6853e+05
No. Observations:              100000   AIC:                         3.371e+05
Df Residuals:                   99995   BIC:                         3.371e+05
Df Model:                           4
Covariance Type:            nonrobust
====================================================================================
                       coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------------
Intercept            0.5087      0.011     44.463      0.000       0.486       0.531
bug                 -0.3080      0.003   -107.155      0.000      -0.314      -0.302
mobile              -0.1015      0.012     -8.715      0.000      -0.124      -0.079
intensity            0.0896      0.008     11.829      0.000       0.075       0.104
mobile:intensity     0.2125      0.008     26.704      0.000       0.197       0.228
==============================================================================
Omnibus:                    33355.725   Durbin-Watson:                   2.001
Prob(Omnibus):                  0.000   Jarque-Bera (JB):          1232437.052
Skew:                          -0.933   Prob(JB):                         0.00
Kurtosis:                      20.097   Cond. No.                         25.5
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
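One caveat on these WLS standard errors: they treat the weights as fixed, but we estimated them from the same data via the propensity model. A common remedy is to bootstrap the whole pipeline, refitting the propensity model and the outcome model on each resample. A sketch of the resampling skeleton, where `fit_ipw_effect` is a hypothetical stand-in for the propensity-plus-WLS steps above (here reduced to a weighted difference in means so the sketch runs on its own):

```python
import numpy as np
import pandas as pd

def fit_ipw_effect(d):
    # Stand-in for the full pipeline: in practice, refit the logit propensity
    # model on d, rebuild wgt, run the WLS, and return the 'bug' coefficient.
    w = np.where(d['bug'] == 1, 1.0, d['p'] / (1 - d['p']))
    m1 = np.average(d.loc[d['bug'] == 1, 'convert'])
    m0 = np.average(d.loc[d['bug'] == 0, 'convert'], weights=w[d['bug'] == 0])
    return m1 - m0

rng = np.random.default_rng(0)
toy = pd.DataFrame({'bug':     rng.integers(0, 2, 200),
                    'convert': rng.integers(0, 2, 200),
                    'p':       rng.uniform(0.2, 0.8, 200)})

# Resample rows with replacement and re-estimate the effect each time
boot = [fit_ipw_effect(toy.sample(frac=1, replace=True, random_state=i))
        for i in range(200)]
se = np.std(boot, ddof=1)   # bootstrap standard error of the effect estimate
```

The bootstrap SE will typically be somewhat wider than the WLS table's, since it also reflects uncertainty in the estimated propensity scores.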