Distributed Uplift Random Forest (Uplift DRF) is a classification tool for modeling uplift: the incremental impact of a treatment. It is useful, for example, in marketing or medicine, and the approach is inspired by A/B testing.
To model uplift, the data must be collected in a specific way. Before the experiment, the subjects are usually divided into two groups: a treatment group, which receives the treatment (for example, an offer), and a control group, which does not.
Then the data are prepared, and the analyst can gather information about the response - for example, whether customers bought a product, whether patients recovered from a disease, or similar.
There are several approaches to modeling uplift. In a tree-based algorithm, every tree takes the treatment/control group assignment and the response information directly into the decision about splitting a node. The uplift score is the splitting criterion, analogous to the Gini coefficient in a standard decision tree.
Uplift metric to decide the best split
The goal is to maximize the difference between the class distributions in the treatment and control sets, so the splitting criteria are based on distribution divergences. The divergence is selected by the uplift_metric parameter. In H2O-3, three uplift_metric types are supported:
uplift_metric="KL"
) - uses logarithms to calculate divergence, asymmetric, widely used, tends to infinity values (if treatment or control group distributions contain zero values).$ KL(P, Q) = \sum_{i=0}^{N} p_i \log{\frac{p_i}{q_i}}$
uplift_metric="euclidean"
) - symmetric and stable distribution, does not tend to infinity values.$ E(P, Q) = \sum_{i=0}^{N} (p_i-q_i)^2$
uplift_metric="chi_squared"
) - Euclidean divergence normalized by control group distribution. Asymmetric and also tends to infinity values (if control group distribution contains zero values).$X^2(P, Q) = \sum_{i=0}^{N} \frac{(p_i-q_i)^2}{q_i}$
where:
$P$ is treatment group distribution
$Q$ is control group distribution
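The three divergences above can be sketched in a few lines of NumPy. This is an illustrative standalone implementation, not the H2O-3 code; the function names are made up for this example:

```python
import numpy as np

def kl_divergence(p, q):
    # Kullback-Leibler divergence; tends to infinity when q contains zeros
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return np.sum(p * np.log(p / q))

def euclidean_divergence(p, q):
    # symmetric and numerically stable; no division, no logarithm
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return np.sum((p - q) ** 2)

def chi_squared_divergence(p, q):
    # Euclidean divergence normalized by the control distribution q
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return np.sum((p - q) ** 2 / q)

p = [0.7, 0.3]  # treatment group class distribution
q = [0.5, 0.5]  # control group class distribution
print(euclidean_divergence(p, q))  # 0.08
```

Note how chi-squared and KL blow up as soon as a control-group bin approaches zero, which is exactly why the Euclidean metric is described as the stable choice.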
In a tree node, the resulting value for a split is the sum $metric(P, Q) + metric(1-P, 1-Q)$.
For the split gain value, the result within the node is normalized using the Gini coefficient (Euclidean or chi-squared) or entropy (KL) of each distribution before and after the split.
The uplift score in each leaf is calculated as:
$uplift\_score = \frac{TY1}{T} - \frac{CY1}{C}$
where:
$T$ is the number of observations in the treatment group (treatment_column label == 1)
$C$ is the number of observations in the control group (treatment_column label == 0)
$TY1$ is the number of treatment group observations that responded (treatment_column label == 1 and response_column label == 1)
$CY1$ is the number of control group observations that responded (treatment_column label == 0 and response_column label == 1)
Note: A higher uplift score means that more observations from the treatment group respond to the offer than from the control group, i.e. the offered treatment has a positive effect. The uplift score can be negative if more observations from the control group respond without the treatment.
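The leaf uplift score is just the difference between the treatment and control response rates, which a tiny helper makes concrete (an illustrative function, not part of H2O-3):

```python
def leaf_uplift_score(TY1, T, CY1, C):
    # uplift = treatment response rate minus control response rate in the leaf
    return TY1 / T - CY1 / C

# 40 of 100 treated observations responded vs. 25 of 100 controls
print(leaf_uplift_score(40, 100, 25, 100))  # 0.15 -> positive treatment effect
```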
The H2O-3 implementation of Uplift DRF is based on DRF, because the training principle is similar. It is a tree-based uplift algorithm: Uplift DRF generates a forest of classification uplift trees rather than a single classification tree. Each of these trees is a weak learner built on a subset of rows and columns; more trees reduce the variance. The final classification is the average prediction over all trees.
Currently, H2O-3 supports only binomial trees, along with the uplift curve metric, the Area Under Uplift Curve (AUUC) metric, normalized AUUC, and the Qini value. Support for regression trees and more metrics (for example, the Qini coefficient) is planned.
import h2o
from h2o.estimators.uplift_random_forest import H2OUpliftRandomForestEstimator
import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.style as style
import pandas as pd
To demonstrate how Uplift DRF works, the Criteo dataset is used.
Source:
Diemert Eustache, Betlei Artem, Renaudin Christophe, and Massih-Reza Amini, "A Large Scale Benchmark for Uplift Modeling", Proceedings of the AdKDD and TargetAd Workshop, KDD, London, United Kingdom, August 20, 2018, ACM, https://ailab.criteo.com/criteo-uplift-prediction-dataset/.
Description:
Detailed description of the columns:
control_name = "control"
treatment_column = "treatment"
response_column = "visit"
feature_cols = ["f"+str(x) for x in range(0,12)]
df = pd.read_csv("/home/0xdiag/bigdata/server/criteo/criteo-uplift-v2.1.csv")
df.head()
f0 | f1 | f2 | f3 | f4 | f5 | f6 | f7 | f8 | f9 | f10 | f11 | treatment | conversion | visit | exposure | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 12.616365 | 10.059654 | 8.976429 | 4.679882 | 10.280525 | 4.115453 | 0.294443 | 4.833815 | 3.955396 | 13.190056 | 5.300375 | -0.168679 | 1 | 0 | 0 | 0 |
1 | 12.616365 | 10.059654 | 9.002689 | 4.679882 | 10.280525 | 4.115453 | 0.294443 | 4.833815 | 3.955396 | 13.190056 | 5.300375 | -0.168679 | 1 | 0 | 0 | 0 |
2 | 12.616365 | 10.059654 | 8.964775 | 4.679882 | 10.280525 | 4.115453 | 0.294443 | 4.833815 | 3.955396 | 13.190056 | 5.300375 | -0.168679 | 1 | 0 | 0 | 0 |
3 | 12.616365 | 10.059654 | 9.002801 | 4.679882 | 10.280525 | 4.115453 | 0.294443 | 4.833815 | 3.955396 | 13.190056 | 5.300375 | -0.168679 | 1 | 0 | 0 | 0 |
4 | 12.616365 | 10.059654 | 9.037999 | 4.679882 | 10.280525 | 4.115453 | 0.294443 | 4.833815 | 3.955396 | 13.190056 | 5.300375 | -0.168679 | 1 | 0 | 0 | 0 |
Inspiration from: https://www.kaggle.com/code/hughhuyton/criteo-uplift-modelling/notebook
To model uplift, the treatment and control group data must have similar distributions. In the real world, the control group is usually smaller than the treatment group. This is also the case for the Criteo dataset, so we have to rebalance the data so the groups have a similar size.
print('Total number of samples: {}'.format(len(df)))
print('The dataset is largely imbalanced: ')
print(df['treatment'].value_counts(normalize = True))
print('Percentage of users that visit: {}%'.format(100*round(df['visit'].mean(),4)))
print('Percentage of users that convert: {}%'.format(100*round(df['conversion'].mean(),4)))
print('Percentage of visitors that convert: {}%'.format(100*round(df[df["visit"]==1]["conversion"].mean(),4)))
Total number of samples: 13979592 The dataset is largely imbalanced: 1 0.85 0 0.15 Name: treatment, dtype: float64 Percentage of users that visit: 4.7% Percentage of users that convert: 0.29% Percentage of visitors that convert: 6.21%
# Print proportion of a binary column
# https://www.kaggle.com/code/hughhuyton/criteo-uplift-modelling/notebook
def print_proportion(df, column):
    fig = plt.figure(figsize=(10, 6))
    target_count = df[column].value_counts()
    print('Class 0:', target_count[0])
    print('Class 1:', target_count[1])
    print('Proportion:', int(round(target_count[1] / target_count[0])), ': 1')
    target_count.plot(kind='bar', title='Treatment Class Distribution', color=['#2077B4', '#FF7F0E'], fontsize=15)
    plt.xticks(rotation=0)
print_proportion(df, treatment_column)
Class 0: 2096937 Class 1: 11882655 Proportion: 6 : 1
from sklearn.model_selection import train_test_split
train_df, test_df = train_test_split(df, test_size=0.2, random_state=42, stratify=df['treatment'])
print(train_df.shape)
print(test_df.shape)
(11183673, 16) (2795919, 16)
del(df)
print_proportion(train_df, treatment_column)
Class 0: 1677550 Class 1: 9506123 Proportion: 6 : 1
# Random Undersampling (finding the majority class and undersampling it)
# https://www.kaggle.com/code/hughhuyton/criteo-uplift-modelling/notebook
def random_under(df, feature):
    target = df[feature].value_counts()
    # identify the majority class label
    if target.values[0] < target.values[1]:
        under = target.index.values[1]
    else:
        under = target.index.values[0]
    df_0 = df[df[feature] != under]  # minority class rows
    df_1 = df[df[feature] == under]  # majority class rows
    # sample the majority class down to the minority class size
    df_treatment_under = df_1.sample(len(df_0))
    df_1 = pd.concat([df_treatment_under, df_0], axis=0)
    return df_1
train_df = random_under(train_df, treatment_column)
print_proportion(train_df, treatment_column)
Class 0: 1677550 Class 1: 1677550 Proportion: 1 : 1
# Method to transform data for the LGWUM method, which is explained later
def target_class_lgwum(df, treatment, target, column_name):
    # CN: control non-responders
    df[column_name] = 0
    # CR: control responders
    df.loc[(df[treatment] == 0) & (df[target] != 0), column_name] = 1
    # TN: treatment non-responders
    df.loc[(df[treatment] != 0) & (df[target] == 0), column_name] = 2
    # TR: treatment responders
    df.loc[(df[treatment] != 0) & (df[target] != 0), column_name] = 3
    return df
response_column_lgwum = "lqwum_response"
train_df = target_class_lgwum(train_df, treatment_column, response_column, response_column_lgwum)
test_df = target_class_lgwum(test_df, treatment_column, response_column, response_column_lgwum)
h2o.init(strict_version_check=False) # max_mem_size=10
Checking whether there is an H2O instance running at http://localhost:54321 ..... not found. Attempting to start a local H2O server... Java Version: openjdk version "1.8.0_342"; OpenJDK Runtime Environment (build 1.8.0_342-8u342-b07-0ubuntu1~22.04-b07); OpenJDK 64-Bit Server VM (build 25.342-b07, mixed mode) Starting server from /home/kurkami/git/h2o/h2o-3/build/h2o.jar Ice root: /tmp/tmp8edjah_q JVM stdout: /tmp/tmp8edjah_q/h2o_kurkami_started_from_python.out JVM stderr: /tmp/tmp8edjah_q/h2o_kurkami_started_from_python.err Server is running at http://127.0.0.1:54321 Connecting to H2O server at http://127.0.0.1:54321 ... successful. Warning: Version mismatch. H2O is version 3.38.0.99999, but the h2o-python package is version 3.37.0.99999. This is a developer build, please contact your developer.
H2O_cluster_uptime: | 01 secs |
H2O_cluster_timezone: | America/New_York |
H2O_data_parsing_timezone: | UTC |
H2O_cluster_version: | 3.38.0.99999 |
H2O_cluster_version_age: | 1 hour and 17 minutes |
H2O_cluster_name: | H2O_from_python_kurkami_q2bo8y |
H2O_cluster_total_nodes: | 1 |
H2O_cluster_free_memory: | 8.88 Gb |
H2O_cluster_total_cores: | 12 |
H2O_cluster_allowed_cores: | 12 |
H2O_cluster_status: | locked, healthy |
H2O_connection_url: | http://127.0.0.1:54321 |
H2O_connection_proxy: | {"http": null, "https": null} |
H2O_internal_security: | False |
Python_version: | 3.10.4 final |
h2o_train_df = h2o.H2OFrame(train_df)
del(train_df)
h2o_train_df[treatment_column] = h2o_train_df[treatment_column].asfactor()
h2o_train_df[response_column] = h2o_train_df[response_column].asfactor()
h2o_train_df[response_column_lgwum] = h2o_train_df[response_column_lgwum].asfactor()
h2o_train_df = h2o.assign(h2o_train_df, "train_df")
h2o_test_df = h2o.H2OFrame(test_df)
del(test_df)
h2o_test_df[treatment_column] = h2o_test_df[treatment_column].asfactor()
h2o_test_df[response_column] = h2o_test_df[response_column].asfactor()
h2o_test_df[response_column_lgwum] = h2o_test_df[response_column_lgwum].asfactor()
h2o_test_df = h2o.assign(h2o_test_df, "test_df")
Parse progress: |████████████████████████████████████████████████████████████████| (done) 100% Parse progress: |████████████████████████████████████████████████████████████████| (done) 100%
ntree = 20
max_depth = 15
metric="Euclidean"
h2o_uplift_model = H2OUpliftRandomForestEstimator(
ntrees=ntree,
max_depth=max_depth,
min_rows=30,
nbins=1000,
sample_rate=0.80,
score_each_iteration=True,
treatment_column=treatment_column,
uplift_metric=metric,
auuc_nbins=1000,
auuc_type="gain",
seed=42)
h2o_uplift_model.train(y=response_column, x=feature_cols, training_frame=h2o_train_df)
h2o_uplift_model
upliftdrf Model Build progress: |████████████████████████████████████████████████| (done) 100%
H2OUpliftRandomForestEstimator : Uplift Distributed Random Forest Model Key: UpliftDRF_model_python_1665008325837_1
number_of_trees | number_of_internal_trees | model_size_in_bytes | min_depth | max_depth | mean_depth | min_leaves | max_leaves | mean_leaves | |
---|---|---|---|---|---|---|---|---|---|
20.0 | 40.0 | 175049.0 | 15.0 | 15.0 | 15.0 | 293.0 | 394.0 | 344.5 |
[tips] Use `model.show()` for more details. Use `model.explain()` to inspect the model. -- Use `h2o.display.toggle_user_tips()` to switch on/off this section.
# Plot uplift score
# source https://www.kaggle.com/code/hughhuyton/criteo-uplift-modelling/notebook
def plot_uplift_score(uplift_score):
    plt.figure(figsize=(10, 6))
    plt.xlim(-.05, .1)
    plt.hist(uplift_score, bins=1000, color=['#2077B4'])
    plt.xlabel('Uplift score')
    plt.ylabel('Number of observations in validation set')
h2o_uplift_pred = h2o_uplift_model.predict(h2o_test_df)
h2o_uplift_pred
upliftdrf prediction progress: |█████████████████████████████████████████████████| (done) 100%
uplift_predict | p_y1_ct1 | p_y1_ct0 |
---|---|---|
0.000505327 | 0.00145941 | 0.000954085 |
0.000505327 | 0.00145941 | 0.000954085 |
0.000505327 | 0.00145941 | 0.000954085 |
0.0328249 | 0.071748 | 0.0389231 |
0.000505327 | 0.00145941 | 0.000954085 |
0.000743987 | 0.00180947 | 0.00106548 |
0.000505327 | 0.00145941 | 0.000954085 |
0.00146267 | 0.0345459 | 0.0330832 |
0.000505327 | 0.00145941 | 0.000954085 |
0.000505327 | 0.00145941 | 0.000954085 |
[2795919 rows x 3 columns]
plot_uplift_score(h2o_uplift_pred['uplift_predict'].as_data_frame().uplift_predict)
perf_h2o = h2o_uplift_model.model_performance(h2o_test_df)
To calculate AUUC on big data, the predictions are binned into histograms; because of this, the results may differ from the exact computation.
To compute AUUC, the binned predictions are sorted from the largest to the smallest value. For every group, the cumulative sums of observation statistics are calculated, and the uplift is defined based on these statistics.
AUUC type | Formula |
---|---|
Qini | $TY1 - CY1 * \frac{T}{C}$ |
Lift | $\frac{TY1}{T} - \frac{CY1}{C}$ |
Gain | $(\frac{TY1}{T} - \frac{CY1}{C}) * (T + C)$ |
Where:
$T$ is the cumulative number of observations in the treatment group
$C$ is the cumulative number of observations in the control group
$TY1$ is the cumulative number of treatment group observations with response label == 1
$CY1$ is the cumulative number of control group observations with response label == 1
The resulting AUUC value is the area under the uplift curve defined by these statistics.
More information about normalization is in the Normalized AUUC section.
For some observation groups, the results can be NaN. In this case, the NaN values are linearly interpolated to calculate the AUUC and plot the uplift curve.
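The cumulative-statistics idea can be sketched with NumPy. This is a simplified standalone illustration of the "gain" formula from the table above, computed per observation rather than per histogram bin as H2O-3 does; the synthetic data and function name are made up for this example:

```python
import numpy as np

def gain_curve(uplift_pred, response, treatment):
    # sort observations by predicted uplift, from the largest to the smallest value
    order = np.argsort(-np.asarray(uplift_pred))
    y = np.asarray(response)[order]
    t = np.asarray(treatment)[order]
    # cumulative statistics after each observation
    T = np.cumsum(t)                # treated observations so far
    C = np.cumsum(1 - t)            # control observations so far
    TY1 = np.cumsum(y * t)          # treated responders so far
    CY1 = np.cumsum(y * (1 - t))    # control responders so far
    with np.errstate(divide="ignore", invalid="ignore"):
        # gain = (TY1/T - CY1/C) * (T + C); NaN where a group is still empty
        return (TY1 / T - CY1 / C) * (T + C)

rng = np.random.default_rng(42)
t = rng.integers(0, 2, 1000)                      # treatment assignment
y = rng.binomial(1, np.where(t == 1, 0.3, 0.1))   # treatment truly lifts response
score = rng.random(1000)                          # dummy uplift predictions
curve = gain_curve(score, y, t)
# a crude AUUC: the mean gain over the curve (H2O-3 bins the predictions first)
auuc = np.nanmean(curve)
```

The end of the curve always equals the overall gain uplift of the whole dataset, since by then the cumulative sums cover every observation.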
perf_h2o.auuc_table()
uplift_type | qini | lift | gain |
---|---|---|---|
AUUC value | 21585.4882157 | 0.0253586 | 25328.4067094 |
AUUC normalized | 0.8805330 | 0.0253586 | 0.8782349 |
AUUC random value | 15585.3752152 | 0.0065580 | 18335.7289893 |
To plot the uplift curve, the plot_uplift method can be used. Its metric parameter can be "qini", "gain", or "lift". The most popular is the Qini uplift curve, which is similar to the ROC curve; the Gain and Lift curves are also known from traditional binomial models.
Based on these curves, you can decide how many observations (for example, customers) from the test dataset to send an offer to in order to get the optimal gain.
perf_h2o.plot_uplift(metric="qini")
perf_h2o.plot_uplift(metric="gain")
perf_h2o.plot_uplift(metric="lift")
The Qini value is calculated as the difference between the Qini AUUC and the area under the random uplift curve (random AUUC). The random AUUC is computed as the diagonal from zero to the overall gain uplift.
The Qini value can be generalized to all AUUC metric types: the AECU for the Qini metric is the same as the Qini value, but the AECU can also be calculated for the Gain and Lift metric types. These values are stored in the aecu_table.
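The relationship between the two tables can be checked by hand. Using the qini row of the auuc_table above, subtracting the random AUUC from the actual AUUC reproduces the qini AECU value (the helper function below is illustrative, not an H2O-3 API):

```python
def qini_value(auuc, auuc_random):
    # Qini value (AECU) = actual AUUC minus the area under the random uplift curve
    return auuc - auuc_random

# plugging in the qini column of the auuc_table above
print(qini_value(21585.4882157, 15585.3752152))  # ~6000.113, the qini AECU value
```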
perf_h2o.aecu_table()
uplift_type | qini | lift | gain |
---|---|---|---|
AECU value | 6000.1130005 | 0.0188005 | 6992.6777202 |
To get the normalized AUUC, call the auuc_normalized method. The normalized AUUC is calculated from uplift values that are normalized by the uplift value at the maximal treated number of observations. For example, for uplift values [10, 20, 30], the normalized uplift is [1/3, 2/3, 1]. If that maximal value is negative, the normalization factor is its absolute value. The normalized AUUC can therefore still be negative or positive, and can fall outside the (0, 1) interval. The normalized AUUC for auuc_metric="lift" is not defined, so in this case the normalized AUUC equals the AUUC; likewise, plot_uplift with metric="lift" is the same for normalize=False and normalize=True.
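The normalization step described above fits in a few lines (a minimal sketch with an invented function name, mirroring the [10, 20, 30] example):

```python
def normalize_uplift(uplift_values):
    # divide by the uplift value at the maximal treated number of observations
    # (the last value); use its absolute value when that value is negative
    factor = abs(uplift_values[-1])
    return [u / factor for u in uplift_values]

print(normalize_uplift([10, 20, 30]))   # [0.333..., 0.666..., 1.0]
print(normalize_uplift([10, 20, -30]))  # negative factor -> values outside (0, 1)
```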
perf_h2o.plot_uplift(metric="gain", normalize=True)
perf_h2o.auuc_normalized()
0.8782349075050278
To speed up the calculation of AUUC, the predictions are binned into quantile histograms; the more bins, the more precise the AUUC. More trees usually produce more varied predictions, so the algorithm creates histograms with more bins and needs more iterations to get meaningful AUUC results. The scoring history table shows the number of bins as well as the resulting AUUC. There is also the Qini value, which reflects the number of bins and is therefore a better indicator of model improvement. In the scoring history table below, you can see the algorithm stabilized after building 6 trees, but how many trees are necessary depends on the data and model settings.
h2o_uplift_model.scoring_history()
timestamp | duration | number_of_trees | training_auuc_nbins | training_auuc | training_auuc_normalized | training_qini_value | ||
---|---|---|---|---|---|---|---|---|
0 | 2022-10-05 18:19:39 | 0.030 sec | 0.0 | 0 | NaN | NaN | NaN | |
1 | 2022-10-05 18:19:41 | 2.314 sec | 1.0 | 19 | 29533.038804 | 0.833748 | 128.764676 | |
2 | 2022-10-05 18:19:47 | 8.248 sec | 2.0 | 42 | 26565.114523 | 0.749961 | 657.003459 | |
3 | 2022-10-05 18:19:53 | 14.429 sec | 3.0 | 72 | 25346.491362 | 0.715558 | 1373.757472 | |
4 | 2022-10-05 18:20:00 | 21.083 sec | 4.0 | 112 | 25200.151904 | 0.711427 | 2198.512852 | |
5 | 2022-10-05 18:20:06 | 27.502 sec | 5.0 | 161 | 25482.470008 | 0.719397 | 2929.990639 | |
6 | 2022-10-05 18:20:13 | 34.462 sec | 6.0 | 228 | 25976.963749 | 0.733357 | 3551.194616 | |
7 | 2022-10-05 18:20:21 | 41.969 sec | 7.0 | 305 | 26589.208458 | 0.750641 | 4101.071634 | |
8 | 2022-10-05 18:20:29 | 50.484 sec | 8.0 | 388 | 27380.973123 | 0.772993 | 4660.496077 | |
9 | 2022-10-05 18:20:38 | 58.928 sec | 9.0 | 479 | 27672.156777 | 0.781214 | 4908.823402 | |
10 | 2022-10-05 18:20:47 | 1 min 8.312 sec | 10.0 | 561 | 28108.498783 | 0.793532 | 5200.407577 | |
11 | 2022-10-05 18:20:57 | 1 min 18.075 sec | 11.0 | 633 | 28493.104929 | 0.804390 | 5441.356926 | |
12 | 2022-10-05 18:21:07 | 1 min 28.003 sec | 12.0 | 714 | 28828.697460 | 0.813864 | 5641.677461 | |
13 | 2022-10-05 18:21:17 | 1 min 38.369 sec | 13.0 | 773 | 29066.787752 | 0.820586 | 5783.946316 | |
14 | 2022-10-05 18:21:28 | 1 min 48.844 sec | 14.0 | 829 | 29181.561007 | 0.823826 | 5855.454094 | |
15 | 2022-10-05 18:21:39 | 1 min 59.705 sec | 15.0 | 884 | 29331.734639 | 0.828065 | 5941.529970 | |
16 | 2022-10-05 18:21:50 | 2 min 10.810 sec | 16.0 | 923 | 29380.753171 | 0.829449 | 5973.314654 | |
17 | 2022-10-05 18:22:01 | 2 min 22.097 sec | 17.0 | 942 | 29485.860595 | 0.832417 | 6031.958517 | |
18 | 2022-10-05 18:22:12 | 2 min 33.476 sec | 18.0 | 954 | 29638.000589 | 0.836712 | 6112.091663 | |
19 | 2022-10-05 18:22:24 | 2 min 45.046 sec | 19.0 | 968 | 29690.000005 | 0.838180 | 6139.355404 | |
20 | 2022-10-05 18:22:36 | 2 min 57.092 sec | 20.0 | 978 | 29708.220236 | 0.838694 | 6150.467978 |
LGWUM (Kane et al., 2014) is one of several methods available for uplift modeling and uses an approach better known as Class Variable Transformation. LGWUM assumes that positive uplift lies in treating treatment-group responders (TR) and control-group non-responders (CN), while avoiding treatment-group non-responders (TN) and control-group responders (CR). This is expressed as:
$Uplift_{LGWUM} = \frac{P(TR)}{P(T)} + \frac{P(CN)}{P(C)} - \frac{P(TN)}{P(T)} - \frac{P(CR)}{P(C)}$
source: https://www.kaggle.com/code/hughhuyton/criteo-uplift-modelling/notebook
from h2o.estimators.gbm import H2OGradientBoostingEstimator
h2o_gbm_lgwum = H2OGradientBoostingEstimator(ntrees=ntree,
max_depth=max_depth,
min_rows=30,
nbins=1000,
score_each_iteration=False,
seed=42)
h2o_gbm_lgwum.train(y=response_column_lgwum, x=feature_cols, training_frame=h2o_train_df)
h2o_gbm_lgwum
gbm Model Build progress: |██████████████████████████████████████████████████████| (done) 100%
H2OGradientBoostingEstimator : Gradient Boosting Machine Model Key: GBM_model_python_1665008325837_24
number_of_trees | number_of_internal_trees | model_size_in_bytes | min_depth | max_depth | mean_depth | min_leaves | max_leaves | mean_leaves | |
---|---|---|---|---|---|---|---|---|---|
20.0 | 80.0 | 4011303.0 | 15.0 | 15.0 | 15.0 | 1341.0 | 6950.0 | 3990.975 |
[tips] Use `model.show()` for more details. Use `model.explain()` to inspect the model. -- Use `h2o.display.toggle_user_tips()` to switch on/off this section.
uplift_predict_lgwum = h2o_gbm_lgwum.predict(h2o_test_df)
result = uplift_predict_lgwum.as_data_frame()
result.columns = ['predict', 'p_cn', 'p_cr', 'p_tn', 'p_tr']
result['uplift_score'] = result.eval('\
p_cn/(p_cn + p_cr) \
+ p_tr/(p_tn + p_tr) \
- p_tn/(p_tn + p_tr) \
- p_cr/(p_cn + p_cr)')
result
gbm prediction progress: |███████████████████████████████████████████████████████| (done) 100%
predict | p_cn | p_cr | p_tn | p_tr | uplift_score | |
---|---|---|---|---|---|---|
0 | 0 | 0.463186 | 0.039068 | 0.458699 | 0.039047 | 0.001324 |
1 | 0 | 0.461404 | 0.039872 | 0.459065 | 0.039659 | -0.000040 |
2 | 0 | 0.461967 | 0.039286 | 0.459431 | 0.039315 | 0.000904 |
3 | 2 | 0.394716 | 0.058261 | 0.489573 | 0.057449 | -0.047193 |
4 | 0 | 0.463572 | 0.039038 | 0.458346 | 0.039044 | 0.001654 |
... | ... | ... | ... | ... | ... | ... |
2795914 | 2 | 0.272473 | 0.243491 | 0.280708 | 0.203327 | -0.103698 |
2795915 | 2 | 0.374056 | 0.083578 | 0.433372 | 0.108994 | 0.036659 |
2795916 | 0 | 0.463108 | 0.039059 | 0.458621 | 0.039212 | 0.001971 |
2795917 | 0 | 0.448633 | 0.056153 | 0.443269 | 0.051945 | -0.012692 |
2795918 | 0 | 0.463135 | 0.039064 | 0.458649 | 0.039152 | 0.001727 |
2795919 rows × 6 columns
plot_uplift_score(result.uplift_score)
lgwum_predict = h2o.H2OFrame(result['uplift_score'].tolist())
perf_lgwum = h2o.make_metrics(lgwum_predict, h2o_test_df[response_column], treatment=h2o_test_df[treatment_column], auuc_type="gain", auuc_nbins=81)
perf_lgwum
Parse progress: |████████████████████████████████████████████████████████████████| (done) 100%
ModelMetricsBinomialUplift: ** Reported on test data. ** AUUC: 20878.6347962265 AUUC normalized: 0.7239439144140724
uplift_type | qini | lift | gain |
---|---|---|---|
AUUC value | 17782.5699126 | 0.0240703 | 20878.6347962 |
AUUC normalized | 0.7254012 | 0.0240703 | 0.7239439 |
AUUC random value | 12420.2048084 | 0.0052262 | 14612.0004307 |
Qini value: 5362.365104164344
uplift_type | qini | lift | gain |
---|---|---|---|
AECU value | 5362.3651042 | 0.0188441 | 6266.6343655 |
perf_h2o.plot_uplift(metric="qini")
perf_lgwum.plot_uplift(metric="qini")
From the Qini curves, you can see that the Uplift DRF algorithm performs better than the LGWUM approach. The main reason is that the splits in Uplift DRF can be more precise, thanks to using information about both the treatment and control groups directly.
N. J. Radcliffe and P. D. Surry, "Real-World Uplift Modelling with Significance-Based Uplift Trees", Stochastic Solutions White Paper, 2011.
P. D. Surry and N. J. Radcliffe, "Quality Measures for Uplift Models", 2011.
P. Rzepakowski and S. Jaroszewicz, "Decision Trees for Uplift Modeling with Single and Multiple Treatments", 2012.
Hugh Huyton, "Criteo Uplift Modelling", 2021, https://www.kaggle.com/code/hughhuyton/criteo-uplift-modelling/notebook.