This tutorial illustrates how bias in advertising (ad) data can be discovered, measured, and mitigated using the AI Fairness 360 (AIF360) toolkit. We use data from an ad campaign in which advertisements are targeted at users, and the actual and predicted conversions are collected along with additional attributes about each user. A user is considered to have actually converted when they click on an advertisement. Using this data, we demonstrate how methods in AIF360 can be used to discover biased subgroups, determine the amount of bias, and mitigate it.
The following steps are performed in this notebook for bias discovery, measurement, and mitigation. First, install the AIF360 toolkit:
pip install git+https://github.com/Trusted-AI/AIF360
import pandas as pd
import numpy as np
from pprint import pprint
from aif360.datasets import StandardDataset
from aif360.metrics import ClassificationMetric, BinaryLabelDatasetMetric
from aif360.algorithms.postprocessing import RejectOptionClassification
from aif360.detectors.mdss.ScoringFunctions import Bernoulli
from aif360.detectors.mdss.MDSS import MDSS
from aif360.detectors.mdss.generator import get_random_subset
from IPython.display import Markdown, display
For this exercise, you can download the synthetic dataset from this link and place the data in the same location where this notebook is running.
ad_conversion_dataset = pd.read_csv('ad_campaign_data.csv')
ad_conversion_dataset.head()
| | religion | politics | college_educated | parents | homeowner | gender | age | income | area | true_conversion | predicted_conversion | predicted_probability |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Unknown | Unknown | 1 | 1 | 1 | Unknown | 55-64 | Unknown | Unknown | 0 | 0 | 0.001351 |
| 1 | Other | Unknown | 1 | 1 | 1 | Unknown | 55-64 | Unknown | Urban | 0 | 0 | 0.002238 |
| 2 | Unknown | Unknown | 1 | 1 | 1 | F | 55-64 | Unknown | Unknown | 0 | 0 | 0.002704 |
| 3 | Unknown | Unknown | 1 | 1 | 1 | F | 55-64 | Unknown | Unknown | 0 | 0 | 0.001967 |
| 4 | Unknown | Unknown | 1 | 1 | 1 | F | 55-64 | Unknown | Urban | 0 | 0 | 0.001681 |
Here the column `true_conversion` indicates whether the user actually clicked on the advertisement, `predicted_conversion` is the conversion predicted by the machine learning model, and `predicted_probability` is the model's probability of the user clicking the advertisement. The predicted probability was thresholded at approximately 0.365 to obtain predicted conversions; this threshold was chosen because it led to parity between the actual and predicted conversion rates.
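For reference, this is roughly how the predicted labels could be derived from the probabilities (a minimal sketch, assuming the stated ~0.365 cutoff; the exact threshold used to build the dataset is not shown here):

# Hypothetical reconstruction of the label thresholding described above.
# 0.365 is the approximate cutoff mentioned in the text, not an exact value.
THRESHOLD = 0.365
predicted = (ad_conversion_dataset['predicted_probability'] > THRESHOLD).astype(int)
print(f"Conversions at this cutoff: {predicted.sum()}")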
Next, we summarize the numbers of true and predicted conversions.
print(f"Number of (instances, attributes) in the dataset = {ad_conversion_dataset.shape}")
Number of (instances, attributes) in the dataset = (1443140, 12)
print(f"Statistics of true conversions (0=no, 1=yes)")
print(ad_conversion_dataset.true_conversion.value_counts())
Statistics of true conversions (0=no, 1=yes)
0    1440766
1       2374
Name: true_conversion, dtype: int64
print(f"Statistics of predicted conversions (0=no, 1=yes)")
print(ad_conversion_dataset.predicted_conversion.value_counts())
Statistics of predicted conversions (0=no, 1=yes)
0    1440773
1       2367
Name: predicted_conversion, dtype: int64
There are approximately 1.4M rows in total, with only about 2.4k true and 2.4k predicted conversions.
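Both conversion rates are therefore very low, around 0.16%. A quick sanity check (a minimal sketch using the columns already loaded):

# Overall conversion rates; both come out to roughly 0.16%
true_rate = ad_conversion_dataset['true_conversion'].mean()
pred_rate = ad_conversion_dataset['predicted_conversion'].mean()
print(f"True conversion rate: {true_rate:.4%}")
print(f"Predicted conversion rate: {pred_rate:.4%}")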
To identify the privileged subgroups in the dataset, we will use the Multi-Dimensional Subset Scan (MDSS) utility.
Subset scanning seeks to provide insights at the subset level of data and models. A brute-force approach is computationally expensive because there are exponentially many subsets to investigate for divergence between the observed and predicted numbers of conversions (clicks).
We first specify which features (columns) form the search space of possible subsets.
features_4_scanning = ['college_educated', 'parents', 'homeowner', 'gender', 'age', 'income', 'area', 'politics', 'religion']
def print_report(data, subset):
    """Pretty-print a subset definition along with its size and click counts."""
    if subset:
        # Keep only the rows whose values fall inside the subset definition.
        to_choose = data[list(subset.keys())].isin(subset).all(axis=1)
        df = data[['true_conversion', 'predicted_conversion']][to_choose]
    else:
        # An empty subset means "all rows": report on the full dataset.
        subset = {col: list(data[col].unique()) for col in features_4_scanning}
        df = data[['true_conversion', 'predicted_conversion']]
    true = df['true_conversion'].sum()
    pred = df['predicted_conversion'].sum()
    print('\033[1mSubset: \033[0m')
    pprint(subset)
    print('\033[1mSubset Size: \033[0m', len(df))
    print('\033[1mTrue Clicks: \033[0m', true)
    print('\033[1mPredicted Clicks: \033[0m', pred)
    print()
How do true and predicted clicks vary for randomly selected subsets? Here we select random subsets (each containing at least 10k records) and compare the true and predicted clicks. If the model is moderately accurate, then these numbers should be close to each other for most randomly selected subsets.
For a seed value of 11 we get the subset politics = Unknown and religion = Other; this group has 357 true and 376 predicted clicks. For a seed value of 55 we similarly get the subset age = 55-64, area = Unknown, and religion = Unknown, with 1090 true clicks and 1431 predicted clicks.
Feel free to try your own random seeds.
np.random.seed(11)
random_subset = get_random_subset(ad_conversion_dataset[features_4_scanning], prob = 0.05, min_elements = 10000)
print_report(ad_conversion_dataset, random_subset)
Subset: 
{'politics': ['Unknown'], 'religion': ['Other']}
Subset Size: 214101
True Clicks: 357
Predicted Clicks: 376
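To reproduce the seed-55 subset mentioned above, the same call can be repeated with a different seed (a sketch; the reported counts come from the text above):

np.random.seed(55)
random_subset_55 = get_random_subset(ad_conversion_dataset[features_4_scanning],
                                     prob=0.05, min_elements=10000)
# Per the text above, this yields age = 55-64, area = Unknown, religion = Unknown,
# with 1090 true clicks and 1431 predicted clicks.
print_report(ad_conversion_dataset, random_subset_55)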
Which subset (of the exponentially many ones to consider) shows the most divergence between the predicted number of clicks and the true clicks? In other words, where was the predictive model most biased?
Bias Scan is designed to efficiently detect this group.
# Bias scan: search for the subset where the model most over-predicts clicks.
# direction='negative' targets subsets whose observed outcomes fall below
# the model's expectations.
scoring_function = Bernoulli(direction='negative')
scanner = MDSS(scoring_function)
scanned_subset, _ = scanner.scan(ad_conversion_dataset[features_4_scanning],
expectations = ad_conversion_dataset['predicted_conversion'],
outcomes = ad_conversion_dataset['true_conversion'],
penalty = 1,
num_iters = 1,
verbose = False)
print_report(ad_conversion_dataset, scanned_subset)
Subset: 
{'area': ['Unknown', 'Urban'], 'homeowner': [0], 'income': ['>100K', 'Unknown']}
Subset Size: 153883
True Clicks: 281
Predicted Clicks: 1907
Non-homeowners making more than 100K (or an unknown income) and living in an urban (or unknown) area form the highly anomalous subset identified by the scan. The predictive model expected 1907 clicks from this group, but in reality there were only 281: the model drastically over-estimates conversions for this group.
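Put differently, the model predicts nearly seven times as many clicks as actually occurred for this subset (a quick check using the scan output above):

# Over-prediction factor for the scanned subset (1907 predicted vs. 281 true clicks)
print(f"Over-prediction factor: {1907 / 281:.1f}x")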
For simplicity, we will proceed through the rest of the notebook using homeowner status as the protected feature.
print_report(ad_conversion_dataset, {'homeowner':[0]})
print_report(ad_conversion_dataset, {'homeowner':[1]})
Subset: 
{'homeowner': [0]}
Subset Size: 174654
True Clicks: 332
Predicted Clicks: 1944

Subset: 
{'homeowner': [1]}
Subset Size: 1268486
True Clicks: 2042
Predicted Clicks: 423
From this scan, it is evident that users who don't own a home (homeowner=0) constitute the privileged group, so we will take this as the privileged class for our study. Note that "privileged" here only means that the group benefits from high predictive bias (our terminology); it does not imply socioeconomic privilege in any way. The privileged class was predicted to click on ads far more often than it actually did, so the machine learning model is biased in favor of this group.
We will write a Python function to convert the dataset into a `StandardDataset`, which will later be consumed by AIF360.
Here we are studying the bias in the dataset to check whether non-homeowners are targeted (shown an advertisement) more compared to homeowners. Hence, the privileged group is non-homeowners, and users who own a home are considered unprivileged. The `true_conversion` column records whether a user clicked on the advertisement: if clicked, the value is 1; if shown but not clicked, the value is 0. Other categorical features selected for this study include:

* `parents` -- whether the user is a parent or not
* `gender` -- Male, Female, or Unknown gender
* `college_educated` -- whether the user had a college education or not
* `area` -- the area where they live (Rural, Urban, or Unknown)
* `income` -- Income >100K, <100K, or Unknown
* `homeowner` -- whether the user owns a home or not
* `age` -- the age group category (18-24, 25-34, 45-54, 55-64, Unknown)

def convert_to_standard_dataset(df, target_label_name, scores_name=""):
    # List of names corresponding to protected attribute columns in the dataset.
    # Note that the terminology "protected attribute" is used in AI Fairness 360
    # to divide the dataset into multiple groups for measuring and mitigating
    # group-level bias.
protected_attributes=['homeowner']
# columns from the dataset that we want to select for this Bias study
selected_features = ['gender', 'age', 'income', 'area', 'college_educated', 'homeowner',
'parents', 'predicted_probability']
# This privileged class is selected based on MDSS subgroup evaluation.
# in previous steps. In our case non-homeowner (homeowner=0) are considered to
# be privileged and homeowners (homeowner=1) are considered as unprivileged.
privileged_classes = [[0]]
# Label values which are considered favorable are listed. All others are
# unfavorable. Label values are mapped to 1 (favorable) and 0 (unfavorable)
# if they are not already binary and numerical.
favorable_target_label = [1]
# List of column names in the DataFrame which are to be expanded into one-hot vectors.
categorical_features = ['parents','gender','college_educated','area','income', 'age']
# create the `StandardDataset` object
standard_dataset = StandardDataset(df=df, label_name=target_label_name,
favorable_classes=favorable_target_label,
scores_name=scores_name,
protected_attribute_names=protected_attributes,
privileged_classes=privileged_classes,
categorical_features=categorical_features,
features_to_keep=selected_features)
if scores_name=="":
standard_dataset.scores = standard_dataset.labels.copy()
return standard_dataset
Now we will convert our dataset into a `StandardDataset`. `StandardDataset` is the base class for every `BinaryLabelDataset` provided out of the box by AIF360, and it contains a preprocessing routine which, among other things, keeps only the requested features, creates one-hot encodings of the categorical features, and maps labels and protected attributes to binary values (favorable/unfavorable and privileged/unprivileged, respectively).
# Create two StandardDataset objects - one with true conversions and one with
# predicted conversions.
# First create the predicted dataset
ad_standard_dataset_pred = convert_to_standard_dataset(ad_conversion_dataset,
target_label_name = 'predicted_conversion',
scores_name='predicted_probability')
# Use this to create the original dataset
ad_standard_dataset_orig = ad_standard_dataset_pred.copy()
ad_standard_dataset_orig.labels = ad_conversion_dataset["true_conversion"].values.reshape(-1, 1)
ad_standard_dataset_orig.scores = ad_conversion_dataset["true_conversion"].values.reshape(-1, 1)
Bias on the entire dataset is measured using the `BinaryLabelDatasetMetric` class, which evaluates the privileged and unprivileged groups. We use the disparate impact ratio, defined as the ratio of the rate of favorable outcomes for one group to the rate of favorable outcomes for the other, where the two groups (unprivileged and privileged) are predetermined by the evaluator or surfaced by some other method such as the Multi-Dimensional Subset Scan (MDSS). When this ratio is less than 1, the first (unprivileged) group is considered disadvantaged relative to the second; if it is much larger than 1, the first group is at a relative advantage. Depending on the scenario, this ratio can vary widely, from a value close to 0 to one much larger than 1. These numbers represent the data's or algorithm's bias towards or against specific groups within an audience, and could be due to bias in the training data or some inherent unintended bias in the way the algorithms are designed and optimized.
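Concretely, with $\hat{Y}$ the (predicted) label and $D$ the protected group, disparate impact is the standard ratio

$$\mathrm{DI} = \frac{\Pr(\hat{Y} = 1 \mid D = \text{unprivileged})}{\Pr(\hat{Y} = 1 \mid D = \text{privileged})}$$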
# After converting to a StandardDataset, the privileged class is normally mapped
# to 1 and all other classes to 0. However, a column that is already binary and
# numerical is left unchanged, which is why the privileged value here stays 0.
privileged_groups= [{'homeowner': 0}]
unprivileged_groups = [{'homeowner': 1}]
metric_orig = BinaryLabelDatasetMetric(ad_standard_dataset_orig,
unprivileged_groups=unprivileged_groups,
privileged_groups=privileged_groups)
print(f"Disparate impact for the original dataset = {metric_orig.disparate_impact():.4f}")
metric_pred = BinaryLabelDatasetMetric(ad_standard_dataset_pred,
unprivileged_groups=unprivileged_groups,
privileged_groups=privileged_groups)
print(f"Disparate impact for the predicted dataset = {metric_pred.disparate_impact():.4f}")
Disparate impact for the original dataset = 0.8469
Disparate impact for the predicted dataset = 0.0300
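These ratios can be verified by hand from the raw data frame (a quick sanity check; the counts match the per-group click totals printed earlier):

# Disparate impact computed directly from the raw columns
unpriv = ad_conversion_dataset[ad_conversion_dataset['homeowner'] == 1]
priv = ad_conversion_dataset[ad_conversion_dataset['homeowner'] == 0]
print(unpriv['true_conversion'].mean() / priv['true_conversion'].mean())            # ~0.8469
print(unpriv['predicted_conversion'].mean() / priv['predicted_conversion'].mean())  # ~0.0300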
We see that the disparate impact for the original dataset is somewhat close to 1, whereas for the predicted dataset it is very close to 0, indicating high bias. We will attempt to mitigate this by post-processing the predictions.
We will use Reject Option Classification (ROC) as the debiasing function to mitigate bias. Reject option classification is a postprocessing technique that gives favorable outcomes to unprivileged groups and unfavorable outcomes to privileged groups in a confidence band around the decision boundary with the highest uncertainty. For this, let's first divide the dataset into training, validation and testing partitions. We will fit the ROC object using the validation partition and test the performance on the test partition.
# Split the standard dataset into train, test and validation
# (use the same random seed to ensure instances are split in the same way)
random_seed = 1001
dataset_orig_train, dataset_orig_vt = ad_standard_dataset_orig.split([0.7],
shuffle=True, seed=random_seed)
dataset_orig_valid, dataset_orig_test = dataset_orig_vt.split([0.5],
shuffle=True, seed=random_seed+1)
print(f"Original training dataset shape: {dataset_orig_train.features.shape}")
print(f"Original validation dataset shape: {dataset_orig_valid.features.shape}")
print(f"Original testing dataset shape: {dataset_orig_test.features.shape}")
dataset_pred_train, dataset_pred_vt = ad_standard_dataset_pred.split([0.7],
shuffle=True, seed=random_seed)
dataset_pred_valid, dataset_pred_test = dataset_pred_vt.split([0.5],
shuffle=True, seed=random_seed+1)
print(f"Predicted training shape: {dataset_pred_train.features.shape}")
print(f"Predicted validation shape: {dataset_pred_valid.features.shape}")
print(f"Predicted testing shape: {dataset_pred_test.features.shape}")
Original training dataset shape: (1010197, 19)
Original validation dataset shape: (216471, 19)
Original testing dataset shape: (216472, 19)
Predicted training shape: (1010197, 19)
Predicted validation shape: (216471, 19)
Predicted testing shape: (216472, 19)
# print out some labels, names, etc. for the predicted dataset
display(Markdown("#### Training Dataset shape"))
print(dataset_pred_train.features.shape)
display(Markdown("#### Favorable and unfavorable labels"))
print(dataset_pred_train.favorable_label, dataset_pred_train.unfavorable_label)
display(Markdown("#### Protected attribute names"))
print(dataset_pred_train.protected_attribute_names)
display(Markdown("#### Privileged and unprivileged protected attribute values"))
print(dataset_pred_train.privileged_protected_attributes,
dataset_pred_train.unprivileged_protected_attributes)
display(Markdown("#### Dataset feature names"))
print(dataset_pred_train.feature_names)
(1010197, 19)
1.0 0.0
['homeowner']
[array([0.])] [array([1.])]
['homeowner', 'college_educated=0', 'college_educated=1', 'parents=0', 'parents=1', 'gender=F', 'gender=M', 'gender=Unknown', 'age=18-24', 'age=25-34', 'age=45-54', 'age=55-64', 'age=Unknown', 'income=<100K', 'income=>100K', 'income=Unknown', 'area=Rural', 'area=Unknown', 'area=Urban']
# Best threshold for classification only (no fairness)
num_thresh = 300
ba_arr = np.zeros(num_thresh)
class_thresh_arr = np.linspace(0.01, 0.99, num_thresh)
for idx, class_thresh in enumerate(class_thresh_arr):
fav_inds = dataset_pred_valid.scores > class_thresh
dataset_pred_valid.labels[fav_inds] = dataset_pred_valid.favorable_label
dataset_pred_valid.labels[~fav_inds] = dataset_pred_valid.unfavorable_label
classified_metric_valid = ClassificationMetric(dataset_orig_valid,
dataset_pred_valid,
unprivileged_groups=unprivileged_groups,
privileged_groups=privileged_groups)
ba_arr[idx] = 0.5*(classified_metric_valid.true_positive_rate()\
+classified_metric_valid.true_negative_rate())
best_ind = np.where(ba_arr == np.max(ba_arr))[0][0]
best_class_thresh = class_thresh_arr[best_ind]
print("Best balanced accuracy (no fairness constraints) = %.4f" % np.max(ba_arr))
print("Optimal classification threshold (no fairness constraints) = %.4f" % best_class_thresh)
Best balanced accuracy (no fairness constraints) = 0.5561
Optimal classification threshold (no fairness constraints) = 0.0100
We estimate the classification threshold that maximizes balanced accuracy on the validation dataset, without any fairness constraints. Now we will also use the post-processing ROC method on the validation set to mitigate bias. We use `Statistical parity difference` as the metric, but feel free to choose among the other allowed metrics: `Average odds difference` and `Equal opportunity difference`.
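For reference, the quantity optimized above and the constraint used below are balanced accuracy and statistical parity difference (standard definitions, matching what the code computes):

$$\mathrm{BA} = \tfrac{1}{2}\left(\mathrm{TPR} + \mathrm{TNR}\right), \qquad \mathrm{SPD} = \Pr(\hat{Y}=1 \mid D=\text{unprivileged}) - \Pr(\hat{Y}=1 \mid D=\text{privileged})$$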
# Metric used (should be one of allowed_metrics)
metric_name = "Statistical parity difference"
# Upper and lower bound on the fairness metric used
metric_ub = 0.05
metric_lb = -0.05
# Random seed for reproducibility
np.random.seed(1)
# Verify metric name
allowed_metrics = ["Statistical parity difference",
"Average odds difference",
"Equal opportunity difference"]
if metric_name not in allowed_metrics:
raise ValueError("Metric name should be one of allowed metrics")
# Fit the method
ROC = RejectOptionClassification(unprivileged_groups=unprivileged_groups,
privileged_groups=privileged_groups,
low_class_thresh=0.01, high_class_thresh=0.99,
num_class_thresh=100, num_ROC_margin=50,
metric_name=metric_name,
metric_ub=metric_ub, metric_lb=metric_lb)
dataset_transf_pred_valid = ROC.fit_predict(dataset_orig_valid, dataset_pred_valid)
print("Optimal classification threshold (with fairness constraints) = %.4f" % ROC.classification_threshold)
print("Optimal ROC margin = %.4f" % ROC.ROC_margin)
Optimal classification threshold (with fairness constraints) = 0.0100
Optimal ROC margin = 0.0055
The ROC method has estimated that the optimal classification threshold is 0.01 and the margin is 0.0055. This means that, to mitigate bias, instances with a `predicted_probability` between 0.01 - 0.0055 and 0.01 + 0.0055 are re-labeled: if they belong to the unprivileged group (`homeowner=1`), they are assigned the favorable outcome (`predicted_conversion=1`); if they belong to the privileged group (`homeowner=0`), they are assigned the unfavorable outcome (`predicted_conversion=0`).
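The resulting decision rule can be sketched as follows (an illustration of the critical-region logic with the fitted numbers, not the library's internal implementation):

# Sketch of the ROC decision rule with the fitted threshold and margin
threshold, margin = 0.01, 0.0055

def roc_label(score, is_privileged):
    if abs(score - threshold) <= margin:
        # Inside the critical region: assign outcomes by group membership
        return 0 if is_privileged else 1
    # Outside the critical region: ordinary thresholding
    return int(score > threshold)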
metric_pred_valid_transf = BinaryLabelDatasetMetric(dataset_transf_pred_valid,
unprivileged_groups=unprivileged_groups,
privileged_groups=privileged_groups)
display(Markdown("#### Postprocessed predicted validation dataset"))
print(f"Disparate impact of unprivileged vs privileged groups = {metric_pred_valid_transf.disparate_impact():.4f}")
Disparate impact of unprivileged vs privileged groups = 1.0690
Let's define a convenience function to compute a number of metrics.
# Metrics function
from collections import OrderedDict
from aif360.metrics import ClassificationMetric
def compute_metrics(dataset_true, dataset_pred,
unprivileged_groups, privileged_groups,
disp = True):
""" Compute the key metrics """
classified_metric_pred = ClassificationMetric(dataset_true,
dataset_pred,
unprivileged_groups=unprivileged_groups,
privileged_groups=privileged_groups)
metrics = OrderedDict()
metrics["Balanced accuracy"] = 0.5*(classified_metric_pred.true_positive_rate()+
classified_metric_pred.true_negative_rate())
metrics["Statistical parity difference"] = classified_metric_pred.statistical_parity_difference()
metrics["Disparate impact"] = classified_metric_pred.disparate_impact()
metrics["Average odds difference"] = classified_metric_pred.average_odds_difference()
metrics["Equal opportunity difference"] = classified_metric_pred.equal_opportunity_difference()
metrics["Theil index"] = classified_metric_pred.theil_index()
if disp:
for k in metrics:
print("%s = %.4f" % (k, metrics[k]))
return metrics
Now the metrics can be computed for the raw test dataset, and the post-processed one, to verify the effect of bias mitigation.
# Metrics for the test set
fav_inds = dataset_pred_test.scores > best_class_thresh
dataset_pred_test.labels[fav_inds] = dataset_pred_test.favorable_label
dataset_pred_test.labels[~fav_inds] = dataset_pred_test.unfavorable_label
display(Markdown("#### Test set"))
display(Markdown("##### Raw predictions - No fairness constraints, only maximizing balanced accuracy"))
metric_test_bef = compute_metrics(dataset_orig_test, dataset_pred_test,
unprivileged_groups, privileged_groups)
Having measured the bias on the test dataset, we now run the `predict` function of the Reject Option Classification (ROC) algorithm to transform the dataset, and recompute the fairness metrics after bias mitigation.
# Metrics for the transformed test set
dataset_transf_pred_test = ROC.predict(dataset_pred_test)
display(Markdown("#### Test set"))
display(Markdown("##### Transformed predictions - With fairness constraints"))
metric_test_aft = compute_metrics(dataset_orig_test, dataset_transf_pred_test,
unprivileged_groups, privileged_groups)
We see that, for virtually no loss in balanced accuracy, the group fairness metrics have improved significantly. In particular, `Statistical parity difference` has moved closer to 0 and `Disparate impact` closer to 1, indicating a significant improvement in fairness.
In addition to this post-processing approach, AIF360 also offers several pre-, in-, and post-processing bias mitigation algorithms.
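For example, a few of the other mitigation algorithms shipped with AIF360 (import paths as of recent releases; consult the AIF360 documentation for the full list):

# A sample of other AIF360 mitigation algorithms
from aif360.algorithms.preprocessing import Reweighing             # pre-processing
from aif360.algorithms.inprocessing import PrejudiceRemover        # in-processing
from aif360.algorithms.postprocessing import EqOddsPostprocessing  # post-processing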
For this study, we used MDSS on the dataset to identify the groups that exhibit significant predictive bias. We discovered that non-homeowners were predicted to convert (and hence would be targeted) far more than homeowners, so that subgroup is considered privileged. After discovering this subgroup, we used the disparate impact metric to quantify the bias. We then applied the Reject Option Classification (ROC) post-processing algorithm to mitigate it: we divided the dataset into training, validation, and test partitions, fit ROC on the validation partition, and used the test partition to evaluate the mitigation. We observed that, for similar balanced accuracy, the bias is significantly mitigated. Other pre- and in-processing approaches from AIF360 can also be considered for bias mitigation, depending on the particular scenario.