Learning fair representations (LFR) [1] is a pre-processing technique that finds a latent representation which encodes the data well but obfuscates information about protected attributes. We will see how to use this algorithm to learn representations that encourage individual fairness, and apply it to the Adult dataset.
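As a quick refresher (following [1], with notation slightly simplified), LFR maps each input $x_n$ to a soft assignment over $K$ prototypes and minimizes a weighted combination of three losses,

$$L = A_x L_x + A_y L_y + A_z L_z,$$

where $L_x$ penalizes poor reconstruction of the inputs from the prototypes, $L_y$ penalizes prediction error on the labels, and $L_z = \sum_{k=1}^{K} \lvert M_k^{+} - M_k^{-} \rvert$ penalizes differences between the two groups' average prototype memberships. The weights $A_x$, $A_y$, $A_z$ are the hyperparameters referred to in the code comments below.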
%matplotlib inline
# Load all necessary packages
import sys
sys.path.append("../")
from aif360.datasets import BinaryLabelDataset
from aif360.datasets import AdultDataset
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.metrics import ClassificationMetric
from aif360.metrics.utils import compute_boolean_conditioning_vector
from aif360.algorithms.preprocessing.optim_preproc_helpers.data_preproc_functions import load_preproc_data_adult
from aif360.algorithms.preprocessing.lfr import LFR
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score
from IPython.display import Markdown, display
import matplotlib.pyplot as plt
# Get the dataset and split into train and test
dataset_orig = load_preproc_data_adult()
dataset_orig_train, dataset_orig_test = dataset_orig.split([0.7], shuffle=True)
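Note that the split is shuffled, so the exact numbers below will vary from run to run. Recent AIF360 releases accept a seed argument in split (an assumption worth checking against your installed version), which makes the notebook reproducible:

# Sketch: a seeded, reproducible split (the seed kwarg is assumed available in your AIF360 version)
dataset_orig_train, dataset_orig_test = dataset_orig.split([0.7], shuffle=True, seed=42)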
# print out some labels, names, etc.
display(Markdown("#### Training Dataset shape"))
print(dataset_orig_train.features.shape)
display(Markdown("#### Favorable and unfavorable labels"))
print(dataset_orig_train.favorable_label, dataset_orig_train.unfavorable_label)
display(Markdown("#### Protected attribute names"))
print(dataset_orig_train.protected_attribute_names)
display(Markdown("#### Privileged and unprivileged protected attribute values"))
print(dataset_orig_train.privileged_protected_attributes,
      dataset_orig_train.unprivileged_protected_attributes)
display(Markdown("#### Dataset feature names"))
print(dataset_orig_train.feature_names)
(34189, 18)
(1.0, 0.0)
['sex', 'race']
([array([1.]), array([1.])], [array([0.]), array([0.])])
['race', 'sex', 'Age (decade)=10', 'Age (decade)=20', 'Age (decade)=30', 'Age (decade)=40', 'Age (decade)=50', 'Age (decade)=60', 'Age (decade)=>=70', 'Education Years=6', 'Education Years=7', 'Education Years=8', 'Education Years=9', 'Education Years=10', 'Education Years=11', 'Education Years=12', 'Education Years=<6', 'Education Years=>12']
# Metric for the original dataset
privileged_groups = [{'sex': 1.0}]
unprivileged_groups = [{'sex': 0.0}]
metric_orig_train = BinaryLabelDatasetMetric(dataset_orig_train,
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)
display(Markdown("#### Original training dataset"))
print("Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_orig_train.mean_difference())
Difference in mean outcomes between unprivileged and privileged groups = -0.196576
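For reference, mean_difference() is the statistical parity difference of the dataset labels,

$$\text{mean difference} = \Pr(Y = y_{\text{fav}} \mid D = \text{unprivileged}) - \Pr(Y = y_{\text{fav}} \mid D = \text{privileged}),$$

so the negative value above means the unprivileged group (sex = 0) receives the favorable label about 19.7 percentage points less often than the privileged group in the training data.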
# LFR objective weights:
#   Input reconstruction quality - Ax
#   Fairness constraint - Az
#   Output prediction error - Ay
privileged_groups = [{'sex': 1}]
unprivileged_groups = [{'sex': 0}]
TR = LFR(unprivileged_groups=unprivileged_groups, privileged_groups=privileged_groups)
TR = TR.fit(dataset_orig_train)
Objective value printed every 250 iterations during training:
 250 20650.5619275    500 19749.3611659    750 18978.3028853   1000 17995.3008952
1250 16691.6273592   1500 16443.051177    1750 16096.538506    2000 15761.1807924
2250 17279.0619677   2500 15737.4111706   2750 15673.1203091   3000 15576.8576761
3250 15454.6964063   3500 15416.9562705   3750 15316.2346051   4000 15247.5321307
4250 15177.9090497   4500 15138.4650177   4750 15108.6571538   5000 15082.626066
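The call above relies on the library defaults for the prototype count and the objective weights. To make the trade-off explicit you can pass them yourself; the parameter names and values below are AIF360's documented defaults at the time of writing, so treat them as an assumption and check your installed version.

# Sketch: the same fit with the (assumed) default hyperparameters spelled out
TR = LFR(unprivileged_groups=unprivileged_groups,
         privileged_groups=privileged_groups,
         k=5,       # number of prototypes in the latent representation
         Ax=0.01,   # weight on input reconstruction quality
         Ay=1.0,    # weight on output prediction error
         Az=50.0)   # weight on the fairness constraint
TR = TR.fit(dataset_orig_train)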
# Transform training data and align features
dataset_transf_train = TR.transform(dataset_orig_train)
from sklearn.metrics import classification_report
thresholds = [0.2, 0.3, 0.35, 0.4, 0.5]
for threshold in thresholds:
    # Transform training data and align features
    dataset_transf_train = TR.transform(dataset_orig_train, threshold=threshold)
    metric_transf_train = BinaryLabelDatasetMetric(dataset_transf_train,
                                                   unprivileged_groups=unprivileged_groups,
                                                   privileged_groups=privileged_groups)
    display(Markdown("#### Transformed training dataset"))
    print("Classification threshold = %f" % threshold)
    #print(classification_report(dataset_orig_train.labels, dataset_transf_train.labels))
    print("Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_transf_train.mean_difference())
Classification threshold = 0.200000
Difference in mean outcomes between unprivileged and privileged groups = -0.501460
Classification threshold = 0.300000
Difference in mean outcomes between unprivileged and privileged groups = -0.358144
Classification threshold = 0.350000
Difference in mean outcomes between unprivileged and privileged groups = -0.258428
Classification threshold = 0.400000
Difference in mean outcomes between unprivileged and privileged groups = -0.311679
Classification threshold = 0.500000
Difference in mean outcomes between unprivileged and privileged groups = -0.214736
Note that the disparity of the transformed labels depends strongly on the classification threshold and, in this run, is not reduced relative to the original training data (-0.197) at any of the thresholds tried. The clearer gain from LFR appears in the individual fairness metric below.
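To see the threshold dependence at a glance, here is a small optional sketch (reusing the matplotlib import from the top of the notebook) that re-runs the sweep and plots the disparity against the threshold:

# Sketch: plot mean difference of the transformed labels vs. classification threshold
mean_diffs = []
for t in thresholds:
    d = TR.transform(dataset_orig_train, threshold=t)
    m = BinaryLabelDatasetMetric(d, unprivileged_groups=unprivileged_groups,
                                 privileged_groups=privileged_groups)
    mean_diffs.append(m.mean_difference())
plt.plot(thresholds, mean_diffs, marker='o', label='transformed')
plt.axhline(metric_orig_train.mean_difference(), linestyle='--', label='original')
plt.xlabel('classification threshold')
plt.ylabel('mean difference')
plt.legend()
plt.show()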
display(Markdown("#### Individual fairness metrics"))
print("Consistency of labels in transformed training dataset= %f" %metric_transf_train.consistency())
print("Consistency of labels in original training dataset= %f" %metric_orig_train.consistency())
Consistency of labels in transformed training dataset= 1.000000 Consistency of labels in original training dataset= 0.742326
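consistency() quantifies individual fairness: as implemented in AIF360 (with k = 5 neighbors by default, an assumption worth checking against your version), it is one minus the average gap between each instance's label and the mean label of its k nearest neighbors in feature space,

$$\text{consistency} = 1 - \frac{1}{n} \sum_{i=1}^{n} \left| \hat{y}_i - \frac{1}{k} \sum_{j \in \mathrm{kNN}(x_i)} \hat{y}_j \right|.$$

A value of 1 means every instance is labeled exactly like its neighbors, which the transformed data attains because LFR assigns labels through the shared prototypes.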
## PCA Analysis of consistency
To see why the transformed labels are so consistent, we project both the original and the transformed features onto their first three principal components and compare how the variance is distributed.
import pandas as pd
feat_cols = dataset_orig_train.feature_names
orig_df = pd.DataFrame(dataset_orig_train.features,columns=feat_cols)
orig_df['label'] = dataset_orig_train.labels
orig_df['label'] = orig_df['label'].apply(lambda i: str(i))
transf_df = pd.DataFrame(dataset_transf_train.features,columns=feat_cols)
transf_df['label'] = dataset_transf_train.labels
transf_df['label'] = transf_df['label'].apply(lambda i: str(i))
from sklearn.decomposition import PCA
orig_pca = PCA(n_components=3)
orig_pca_result = orig_pca.fit_transform(orig_df[feat_cols].values)
orig_df['pca-one'] = orig_pca_result[:,0]
orig_df['pca-two'] = orig_pca_result[:,1]
orig_df['pca-three'] = orig_pca_result[:,2]
display(Markdown("#### Original training dataset"))
print('Explained variance ratio per principal component:')
print(orig_pca.explained_variance_ratio_)
Explained variance ratio per principal component:
[0.15355025 0.14652464 0.12614707]
transf_pca = PCA(n_components=3)
transf_pca_result = transf_pca.fit_transform(transf_df[feat_cols].values)
transf_df['pca-one'] = transf_pca_result[:,0]
transf_df['pca-two'] = transf_pca_result[:,1]
transf_df['pca-three'] = transf_pca_result[:,2]
display(Markdown("#### Transformed training dataset"))
print('Explained variance ratio per principal component:')
print(transf_pca.explained_variance_ratio_)
Explained variance ratio per principal component:
[0.63337409 0.3302964 0.03632951]
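The transformed representation concentrates almost all of the variance in the first two components, consistent with the data collapsing onto a small number of prototypes. An optional sketch to visualize this, using the pca-one/pca-two columns computed above:

# Sketch: scatter the first two principal components, colored by label
fig, axes = plt.subplots(1, 2, figsize=(12, 5))
for ax, df, title in [(axes[0], orig_df, 'Original'),
                      (axes[1], transf_df, 'Transformed')]:
    for label, grp in df.groupby('label'):
        ax.scatter(grp['pca-one'], grp['pca-two'], s=2, alpha=0.3, label=label)
    ax.set_title(title)
    ax.set_xlabel('pca-one')
    ax.set_ylabel('pca-two')
    ax.legend()
plt.show()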
display(Markdown("#### Testing Dataset shape"))
print(dataset_orig_test.features.shape)
metric_orig_test = BinaryLabelDatasetMetric(dataset_orig_test,
                                            unprivileged_groups=unprivileged_groups,
                                            privileged_groups=privileged_groups)
display(Markdown("#### Original test dataset"))
print("Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_orig_test.mean_difference())
# Transform the test data, making the threshold explicit (0.5, the last value from the sweep above)
dataset_transf_test = TR.transform(dataset_orig_test, threshold=0.5)
metric_transf_test = BinaryLabelDatasetMetric(dataset_transf_test,
                                              unprivileged_groups=unprivileged_groups,
                                              privileged_groups=privileged_groups)
print("Consistency of labels in tranformed test dataset= %f" %metric_transf_test.consistency())
Consistency of labels in tranformed test dataset= 1.000000
print("Consistency of labels in original test dataset= %f" %metric_orig_test.consistency())
Consistency of labels in original test dataset= 0.738798
def check_algorithm_success():
    """Transformed dataset consistency should be greater than original dataset."""
    assert metric_transf_test.consistency() > metric_orig_test.consistency(), \
        "Transformed dataset consistency should be greater than original dataset."

check_algorithm_success()
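The imports at the top also bring in LogisticRegression, StandardScaler, and accuracy_score, which the walkthrough above never uses. A natural follow-up, sketched here as one possible next step rather than part of the original tutorial, is to train a downstream classifier on the fair representation and score it against the original test labels:

# Sketch: downstream classifier trained on the transformed (fair) training data
scaler = StandardScaler()
X_train = scaler.fit_transform(dataset_transf_train.features)
y_train = dataset_transf_train.labels.ravel()
X_test = scaler.transform(dataset_transf_test.features)
y_test = dataset_orig_test.labels.ravel()

clf = LogisticRegression()
clf.fit(X_train, y_train)
print("Test accuracy against original labels = %f"
      % accuracy_score(y_test, clf.predict(X_test)))

Any accuracy drop relative to a model trained on the raw features reflects the fairness/utility trade-off controlled by Ax, Ay, and Az.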
References:
[1] R. Zemel, Y. Wu, K. Swersky, T. Pitassi, and C. Dwork, "Learning Fair Representations," International Conference on Machine Learning, 2013.