To illustrate the group fairness metrics in TrustyAI, two synthetic datasets were created with the same input features and outcome type.
The outcome is whether an individual reaches a $50k income threshold, predicted from age, race and gender as categorical inputs; both datasets consist of $N=10000$ data points.
The gender values are allocated with a proportion of 20% to gender=0 and 80% to gender=1.
In both datasets the likelihood of a positive outcome increases with age, regardless of race or gender.
The first dataset, deemed unbiased, allocates the income value uniformly at random, regardless of race or gender.
The second dataset, deemed biased, allocates a positive outcome to gender=0 with a lower probability than gender=1.
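The generation script is not part of this notebook; a minimal sketch of how such datasets could be produced is shown below (the column names match the files loaded next, while the value ranges and the bias strength are assumptions):
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
N = 10_000

def make_income_dataset(biased):
    # Hypothetical generator: column names match the files used below;
    # the age/race ranges and the bias strength are assumptions.
    age = rng.integers(13, 80, size=N)
    race = rng.integers(0, 8, size=N)
    gender = (rng.random(N) < 0.8).astype(int)      # ~20% gender=0, ~80% gender=1
    p = (age - 13) / (79 - 13)                       # chance of income=1 grows with age
    if biased:
        p = np.where(gender == 0, 0.5 * p, p)        # gender=0 gets a lower probability
    income = (rng.random(N) < p).astype(int)
    return pd.DataFrame({"age": age, "race": race, "gender": gender, "income": income})

# make_income_dataset(biased=False)   # unbiased variant
# make_income_dataset(biased=True)    # biased variant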
import pandas as pd
df = pd.read_csv("data/income-unbiased.zip", index_col=False)
df
| | age | race | gender | income |
|---|---|---|---|---|
| 0 | 13 | 0 | 0 | 0 |
| 1 | 65 | 7 | 0 | 1 |
| 2 | 71 | 6 | 1 | 0 |
| 3 | 38 | 1 | 1 | 1 |
| 4 | 42 | 0 | 0 | 1 |
| ... | ... | ... | ... | ... |
| 9995 | 20 | 5 | 1 | 0 |
| 9996 | 34 | 2 | 1 | 0 |
| 9997 | 25 | 2 | 1 | 1 |
| 9998 | 73 | 5 | 1 | 1 |
| 9999 | 58 | 3 | 1 | 1 |
10000 rows × 4 columns
Demographic Parity provides a measure of imbalances in positive and negative outcomes between privileged and unprivileged groups.
Taking the previous data as an example, we would use Demographic Parity metrics to measure whether, for instance, the income
is predicted to be above or below $50k regardless of race or gender.
The Statistical Parity Difference (SPD) is the difference in the probability of a favorable prediction between the unprivileged and privileged groups. Typically, an $SPD$ within the $[-0.1, 0.1]$ range is considered fair.
The formal definition of $SPD$ is
$$ SPD=p(\hat{y}=1|\mathcal{D}_u)-p(\hat{y}=1|\mathcal{D}_p) $$

where $\hat{y}=1$ is the favorable outcome and $\mathcal{D}_u$, $\mathcal{D}_p$ are respectively the unprivileged and privileged group data.
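Before using the TrustyAI helper, the same quantity can be computed by hand with pandas on the unbiased dataframe loaded above; this is only a quick sanity check against the formula, not the library's implementation:
# selection rate p(income=1) within each gender group
p_unpriv = (df[df.gender == 0].income == 1).mean()   # unprivileged group, gender=0
p_priv = (df[df.gender == 1].income == 1).mean()     # privileged group, gender=1
print(p_unpriv - p_priv)                             # SPD by hand; should agree with the library value below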
nobias = pd.read_csv("data/income-unbiased.zip", index_col=False)
nobias.groupby(['gender', 'income'])['income'].count()
gender  income
0       0         1466
        1          548
1       0         5842
        1         2144
Name: income, dtype: int64
nobias.groupby(['gender', 'income'])['income'].count().unstack().plot.bar()
<AxesSubplot:xlabel='gender'>
from trustyai.metrics.fairness.group import statistical_parity_difference
from trustyai.model import output
nobias_privileged = nobias[nobias.gender == 1]
nobias_unprivileged = nobias[nobias.gender == 0]
favorable = output("income", dtype="number", value=1)
score = statistical_parity_difference(privileged=nobias_privileged,
unprivileged=nobias_unprivileged,
favorable=[favorable])
print(score)
0.0036255104824703954
We can see that the $SPD$ for this dataset is within the $[-0.1, 0.1]$ threshold, which classifies the model as reasonably fair.
bias = pd.read_csv("data/income-biased.zip", index_col=False)
bias.groupby(['gender', 'income'])['income'].count()
gender  income
0       0         1772
        1          242
1       0         5775
        1         2211
Name: income, dtype: int64
bias.groupby(['gender', 'income'])['income'].count().unstack().plot.bar()
<AxesSubplot:xlabel='gender'>
bias_privileged = bias[bias.gender == 1]
bias_unprivileged = bias[bias.gender == 0]
score = statistical_parity_difference(privileged=bias_privileged,
unprivileged=bias_unprivileged,
favorable=[favorable])
print(score)
-0.15670061634672994
This dataset, as expected, is outside the $[-0.1, 0.1]$ threshold, which classifies the model as unfair.
In addition, the negative score indicates that the unprivileged group (in our example, gender=0) is the one at a disadvantage for this particular outcome.
Similarly to the Statistical Parity Difference, the Disparate Impact Ratio (DIR) measures imbalances in positive outcome predictions across privileged and unprivileged groups. Instead of calculating the difference, this metric calculates the ratio of such selection rates. Typically, a $DIR$ close to $1$ indicates a fair model.
The formal definition of the Disparate Impact Ratio is:
$$ DIR=\dfrac{p(\hat{y}=1|\mathcal{D}_u)}{p(\hat{y}=1|\mathcal{D}_p)} $$

from trustyai.metrics.fairness.group import disparate_impact_ratio
score = disparate_impact_ratio(privileged=nobias_privileged,
unprivileged=nobias_unprivileged,
favorable=[favorable])
print(score)
1.0135043501459928
As with the $SPD$ we can see that the $DIR$ indicates a reasonably fair model (close to $1$) for the unbiased dataset.
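As with the SPD, the ratio can be cross-checked by hand from the group selection rates (again just a sanity check against the formula above, not the library's implementation):
p_unpriv = (nobias[nobias.gender == 0].income == 1).mean()
p_priv = (nobias[nobias.gender == 1].income == 1).mean()
print(p_unpriv / p_priv)   # DIR by hand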
score = disparate_impact_ratio(privileged=bias_privileged,
unprivileged=bias_unprivileged,
favorable=[favorable])
print(score)
0.43400672901628895
And also, as expected, the $DIR$ indicates a biased model for the biased dataset.
Average Odds Difference measures the average of the difference in False Positive Rates ($FPR$) and the difference in True Positive Rates ($TPR$) between the unprivileged and privileged groups. Formally, the definition is:
$$ AOD=\dfrac{(FPR_{u}-FPR_{p})+(TPR_{u}-TPR_{p})}{2} $$

Typically, a value close to $0$ indicates a fair model.
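For intuition, the per-group rates behind this formula can be sketched by hand; the sketch below treats the biased dataset's labels as predictions and the unbiased dataset's labels as ground truth, mirroring the test/truth arguments of the TrustyAI call that follows (this is an illustration, not the library's implementation):
def group_rates(pred, truth):
    # TPR = p(pred=1 | truth=1), FPR = p(pred=1 | truth=0)
    tpr = (pred[truth == 1] == 1).mean()
    fpr = (pred[truth == 0] == 1).mean()
    return tpr, fpr

priv = nobias.gender == 1
tpr_p, fpr_p = group_rates(bias.income[priv], nobias.income[priv])
tpr_u, fpr_u = group_rates(bias.income[~priv], nobias.income[~priv])
print(((fpr_u - fpr_p) + (tpr_u - tpr_p)) / 2)   # AOD by hand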
from trustyai.metrics.fairness.group import average_odds_difference
score = average_odds_difference(test=bias,
truth=nobias,
privilege_columns=["gender"],
privilege_values=[1], # privileged gender value, gender = 1
positive_class=[1]) # positive class, income = 1
print(score)
-0.23806418646688987
As we can see, the $AOD$ indicates that the privileged group (gender=1) is at an advantage in this model.
The Average Predictive Value Difference (APVD) measures the difference in the average accuracy of predicted values between the privileged and unprivileged groups in a dataset.
from trustyai.metrics.fairness.group import average_predictive_value_difference
score = average_predictive_value_difference(test=bias,
truth=nobias,
privilege_columns=["gender"],
privilege_values=[1],
positive_class=[1])
print(score)
-0.04841289822293428
from xgboost import XGBClassifier

def train(dataset):
    df = pd.read_csv(dataset)
    # Mark the discrete inputs (and the label) as pandas categoricals
    categories = ['race', 'gender', 'income']
    for f in categories:
        df[f] = df[f].astype('category')
    df['age'] = df['age'].astype('int')
    _X = df[["age", "race", "gender"]]
    _y = df.income
    clf = XGBClassifier(objective="binary:logistic",
                        enable_categorical=True,
                        use_label_encoder=False,
                        eval_metric='logloss')
    clf.fit(_X, _y)
    return clf

xgb = train("data/income-biased.zip")
The training cell fails with the following error (traceback abbreviated):

ValueError: DataFrame.dtypes for data must be int, float, bool or categorical. When categorical type is supplied, DMatrix parameter `enable_categorical` must be set to `True`. Offending columns: race, gender
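The error above comes from an xgboost build whose scikit-learn wrapper does not pass enable_categorical through to the underlying DMatrix. A possible workaround, sketched here rather than taken from the original notebook, is to feed the inputs as plain integer codes so no categorical dtype ever reaches the DMatrix:
from xgboost import XGBClassifier  # already imported above

def train_encoded(dataset):
    # Same data preparation as train(), but with integer-coded inputs,
    # which avoids the categorical-dtype restriction in older xgboost builds.
    df = pd.read_csv(dataset)
    X = df[["age", "race", "gender"]].astype("int")
    y = df.income.astype("int")
    clf = XGBClassifier(objective="binary:logistic",
                        use_label_encoder=False,
                        eval_metric='logloss')
    clf.fit(X, y)
    return clf

xgb = train_encoded("data/income-biased.zip")  # stand-in for the failed call above

With a trained model available, TrustyAI's model-based metric variants below compute the same fairness measures from the model's predictions over a set of samples, rather than from a labelled dataset.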
from trustyai.model import Model
from trustyai.metrics.fairness.group import statistical_parity_difference_model
X = nobias[["age", "race", "gender"]]
model = Model(xgb.predict, dataframe_input=True, output_names=["approved"])
score = statistical_parity_difference_model(samples=X,
model=model,
privilege_columns=["gender"],
privilege_values=[1],
favorable=[favorable])
print(score)
-0.06288176602997649
from trustyai.metrics.fairness.group import disparate_impact_ratio_model
score = disparate_impact_ratio_model(samples=X,
model=model,
privilege_columns=["gender"],
privilege_values=[1],
favorable=[favorable])
print(score)
0.03798125763334818
from trustyai.metrics.fairness.group import average_odds_difference_model
score = average_odds_difference_model(samples=X,
model=model,
privilege_columns=["gender"],
privilege_values=[1],
positive_class=[1])
print(score)
9.581224702515101e-14