import trustyai
trustyai.init()
We start by defining our black-box model, typically represented by
$$ f(\mathbf{x}) = \mathbf{y} $$
where $\mathbf{x}=\{x_1, x_2, \dots, x_m\}$ and $\mathbf{y}=\{y_1, y_2, \dots, y_n\}$.
Our example toy model takes an all-numerical input $\mathbf{x}$ and returns a $\mathbf{y}$ of either true or false depending on whether the sum of the components of $\mathbf{x}$ is within a threshold $\epsilon$ of a point $\mathbf{C}$, that is:
$$ f(\mathbf{x}) = \begin{cases} \text{true}, & \text{if } \left|\sum_{i=1}^{m} x_i - \mathbf{C}\right| \le \epsilon \\ \text{false}, & \text{otherwise.} \end{cases} $$
This model is provided in the TestUtils
module. We instantiate it with $\mathbf{C}=10$ and $\epsilon=2$.
from trustyai.utils import TestUtils
center = 10.0
epsilon = 2.0
model = TestUtils.getSumThresholdModel(center, epsilon)
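To make the threshold rule concrete, here is the same check written directly in plain Python (an illustrative re-implementation, not the actual TestUtils model; the function name sum_threshold is made up):
def sum_threshold(xs, center=10.0, epsilon=2.0):
    # true iff the sum of the inputs lies within epsilon of the center
    return abs(sum(xs) - center) <= epsilon

print(sum_threshold([3.0, 4.0, 3.5]))  # True: 10.5 is within 2.0 of 10.0
print(sum_threshold([6.0, 4.0, 3.5]))  # False: 13.5 is 3.5 away from 10.0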
Next we need to define a goal. If our model is $f(\mathbf{x'})=\mathbf{y'}$, we are then defining our $\mathbf{y'}$, and the counterfactual result will be the $\mathbf{x'}$ which satisfies $f(\mathbf{x'})=\mathbf{y'}$.
We will define our goal as true, that is, the sum is within the vicinity of the point $\mathbf{C}$. The goal is a list of Output
objects, each taking a name, a data type, a value and a score:
from trustyai.model import output
decision = "inside"
goal = [output(name=decision, dtype="bool", value=True, score=0.0)]
We will now define our initial features, $\mathbf{x}$. Each feature can be instantiated with the feature
helper from trustyai.model; in this case we want numerical features, so we pass dtype="number"
.
import random
from trustyai.model import feature
features = [feature(name=f"x{i+1}", dtype="number", value=random.random()*10.0) for i in range(3)]
As we can see, the sum of the features will not be within $\epsilon$ (2.0) of $\mathbf{C}$ (10.0). As such, the model prediction will be false
:
feature_sum = 0.0
for f in features:
    value = f.value.as_number()
    print(f"Feature {f.name} has value {value}")
    feature_sum += value
print(f"\nFeatures sum is {feature_sum}")
Feature x1 has value 6.164708056938084
Feature x2 has value 3.8453023806417197
Feature x3 has value 3.6459410618461527

Features sum is 13.655951499425957
We execute the model on the generated input and collect the output:
from org.kie.kogito.explainability.model import PredictionInput, PredictionOutput
goals = model.predictAsync([PredictionInput(features)]).get()
The SHAP explainer also needs a background dataset, a set of representative inputs; here we generate ten random ones:
background = []
for _ in range(10):
    _features = [feature(name=f"x{i+1}", dtype="number", value=random.random()*10.0) for i in range(3)]
    background.append(PredictionInput(_features))
We wrap these quantities in a SimplePrediction:
from trustyai.model import simple_prediction
prediction = simple_prediction(input_features=features, outputs=goals[0].outputs)
We can now instantiate the explainer itself.
from trustyai.explainers import SHAPExplainer
explainer = SHAPExplainer(background=background)
We generate the explanation as a dict: decision -> saliency.
explanation = explainer.explain(prediction, model)
We inspect the saliency scores assigned by SHAP to each feature:
for saliency in explanation.getSaliencies():
print(saliency)
Saliency{output=Output{value=false, type=boolean, score=-2.6559514994259565, name='inside'}, perFeatureImportance=[FeatureImportance{feature=Feature{name='x1', type=number, value=6.164708056938084}, score=-0.2833333333333333, confidence= +/-0.37307360101101117}, FeatureImportance{feature=Feature{name='x2', type=number, value=3.8453023806417197}, score=-0.033333333333333354, confidence= +/-0.37307360101101117}, FeatureImportance{feature=Feature{name='x3', type=number, value=3.6459410618461527}, score=0.016666666666666663, confidence= +/-0.5276057463131408}]}
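If you want the raw numbers rather than the printed string, each saliency exposes its per-feature importances. A minimal sketch, assuming the Java accessors getPerFeatureImportance, getFeature, getName and getScore suggested by the printed output above:
for saliency in explanation.getSaliencies():
    for fi in saliency.getPerFeatureImportance():
        print(f"{fi.getFeature().getName()}: {fi.getScore():+.4f}")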
Now let's go over how to use a Python model with TrustyAI. First, let's grab a dataset: we'll use the California Housing dataset from sklearn,
which tries to predict the median house value of various California housing districts given a number of attributes of each district.
After downloading the dataset, we split it into train and test sets.
from sklearn import datasets
from sklearn.model_selection import train_test_split
X, y = datasets.fetch_california_housing(data_home="data", return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=.8)
print(f"X Train: {X_train.shape}, X Test: {X_test.shape}, Y Train: {y_train.shape}, Y Test: {y_test.shape}")
X Train: (16512, 8), X Test: (4128, 8), Y Train: (16512,), Y Test: (4128,)
Now let's grab our model, just a simple xgboost regressor. We'll then plot its test predictions against the true test labels, to see how well it does.
import xgboost
import matplotlib.pyplot as plt
import numpy as np
plt.style.use('https://raw.githubusercontent.com/RobGeada/stylelibs/main/material_rh.mplstyle')
# uncomment to train from scratch
# xgb_model = xgboost.XGBRegressor(objective='reg:squarederror', tree_method='gpu_hist')
# xgb_model.fit(X_train, y_train)
# print('Test R^2', xgb_model.score(X_test, y_test))
# xgb_model.save_model("models/california_xgboost")
# load and score model
xgb_model = xgboost.XGBRegressor(objective='reg:squarederror')
xgb_model.load_model("models/california_xgboost")
print('Test R^2', xgb_model.score(X_test, y_test))
# grab predictions and find largest error
predictions = xgb_model.predict(X_test)
worst = np.argmax(np.abs(predictions - y_test))
# plot predictions
plt.scatter(y_test, predictions)
plt.scatter(y_test.iloc[worst], predictions[worst], color='r')
plt.plot([0,5], [0,5], color='k')
plt.xlabel("True Value")
plt.ylabel("Predicted Value")
plt.title("XGBoost Predictions, California Housing")
plt.show()
Test R^2 0.9195932918653825
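Note that score() on an sklearn-style regressor returns the R² coefficient rather than a mean squared error; if you also want the MSE, sklearn.metrics provides it directly:
from sklearn.metrics import mean_squared_error

print("Test MSE:", mean_squared_error(y_test, predictions))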
That's pretty decent! Let's grab a point to explain; we'll choose the point with the largest error, marked in red in the plot above.
point_to_explain = X_test.iloc[worst]
point_to_explain
MedInc          2.384600
HouseAge       22.000000
AveRooms        5.152866
AveBedrms       1.146497
Population    334.000000
AveOccup        2.127389
Latitude       33.480000
Longitude    -117.660000
Name: 10454, dtype: float64
We'll need to convert it into a Prediction object in order to pass it to the SHAP Explainer.
from trustyai.model import feature, output
features_to_explain = [feature(name=key, dtype='number', value=value) for key, value in point_to_explain.items()]
output_to_explain = output(name='Median House Price', dtype='number', value=predictions[worst])
prediction_to_explain = simple_prediction(input_features=features_to_explain, outputs=[output_to_explain])
We also need to convert our training data into TrustyAI PredictionInputs. This is pretty simple for Pandas DataFrames:
from org.kie.kogito.explainability.model import PredictionInput
X_train_PIs = [PredictionInput([feature(name=key, dtype='number', value=value) for key, value in x.items()]) for _, x in X_train.iterrows()]
Now we can wrap our model into a TrustyAI PredictionProvider. We do this via an ArrowModel,
which rapidly speeds up the data transfer between Python and the TrustyAI Java library.
To create an ArrowModel, we need to pass it a function that accepts a Pandas DataFrame as input and outputs a Pandas DataFrame or Numpy array. All sklearn models satisfy this with their predict
or predict_proba
functions, so this is really easy to do.
We then call the get_as_prediction_provider
function on the ArrowModel, passing an example datapoint to use as a template for our data conversions. Make sure this template point has the same schema (i.e., feature names and types) as all the other points you plan on passing to the model!
from trustyai.model import ArrowModel
trustyai_model = ArrowModel(xgb_model.predict).get_as_prediction_provider(X_train_PIs[0])
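The wrapped callable doesn't have to come from sklearn: any function that takes a Pandas DataFrame and returns a DataFrame or Numpy array can be wrapped the same way. For example, a toy hand-written predictor (purely illustrative; naive_price_model and its 0.4 factor are made up):
import numpy as np

def naive_price_model(df):
    # toy predictor: median house value roughly proportional to median income
    return np.asarray(df["MedInc"] * 0.4)

# wrapped exactly like the xgboost model above:
# naive_provider = ArrowModel(naive_price_model).get_as_prediction_provider(X_train_PIs[0])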
With our model successfully wrapped, we can create our SHAP explainer. To do this we need to specify a background dataset, a small $(\le100)$ set of representative examples of the model's input. We'll use the first 100 training points as our background dataset.
from trustyai.explainers import SHAPExplainer
explainer = SHAPExplainer(background=X_train_PIs[:100])
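Taking the first 100 training points is the simplest choice; if the training data happens to be ordered, a random sample may be more representative. A small sketch of that alternative, using only numpy and the X_train_PIs list built above:
import numpy as np

rng = np.random.default_rng(0)
sample_idx = rng.choice(len(X_train_PIs), size=100, replace=False)
background_sample = [X_train_PIs[int(i)] for i in sample_idx]
# explainer = SHAPExplainer(background=background_sample)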
We can now produce our explanation:
explanations = explainer.explain(prediction_to_explain, trustyai_model)
Now let's visualize the explanation, first as a dataframe:
explanations.as_html()
|  | Mean Background Value | Feature Value | SHAP Value |
|---|---|---|---|
| Background | - | - | 2.373661 |
| MedInc | 4.070206 | 2.384600 | -0.482674 |
| HouseAge | 28.060000 | 22.000000 | -0.006309 |
| AveRooms | 5.325720 | 5.152866 | -0.057444 |
| AveBedrms | 1.067250 | 1.146497 | 0.046290 |
| Population | 1528.370000 | 334.000000 | -0.073580 |
| AveOccup | 2.785109 | 2.127389 | 0.251778 |
| Latitude | 35.876600 | 33.480000 | 1.330079 |
| Longitude | -119.893900 | -117.660000 | -1.225742 |
| Prediction | 2.373661 | 2.156060 | 2.156060 |
Feature values in red/green indicate a lower/higher value than the average background value of that feature. SHAP values in red/green indicate a negative/positive contribution to the prediction.
Now let's visualize the explanation as a candlestick plot:
explanations.candlestick_plot()