Integrated Gradients is a technique for understanding which pixels contribute most to a model's prediction for any kind of image analysis (in our case, animals). It calculates pixel-wise attributions, revealing the importance of each part of an image in the classification decision. This helps practitioners, researchers, and others alike to:
- Pinpoint the visual characteristics that influence a model's classification, such as unique markings or features on animals.
- Understand why the system classified an image in a particular way, shedding light on the model's decision-making process.
- Detect biased or irrelevant features, which is crucial for mitigating biases and ensuring fairness in the task you are seeking to accomplish.
- Improve classification accuracy and contribute to building better models.
- Educate the public and raise awareness about animal identification and conservation challenges, promoting broader engagement in conservation efforts.
Here are the packages we will be using in this notebook.
matplotlib
numpy
alibi
datasets
tensorflow==2.8
Please note: due to a mismatch between the latest TensorFlow release and some of the other tools, you will need to pin the version of tensorflow.
!pip install datasets matplotlib 'alibi[tensorflow]' numpy rich tensorflow==2.8
import numpy as np
from PIL import Image
import tensorflow as tf
import matplotlib.pyplot as plt
from alibi.explainers import IntegratedGradients
from tensorflow.keras.applications.resnet_v2 import ResNet50V2
from alibi.datasets import load_cats
from alibi.utils import visualize_image_attr
print('TF version: ', tf.__version__)
print('Eager execution enabled: ', tf.executing_eagerly())
The data here can be your own, personally curated dataset. We will first load four sample images of cats, then move on to the beautiful Luna, and finish with different examples.
image_shape = (224, 224, 3)
load_cats??
data, labels = load_cats(target_size=image_shape[:2], return_X_y=True)
labels
print(f'Images shape: {data.shape}')
data = (data / 255).astype('float32')  # scale pixel values to [0, 1]
i = 1
plt.imshow(data[i]);
ResNet50 is a convolutional neural network architecture that is 50 layers deep and is commonly used for image classification tasks. Here are two ways to think of ResNets:
For practitioners:
ResNet50 is a residual neural network first introduced in 2015. It consists of 5 stages stacked together, with each stage having a convolution layer followed by identity mappings that skip over a few convolution layers. This "skip connection" structure allows information to shortcut across layers, avoiding the vanishing gradient problem when training very deep networks. After the convolutions, there is an average pooling layer and fully connected layer for the output. The 50 in ResNet50 refers to it having 50 weight layers. ResNet50 achieved state-of-the-art accuracy on ImageNet classification while being easier to optimize than previous deep models. It is widely used as a powerful pretrained feature extractor for computer vision tasks.
For non-practitioners:
ResNet50 is like a very deep maze (50 layers) that images can go through to be classified into categories like dogs, cats, cars etc. Going through such a deep maze makes it hard for information to flow from the beginning to the end. To solve this, ResNet50 adds shortcut tunnels between some of the layers. So some information can skip ahead instead of getting lost. This allows ResNet50 to be trained very accurately on huge image datasets like ImageNet. The whole network acts like a smart feature extractor that can pick out patterns useful for identifying objects. This knowledge can then be transferred to classify new images by connecting ResNet50 to a simpler network. The shortcut design enables ResNet50 to successfully train and extract powerful features from images despite being super deep.
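To make the "skip connection" idea concrete, here is a minimal sketch of a single pre-activation residual block in Keras. This is an illustration only (it assumes filters matches the number of input channels); the real ResNet50V2 blocks are bottleneck blocks with 1x1 convolutions and projection shortcuts.
from tensorflow.keras import layers

def residual_block(x, filters):
    shortcut = x  # the "skip connection": keep a reference to the block's input
    y = layers.BatchNormalization()(x)  # pre-activation, as in ResNet v2
    y = layers.ReLU()(y)
    y = layers.Conv2D(filters, 3, padding='same')(y)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv2D(filters, 3, padding='same')(y)
    return layers.Add()([shortcut, y])  # information can skip ahead across the block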
model = ResNet50V2(weights='imagenet')
Integrated Gradients is a method to explain individual predictions for deep neural networks by attributing importance to input features.
Imagine a neural network that classifies images of animals. We want to explain why it predicted "bird" for a particular photo.
Roughly, the method works in five steps:
1. Start from a baseline image that carries no information for the model, such as a plain gray image.
2. Create a series of interpolated images between the baseline and the original photo, slowly going from gray to the original.
3. At each step, pass the interpolated image into the network to get a prediction.
4. Calculate the gradients of the prediction with respect to the input pixels at each step. The gradients indicate how sensitive the prediction is to changes in each pixel.
5. Integrate the gradients across all the steps. This gives importance scores for each pixel.
Pixels with high integrated gradients contributed significantly to pushing the network from an uninformative baseline to predicting "bird". These pixels are most important for the decision.
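To make the recipe concrete, here is a minimal sketch of that computation (a plain averaged-gradients approximation, not alibi's implementation), assuming model is a Keras classifier and image and baseline are float arrays of shape (224, 224, 3):
def integrated_gradients_sketch(model, image, baseline, target, n_steps=50):
    alphas = np.linspace(0.0, 1.0, n_steps + 1)  # step 2: interpolation coefficients
    grads = []
    for alpha in alphas:
        interpolated = baseline + alpha * (image - baseline)
        x = tf.convert_to_tensor(interpolated[None], dtype=tf.float32)
        with tf.GradientTape() as tape:
            tape.watch(x)
            score = model(x)[:, target]  # step 3: prediction for the target class
        grads.append(tape.gradient(score, x)[0])  # step 4: pixel-wise gradients
    avg_grads = tf.reduce_mean(tf.stack(grads), axis=0).numpy()
    return (image - baseline) * avg_grads  # step 5: integrate (average and scale)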
An analogy is explaining why a cake tastes sweet. We take small steps adding ingredients to a baseline of an empty bowl, tasting after each addition and noting how much sweeter the mixture becomes. Summing those changes over all the steps gives each ingredient an importance score: this reveals sugar as highly important, while flour is less so.
In the first example, the baselines (i.e. the starting points of the path integral) are black images (all pixel values are set to zero). This means that black areas of the image will always have zero attributions. In the second example we consider random uniform noise baselines. The path integral is defined as a straight line from the baseline to the input image. The path is approximated by choosing a number of discrete steps (20 below) according to the Gauss-Legendre method.
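As a sketch of how such steps can be placed (an assumption for illustration, not necessarily alibi's exact internals), NumPy exposes the Gauss-Legendre nodes and weights directly:
nodes, weights = np.polynomial.legendre.leggauss(20)  # nodes live on [-1, 1]
alphas = (nodes + 1) / 2    # map nodes to [0, 1]: the interpolation coefficients
step_weights = weights / 2  # rescale the quadrature weights to the unit interval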
n_steps = 20  # number of points used to approximate the path integral
method = "gausslegendre"  # how the interpolation steps are chosen
internal_batch_size = 20  # batch size used internally when computing gradients
ig = IntegratedGradients(model, n_steps=n_steps, method=method, internal_batch_size=internal_batch_size)
instance = np.expand_dims(data[1], axis=0)  # add a batch dimension: (1, 224, 224, 3)
predictions = model(instance).numpy()
predictions.shape
predictions = predictions.argmax(axis=1)  # index of the highest-scoring ImageNet class
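If you would like a human-readable label rather than a bare class index, Keras ships a decoder for the ImageNet classes:
from tensorflow.keras.applications.resnet_v2 import decode_predictions

# Top-3 (class_id, class_name, score) triples for our instance
decode_predictions(model(instance).numpy(), top=3)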
ig.explain??
explanation = ig.explain(
    instance, baselines=None, target=predictions
)
# Metadata from the explanation object
explanation.meta
# Data fields from the explanation object
explanation.data.keys()
# Get attribution values from the explanation object
attrs = explanation.attributions[0]
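The attributions have the same shape as the input instance, i.e. one value per pixel per color channel:
attrs.shape  # (1, 224, 224, 3): batch, height, width, color channels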
def compare_image(image, attrs):
    fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(10, 5))
    visualize_image_attr(
        attr=None, original_image=image, method='original_image',
        title='Original Image', plt_fig_axis=(fig, ax[0]), use_pyplot=False
    );
    visualize_image_attr(
        attr=attrs.squeeze(), original_image=image, method='blended_heat_map',
        sign='all', show_colorbar=True, title='Overlaid Attributions',
        plt_fig_axis=(fig, ax[1]), use_pyplot=True
    );
compare_image(data[1], attrs)
Here we show the attributions obtained by choosing random uniform noise as a baseline. You might notice that the attributions can be considerably different from the previous example, where a black image was taken as the baseline. An extensive discussion of the impact of baselines on integrated gradients attributions can be found in P. Sturmfels et al., “Visualizing the Impact of Feature Attribution Baselines”.
baselines = np.random.random_sample(instance.shape)  # random uniform noise baseline
explanation = ig.explain(
    instance, baselines=baselines, target=predictions
)
attrs = explanation.attributions[0]
Sample image from the test dataset and its attributions. The attributions are shown by overlaying the attribution values for each pixel on the original image. The attribution value for a pixel is obtained by summing up the attribution values for the three color channels. The attributions are scaled to the $[-1, 1]$ range: red pixels represent negative attributions, while green pixels represent positive attributions. The original image is shown in gray scale for clarity.
compare_image(data[1], attrs)
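If you want to inspect the per-pixel values behind the heat map yourself, here is a minimal sketch of the channel-summing and scaling described above (an approximation of the idea, not visualize_image_attr's exact code):
attr_map = attrs.squeeze().sum(axis=-1)       # sum the three color channels: (224, 224)
attr_map = attr_map / np.abs(attr_map).max()  # scale the values into [-1, 1]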
img = Image.open('data/images/luna_resized.png')
img
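Luna's picture was resized ahead of time to match the 224x224 input the model expects. If your own image has a different size, here is a quick sketch with PIL (the path my_pet.png is hypothetical; substitute your own file):
raw = Image.open('data/images/my_pet.png').convert('RGB')  # hypothetical path
raw = raw.resize((224, 224))  # match the model's expected input size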
img_array = np.asarray(img)
img = (img_array / 255).astype('float32')
(img[None]).shape
instance = np.expand_dims(img, axis=0)
predictions = model(instance).numpy().argmax(axis=1)
explanation = ig.explain(
    instance, baselines=None, target=predictions
)
# Get attribution values from the explanation object
attrs = explanation.attributions[0]
compare_image(img, attrs)
Find pictures of things that you like and try to evaluate the following: does the model still predict the correct class?