Quantus x INVICTA Keynote Series — Explainable AI¶

Author: Anna Hedström, PhD Candidate, TU Berlin and ATB Potsdam

Contact: anna.hedstroem@tu-berlin.de

Venue: INVICTA, Spring School

Date: March 21, 2024

Abstract:

In this tutorial, as a first part, we will take an in-depth look at some of the most recent developments in XAI evaluation and in particular, give a demonstration of how to perform XAI evaluation using the open-source library Quantus. We will answer questions such as:

RQ1: How can we select the explanation method with highest quality, without access to ground truth labels? Quantus (JMLR, 2022)

In the second part of the tutorial, we will define and address the problem of meta-evaluation in XAI (i.e., the process of evaluating the evaluation method itself) which arises as we select and quantitatively compare explanation methods for a given model, dataset and task—where the use of multiple XAI metrics or evaluation techniques oftentimes lead to conflicting results. In this part, we will use the library MetaQuantus to characterise the performance of different XAI metrics and moreover, help select and identify a reliable metric for our chosen explainability context.

RQ2: Without ground truth, how can we identify a reliable estimator of explanation quality? MetaQuantus (TMLR, 2023)

In the third part of the tutorial, we will investigate how different parameters influence the evaluation outcome, i.e., how different explanations methods rank.

RQ3: Without ground truth of evaluation, what parameterisation results in the most reliable estimate of explanation quality?

First, we need to re-load a dataset and model.

Related Papers:

Quantus: An Explainable AI Toolkit for Responsible Evaluation of Neural Network Explanations and Beyond by Hedström et al., 2022
[The Meta-Evaluation Problem in Explainable AI:

Identifying Reliable Estimators with MetaQuantus](https://openreview.net/pdf?id=j3FK00HyfU) by Hedström et al., 2023

Github Repo Tutorials

Instructions: For this exercise, we work in a Google Colab environment. It is also possible to run the notebook using a simple Jupyter Notebook environment.

1) Overview¶

We set out multiple goals for this tutorial (ii) to familiarise the audience with advancements in XAI evaluation (ii) to introduce two open-source software libraries: Quantus and MetaQuantus and (iii) run a real-world image classification example.
To follow the tutorial, some basic knowledge of Explainable AI, especially with respect to local XAI methods is necessary.
For out-of-scope, we do not showcase how Quantus can be used with Tensorflow and with other tasks/ data domains such as time-series, tabular or NLP data. We do not discuss global explanation methods.

2) Installation prerequisites¶

Import the main libraries Quantus and MetaQuantus and some supportive libraries.

In [1]:

!pip install captum torchvision medmnist datasets transformers --quiet

# Get the latest versions of quantus and metaquantus.
!pip install install git+https://github.com/understandable-machine-intelligence-lab/Quantus.git \
git+https://github.com/annahedstroem/MetaQuantus.git --quiet

     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 9.1 MB/s eta 0:00:00
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 510.5/510.5 kB 11.4 MB/s eta 0:00:00
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 23.7/23.7 MB 41.8 MB/s eta 0:00:00
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 823.6/823.6 kB 61.7 MB/s eta 0:00:00
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.1/14.1 MB 63.3 MB/s eta 0:00:00
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 731.7/731.7 MB 2.2 MB/s eta 0:00:00
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 410.6/410.6 MB 3.1 MB/s eta 0:00:00
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 121.6/121.6 MB 13.0 MB/s eta 0:00:00
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 56.5/56.5 MB 30.1 MB/s eta 0:00:00
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 124.2/124.2 MB 8.6 MB/s eta 0:00:00
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 196.0/196.0 MB 4.6 MB/s eta 0:00:00
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 166.0/166.0 MB 10.5 MB/s eta 0:00:00
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 99.1/99.1 kB 14.8 MB/s eta 0:00:00
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 21.1/21.1 MB 79.9 MB/s eta 0:00:00
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 88.4/88.4 kB 12.2 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... done
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 116.3/116.3 kB 17.0 MB/s eta 0:00:00
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 194.1/194.1 kB 22.8 MB/s eta 0:00:00
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 134.8/134.8 kB 19.9 MB/s eta 0:00:00
  Building wheel for fire (setup.py) ... done
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
  Preparing metadata (setup.py) ... done
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.2/2.2 MB 25.8 MB/s eta 0:00:00
  Building wheel for quantus (pyproject.toml) ... done
  Building wheel for metaquantus (setup.py) ... done

Please restart the runtime session after running the above cell.

In [2]:

# Imports.
import quantus
import metaquantus
import glob
import gc
import tqdm
import os
import copy
import warnings
import torch
import torchvision
import captum
import numpy as np
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns

# Enable GPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Notebook settings.
warnings.filterwarnings("ignore", category=UserWarning)
warnings.filterwarnings("ignore", category=DeprecationWarning)
warnings.filterwarnings("ignore", category=RuntimeWarning)
warnings.simplefilter("ignore", category=FutureWarning)
sns.set()

List all package requirements.

In [3]:

# !pip freeze

3) Preliminaries¶

For the purpose of demonstration, in this exercise, we rely on a image-classification task with PyTorch. That being said, Quantus and MetaQuatus do support other ML frameworks such as Tensorflow and data domains e.g., time-series, tabular data and some NLP support.

3.1 Load data¶

We have prepared a small subset of ImageNet images (to download the full dataset, please find instructions here: https://image-net.org/download.php). The following will load inputs x_batch, labels y_batch and segmentation masks s_batch.

In [4]:

!pip install gdown --quiet
!gdown https://drive.google.com/drive/folders/1ZmIkqvnt8_wXU3dLBWS9i88l-spS_FU1 --folder --quiet
!ls

assets	sample_data

In [6]:

#@title 3.1.1 ImageNet class indices names
%%capture
CLASSES = {0: 'tench, Tinca tinca',
'goldfish, Carassius auratus',
'great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias',
'tiger shark, Galeocerdo cuvieri',
'hammerhead, hammerhead shark',
'electric ray, crampfish, numbfish, torpedo',
'stingray',
'cock',
'hen',
'ostrich, Struthio camelus',
'brambling, Fringilla montifringilla',
'goldfinch, Carduelis carduelis',
'house finch, linnet, Carpodacus mexicanus',
'junco, snowbird',
'indigo bunting, indigo finch, indigo bird, Passerina cyanea',
'robin, American robin, Turdus migratorius',
'bulbul',
'jay',
'magpie',
'chickadee',
'water ouzel, dipper',
'kite',
'bald eagle, American eagle, Haliaeetus leucocephalus',
'vulture',
'great grey owl, great gray owl, Strix nebulosa',
'European fire salamander, Salamandra salamandra',
'common newt, Triturus vulgaris',
'eft',
'spotted salamander, Ambystoma maculatum',
'axolotl, mud puppy, Ambystoma mexicanum',
'bullfrog, Rana catesbeiana',
'tree frog, tree-frog',
'tailed frog, bell toad, ribbed toad, tailed toad, Ascaphus trui',
'loggerhead, loggerhead turtle, Caretta caretta',
'leatherback turtle, leatherback, leathery turtle, Dermochelys coriacea',
'mud turtle',
'terrapin',
'box turtle, box tortoise',
'banded gecko',
'common iguana, iguana, Iguana iguana',
'American chameleon, anole, Anolis carolinensis',
'whiptail, whiptail lizard',
'agama',
'frilled lizard, Chlamydosaurus kingi',
'alligator lizard',
'Gila monster, Heloderma suspectum',
'green lizard, Lacerta viridis',
'African chameleon, Chamaeleo chamaeleon',
'Komodo dragon, Komodo lizard, dragon lizard, giant lizard, Varanus komodoensis',
'African crocodile, Nile crocodile, Crocodylus niloticus',
'American alligator, Alligator mississipiensis',
'triceratops',
'thunder snake, worm snake, Carphophis amoenus',
'ringneck snake, ring-necked snake, ring snake',
'hognose snake, puff adder, sand viper',
'green snake, grass snake',
'king snake, kingsnake',
'garter snake, grass snake',
'water snake',
'vine snake',
'night snake, Hypsiglena torquata',
'boa constrictor, Constrictor constrictor',
'rock python, rock snake, Python sebae',
'Indian cobra, Naja naja',
'green mamba',
'sea snake',
'horned viper, cerastes, sand viper, horned asp, Cerastes cornutus',
'diamondback, diamondback rattlesnake, Crotalus adamanteus',
'sidewinder, horned rattlesnake, Crotalus cerastes',
'trilobite',
'harvestman, daddy longlegs, Phalangium opilio',
'scorpion',
'black and gold garden spider, Argiope aurantia',
'barn spider, Araneus cavaticus',
'garden spider, Aranea diademata',
'black widow, Latrodectus mactans',
'tarantula',
'wolf spider, hunting spider',
'tick',
'centipede',
'black grouse',
'ptarmigan',
'ruffed grouse, partridge, Bonasa umbellus',
'prairie chicken, prairie grouse, prairie fowl',
'peacock',
'quail',
'partridge',
'African grey, African gray, Psittacus erithacus',
'macaw',
'sulphur-crested cockatoo, Kakatoe galerita, Cacatua galerita',
'lorikeet',
'coucal',
'bee eater',
'hornbill',
'hummingbird',
'jacamar',
'toucan',
'drake',
'red-breasted merganser, Mergus serrator',
'goose',
'black swan, Cygnus atratus',
'tusker',
'echidna, spiny anteater, anteater',
'platypus, duckbill, duckbilled platypus, duck-billed platypus, Ornithorhynchus anatinus',
'wallaby, brush kangaroo',
'koala, koala bear, kangaroo bear, native bear, Phascolarctos cinereus',
'wombat',
'jellyfish',
'sea anemone, anemone',
'brain coral',
'flatworm, platyhelminth',
'nematode, nematode worm, roundworm',
'conch',
'snail',
'slug',
'sea slug, nudibranch',
'chiton, coat-of-mail shell, sea cradle, polyplacophore',
'chambered nautilus, pearly nautilus, nautilus',
'Dungeness crab, Cancer magister',
'rock crab, Cancer irroratus',
'fiddler crab',
'king crab, Alaska crab, Alaskan king crab, Alaska king crab, Paralithodes camtschatica',
'American lobster, Northern lobster, Maine lobster, Homarus americanus',
'spiny lobster, langouste, rock lobster, crawfish, crayfish, sea crawfish',
'crayfish, crawfish, crawdad, crawdaddy',
'hermit crab',
'isopod',
'white stork, Ciconia ciconia',
'black stork, Ciconia nigra',
'spoonbill',
'flamingo',
'little blue heron, Egretta caerulea',
'American egret, great white heron, Egretta albus',
'bittern',
'crane',
'limpkin, Aramus pictus',
'European gallinule, Porphyrio porphyrio',
'American coot, marsh hen, mud hen, water hen, Fulica americana',
'bustard',
'ruddy turnstone, Arenaria interpres',
'red-backed sandpiper, dunlin, Erolia alpina',
'redshank, Tringa totanus',
'dowitcher',
'oystercatcher, oyster catcher',
'pelican',
'king penguin, Aptenodytes patagonica',
'albatross, mollymawk',
'grey whale, gray whale, devilfish, Eschrichtius gibbosus, Eschrichtius robustus',
'killer whale, killer, orca, grampus, sea wolf, Orcinus orca',
'dugong, Dugong dugon',
'sea lion',
'Chihuahua',
'Japanese spaniel',
'Maltese dog, Maltese terrier, Maltese',
'Pekinese, Pekingese, Peke',
'Shih-Tzu',
'Blenheim spaniel',
'papillon',
'toy terrier',
'Rhodesian ridgeback',
'Afghan hound, Afghan',
'basset, basset hound',
'beagle',
'bloodhound, sleuthhound',
'bluetick',
'black-and-tan coonhound',
'Walker hound, Walker foxhound',
'English foxhound',
'redbone',
'borzoi, Russian wolfhound',
'Irish wolfhound',
'Italian greyhound',
'whippet',
'Ibizan hound, Ibizan Podenco',
'Norwegian elkhound, elkhound',
'otterhound, otter hound',
'Saluki, gazelle hound',
'Scottish deerhound, deerhound',
'Weimaraner',
'Staffordshire bullterrier, Staffordshire bull terrier',
'American Staffordshire terrier, Staffordshire terrier, American pit bull terrier, pit bull terrier',
'Bedlington terrier',
'Border terrier',
'Kerry blue terrier',
'Irish terrier',
'Norfolk terrier',
'Norwich terrier',
'Yorkshire terrier',
'wire-haired fox terrier',
'Lakeland terrier',
'Sealyham terrier, Sealyham',
'Airedale, Airedale terrier',
'cairn, cairn terrier',
'Australian terrier',
'Dandie Dinmont, Dandie Dinmont terrier',
'Boston bull, Boston terrier',
'miniature schnauzer',
'giant schnauzer',
'standard schnauzer',
'Scotch terrier, Scottish terrier, Scottie',
'Tibetan terrier, chrysanthemum dog',
'silky terrier, Sydney silky',
'soft-coated wheaten terrier',
'West Highland white terrier',
'Lhasa, Lhasa apso',
'flat-coated retriever',
'curly-coated retriever',
'golden retriever',
'Labrador retriever',
'Chesapeake Bay retriever',
'German short-haired pointer',
'vizsla, Hungarian pointer',
'English setter',
'Irish setter, red setter',
'Gordon setter',
'Brittany spaniel',
'clumber, clumber spaniel',
'English springer, English springer spaniel',
'Welsh springer spaniel',
'cocker spaniel, English cocker spaniel, cocker',
'Sussex spaniel',
'Irish water spaniel',
'kuvasz',
'schipperke',
'groenendael',
'malinois',
'briard',
'kelpie',
'komondor',
'Old English sheepdog, bobtail',
'Shetland sheepdog, Shetland sheep dog, Shetland',
'collie',
'Border collie',
'Bouvier des Flandres, Bouviers des Flandres',
'Rottweiler',
'German shepherd, German shepherd dog, German police dog, alsatian',
'Doberman, Doberman pinscher',
'miniature pinscher',
'Greater Swiss Mountain dog',
'Bernese mountain dog',
'Appenzeller',
'EntleBucher',
'boxer',
'bull mastiff',
'Tibetan mastiff',
'French bulldog',
'Great Dane',
'Saint Bernard, St Bernard',
'Eskimo dog, husky',
'malamute, malemute, Alaskan malamute',
'Siberian husky',
'dalmatian, coach dog, carriage dog',
'affenpinscher, monkey pinscher, monkey dog',
'basenji',
'pug, pug-dog',
'Leonberg',
'Newfoundland, Newfoundland dog',
'Great Pyrenees',
'Samoyed, Samoyede',
'Pomeranian',
'chow, chow chow',
'keeshond',
'Brabancon griffon',
'Pembroke, Pembroke Welsh corgi',
'Cardigan, Cardigan Welsh corgi',
'toy poodle',
'miniature poodle',
'standard poodle',
'Mexican hairless',
'timber wolf, grey wolf, gray wolf, Canis lupus',
'white wolf, Arctic wolf, Canis lupus tundrarum',
'red wolf, maned wolf, Canis rufus, Canis niger',
'coyote, prairie wolf, brush wolf, Canis latrans',
'dingo, warrigal, warragal, Canis dingo',
'dhole, Cuon alpinus',
'African hunting dog, hyena dog, Cape hunting dog, Lycaon pictus',
'hyena, hyaena',
'red fox, Vulpes vulpes',
'kit fox, Vulpes macrotis',
'Arctic fox, white fox, Alopex lagopus',
'grey fox, gray fox, Urocyon cinereoargenteus',
'tabby, tabby cat',
'tiger cat',
'Persian cat',
'Siamese cat, Siamese',
'Egyptian cat',
'cougar, puma, catamount, mountain lion, painter, panther, Felis concolor',
'lynx, catamount',
'leopard, Panthera pardus',
'snow leopard, ounce, Panthera uncia',
'jaguar, panther, Panthera onca, Felis onca',
'lion, king of beasts, Panthera leo',
'tiger, Panthera tigris',
'cheetah, chetah, Acinonyx jubatus',
'brown bear, bruin, Ursus arctos',
'American black bear, black bear, Ursus americanus, Euarctos americanus',
'ice bear, polar bear, Ursus Maritimus, Thalarctos maritimus',
'sloth bear, Melursus ursinus, Ursus ursinus',
'mongoose',
'meerkat, mierkat',
'tiger beetle',
'ladybug, ladybeetle, lady beetle, ladybird, ladybird beetle',
'ground beetle, carabid beetle',
'long-horned beetle, longicorn, longicorn beetle',
'leaf beetle, chrysomelid',
'dung beetle',
'rhinoceros beetle',
'weevil',
'fly',
'bee',
'ant, emmet, pismire',
'grasshopper, hopper',
'cricket',
'walking stick, walkingstick, stick insect',
'cockroach, roach',
'mantis, mantid',
'cicada, cicala',
'leafhopper',
'lacewing, lacewing fly',
"dragonfly, darning needle, devil's darning needle, sewing needle, snake feeder, snake doctor, mosquito hawk, skeeter hawk",
'damselfly',
'admiral',
'ringlet, ringlet butterfly',
'monarch, monarch butterfly, milkweed butterfly, Danaus plexippus',
'cabbage butterfly',
'sulphur butterfly, sulfur butterfly',
'lycaenid, lycaenid butterfly',
'starfish, sea star',
'sea urchin',
'sea cucumber, holothurian',
'wood rabbit, cottontail, cottontail rabbit',
'hare',
'Angora, Angora rabbit',
'hamster',
'porcupine, hedgehog',
'fox squirrel, eastern fox squirrel, Sciurus niger',
'marmot',
'beaver',
'guinea pig, Cavia cobaya',
'sorrel',
'zebra',
'hog, pig, grunter, squealer, Sus scrofa',
'wild boar, boar, Sus scrofa',
'warthog',
'hippopotamus, hippo, river horse, Hippopotamus amphibius',
'ox',
'water buffalo, water ox, Asiatic buffalo, Bubalus bubalis',
'bison',
'ram, tup',
'bighorn, bighorn sheep, cimarron, Rocky Mountain bighorn, Rocky Mountain sheep, Ovis canadensis',
'ibex, Capra ibex',
'hartebeest',
'impala, Aepyceros melampus',
'gazelle',
'Arabian camel, dromedary, Camelus dromedarius',
'llama',
'weasel',
'mink',
'polecat, fitch, foulmart, foumart, Mustela putorius',
'black-footed ferret, ferret, Mustela nigripes',
'otter',
'skunk, polecat, wood pussy',
'badger',
'armadillo',
'three-toed sloth, ai, Bradypus tridactylus',
'orangutan, orang, orangutang, Pongo pygmaeus',
'gorilla, Gorilla gorilla',
'chimpanzee, chimp, Pan troglodytes',
'gibbon, Hylobates lar',
'siamang, Hylobates syndactylus, Symphalangus syndactylus',
'guenon, guenon monkey',
'patas, hussar monkey, Erythrocebus patas',
'baboon',
'macaque',
'langur',
'colobus, colobus monkey',
'proboscis monkey, Nasalis larvatus',
'marmoset',
'capuchin, ringtail, Cebus capucinus',
'howler monkey, howler',
'titi, titi monkey',
'spider monkey, Ateles geoffroyi',
'squirrel monkey, Saimiri sciureus',
'Madagascar cat, ring-tailed lemur, Lemur catta',
'indri, indris, Indri indri, Indri brevicaudatus',
'Indian elephant, Elephas maximus',
'African elephant, Loxodonta africana',
'lesser panda, red panda, panda, bear cat, cat bear, Ailurus fulgens',
'giant panda, panda, panda bear, coon bear, Ailuropoda melanoleuca',
'barracouta, snoek',
'eel',
'coho, cohoe, coho salmon, blue jack, silver salmon, Oncorhynchus kisutch',
'rock beauty, Holocanthus tricolor',
'anemone fish',
'sturgeon',
'gar, garfish, garpike, billfish, Lepisosteus osseus',
'lionfish',
'puffer, pufferfish, blowfish, globefish',
'abacus',
'abaya',
"academic gown, academic robe, judge's robe",
'accordion, piano accordion, squeeze box',
'acoustic guitar',
'aircraft carrier, carrier, flattop, attack aircraft carrier',
'airliner',
'airship, dirigible',
'altar',
'ambulance',
'amphibian, amphibious vehicle',
'analog clock',
'apiary, bee house',
'apron',
'ashcan, trash can, garbage can, wastebin, ash bin, ash-bin, ashbin, dustbin, trash barrel, trash bin',
'assault rifle, assault gun',
'backpack, back pack, knapsack, packsack, rucksack, haversack',
'bakery, bakeshop, bakehouse',
'balance beam, beam',
'balloon',
'ballpoint, ballpoint pen, ballpen, Biro',
'Band Aid',
'banjo',
'bannister, banister, balustrade, balusters, handrail',
'barbell',
'barber chair',
'barbershop',
'barn',
'barometer',
'barrel, cask',
'barrow, garden cart, lawn cart, wheelbarrow',
'baseball',
'basketball',
'bassinet',
'bassoon',
'bathing cap, swimming cap',
'bath towel',
'bathtub, bathing tub, bath, tub',
'beach wagon, station wagon, wagon, estate car, beach waggon, station waggon, waggon',
'beacon, lighthouse, beacon light, pharos',
'beaker',
'bearskin, busby, shako',
'beer bottle',
'beer glass',
'bell cote, bell cot',
'bib',
'bicycle-built-for-two, tandem bicycle, tandem',
'bikini, two-piece',
'binder, ring-binder',
'binoculars, field glasses, opera glasses',
'birdhouse',
'boathouse',
'bobsled, bobsleigh, bob',
'bolo tie, bolo, bola tie, bola',
'bonnet, poke bonnet',
'bookcase',
'bookshop, bookstore, bookstall',
'bottlecap',
'bow',
'bow tie, bow-tie, bowtie',
'brass, memorial tablet, plaque',
'brassiere, bra, bandeau',
'breakwater, groin, groyne, mole, bulwark, seawall, jetty',
'breastplate, aegis, egis',
'broom',
'bucket, pail',
'buckle',
'bulletproof vest',
'bullet train, bullet',
'butcher shop, meat market',
'cab, hack, taxi, taxicab',
'caldron, cauldron',
'candle, taper, wax light',
'cannon',
'canoe',
'can opener, tin opener',
'cardigan',
'car mirror',
'carousel, carrousel, merry-go-round, roundabout, whirligig',
"carpenter's kit, tool kit",
'carton',
'car wheel',
'cash machine, cash dispenser, automated teller machine, automatic teller machine, automated teller, automatic teller, ATM',
'cassette',
'cassette player',
'castle',
'catamaran',
'CD player',
'cello, violoncello',
'cellular telephone, cellular phone, cellphone, cell, mobile phone',
'chain',
'chainlink fence',
'chain mail, ring mail, mail, chain armor, chain armour, ring armor, ring armour',
'chain saw, chainsaw',
'chest',
'chiffonier, commode',
'chime, bell, gong',
'china cabinet, china closet',
'Christmas stocking',
'church, church building',
'cinema, movie theater, movie theatre, movie house, picture palace',
'cleaver, meat cleaver, chopper',
'cliff dwelling',
'cloak',
'clog, geta, patten, sabot',
'cocktail shaker',
'coffee mug',
'coffeepot',
'coil, spiral, volute, whorl, helix',
'combination lock',
'computer keyboard, keypad',
'confectionery, confectionary, candy store',
'container ship, containership, container vessel',
'convertible',
'corkscrew, bottle screw',
'cornet, horn, trumpet, trump',
'cowboy boot',
'cowboy hat, ten-gallon hat',
'cradle',
'crane',
'crash helmet',
'crate',
'crib, cot',
'Crock Pot',
'croquet ball',
'crutch',
'cuirass',
'dam, dike, dyke',
'desk',
'desktop computer',
'dial telephone, dial phone',
'diaper, nappy, napkin',
'digital clock',
'digital watch',
'dining table, board',
'dishrag, dishcloth',
'dishwasher, dish washer, dishwashing machine',
'disk brake, disc brake',
'dock, dockage, docking facility',
'dogsled, dog sled, dog sleigh',
'dome',
'doormat, welcome mat',
'drilling platform, offshore rig',
'drum, membranophone, tympan',
'drumstick',
'dumbbell',
'Dutch oven',
'electric fan, blower',
'electric guitar',
'electric locomotive',
'entertainment center',
'envelope',
'espresso maker',
'face powder',
'feather boa, boa',
'file, file cabinet, filing cabinet',
'fireboat',
'fire engine, fire truck',
'fire screen, fireguard',
'flagpole, flagstaff',
'flute, transverse flute',
'folding chair',
'football helmet',
'forklift',
'fountain',
'fountain pen',
'four-poster',
'freight car',
'French horn, horn',
'frying pan, frypan, skillet',
'fur coat',
'garbage truck, dustcart',
'gasmask, respirator, gas helmet',
'gas pump, gasoline pump, petrol pump, island dispenser',
'goblet',
'go-kart',
'golf ball',
'golfcart, golf cart',
'gondola',
'gong, tam-tam',
'gown',
'grand piano, grand',
'greenhouse, nursery, glasshouse',
'grille, radiator grille',
'grocery store, grocery, food market, market',
'guillotine',
'hair slide',
'hair spray',
'half track',
'hammer',
'hamper',
'hand blower, blow dryer, blow drier, hair dryer, hair drier',
'hand-held computer, hand-held microcomputer',
'handkerchief, hankie, hanky, hankey',
'hard disc, hard disk, fixed disk',
'harmonica, mouth organ, harp, mouth harp',
'harp',
'harvester, reaper',
'hatchet',
'holster',
'home theater, home theatre',
'honeycomb',
'hook, claw',
'hoopskirt, crinoline',
'horizontal bar, high bar',
'horse cart, horse-cart',
'hourglass',
'iPod',
'iron, smoothing iron',
"jack-o'-lantern",
'jean, blue jean, denim',
'jeep, landrover',
'jersey, T-shirt, tee shirt',
'jigsaw puzzle',
'jinrikisha, ricksha, rickshaw',
'joystick',
'kimono',
'knee pad',
'knot',
'lab coat, laboratory coat',
'ladle',
'lampshade, lamp shade',
'laptop, laptop computer',
'lawn mower, mower',
'lens cap, lens cover',
'letter opener, paper knife, paperknife',
'library',
'lifeboat',
'lighter, light, igniter, ignitor',
'limousine, limo',
'liner, ocean liner',
'lipstick, lip rouge',
'Loafer',
'lotion',
'loudspeaker, speaker, speaker unit, loudspeaker system, speaker system',
"loupe, jeweler's loupe",
'lumbermill, sawmill',
'magnetic compass',
'mailbag, postbag',
'mailbox, letter box',
'maillot',
'maillot, tank suit',
'manhole cover',
'maraca',
'marimba, xylophone',
'mask',
'matchstick',
'maypole',
'maze, labyrinth',
'measuring cup',
'medicine chest, medicine cabinet',
'megalith, megalithic structure',
'microphone, mike',
'microwave, microwave oven',
'military uniform',
'milk can',
'minibus',
'miniskirt, mini',
'minivan',
'missile',
'mitten',
'mixing bowl',
'mobile home, manufactured home',
'Model T',
'modem',
'monastery',
'monitor',
'moped',
'mortar',
'mortarboard',
'mosque',
'mosquito net',
'motor scooter, scooter',
'mountain bike, all-terrain bike, off-roader',
'mountain tent',
'mouse, computer mouse',
'mousetrap',
'moving van',
'muzzle',
'nail',
'neck brace',
'necklace',
'nipple',
'notebook, notebook computer',
'obelisk',
'oboe, hautboy, hautbois',
'ocarina, sweet potato',
'odometer, hodometer, mileometer, milometer',
'oil filter',
'organ, pipe organ',
'oscilloscope, scope, cathode-ray oscilloscope, CRO',
'overskirt',
'oxcart',
'oxygen mask',
'packet',
'paddle, boat paddle',
'paddlewheel, paddle wheel',
'padlock',
'paintbrush',
"pajama, pyjama, pj's, jammies",
'palace',
'panpipe, pandean pipe, syrinx',
'paper towel',
'parachute, chute',
'parallel bars, bars',
'park bench',
'parking meter',
'passenger car, coach, carriage',
'patio, terrace',
'pay-phone, pay-station',
'pedestal, plinth, footstall',
'pencil box, pencil case',
'pencil sharpener',
'perfume, essence',
'Petri dish',
'photocopier',
'pick, plectrum, plectron',
'pickelhaube',
'picket fence, paling',
'pickup, pickup truck',
'pier',
'piggy bank, penny bank',
'pill bottle',
'pillow',
'ping-pong ball',
'pinwheel',
'pirate, pirate ship',
'pitcher, ewer',
"plane, carpenter's plane, woodworking plane",
'planetarium',
'plastic bag',
'plate rack',
'plow, plough',
"plunger, plumber's helper",
'Polaroid camera, Polaroid Land camera',
'pole',
'police van, police wagon, paddy wagon, patrol wagon, wagon, black Maria',
'poncho',
'pool table, billiard table, snooker table',
'pop bottle, soda bottle',
'pot, flowerpot',
"potter's wheel",
'power drill',
'prayer rug, prayer mat',
'printer',
'prison, prison house',
'projectile, missile',
'projector',
'puck, hockey puck',
'punching bag, punch bag, punching ball, punchball',
'purse',
'quill, quill pen',
'quilt, comforter, comfort, puff',
'racer, race car, racing car',
'racket, racquet',
'radiator',
'radio, wireless',
'radio telescope, radio reflector',
'rain barrel',
'recreational vehicle, RV, R.V.',
'reel',
'reflex camera',
'refrigerator, icebox',
'remote control, remote',
'restaurant, eating house, eating place, eatery',
'revolver, six-gun, six-shooter',
'rifle',
'rocking chair, rocker',
'rotisserie',
'rubber eraser, rubber, pencil eraser',
'rugby ball',
'rule, ruler',
'running shoe',
'safe',
'safety pin',
'saltshaker, salt shaker',
'sandal',
'sarong',
'sax, saxophone',
'scabbard',
'scale, weighing machine',
'school bus',
'schooner',
'scoreboard',
'screen, CRT screen',
'screw',
'screwdriver',
'seat belt, seatbelt',
'sewing machine',
'shield, buckler',
'shoe shop, shoe-shop, shoe store',
'shoji',
'shopping basket',
'shopping cart',
'shovel',
'shower cap',
'shower curtain',
'ski',
'ski mask',
'sleeping bag',
'slide rule, slipstick',
'sliding door',
'slot, one-armed bandit',
'snorkel',
'snowmobile',
'snowplow, snowplough',
'soap dispenser',
'soccer ball',
'sock',
'solar dish, solar collector, solar furnace',
'sombrero',
'soup bowl',
'space bar',
'space heater',
'space shuttle',
'spatula',
'speedboat',
"spider web, spider's web",
'spindle',
'sports car, sport car',
'spotlight, spot',
'stage',
'steam locomotive',
'steel arch bridge',
'steel drum',
'stethoscope',
'stole',
'stone wall',
'stopwatch, stop watch',
'stove',
'strainer',
'streetcar, tram, tramcar, trolley, trolley car',
'stretcher',
'studio couch, day bed',
'stupa, tope',
'submarine, pigboat, sub, U-boat',
'suit, suit of clothes',
'sundial',
'sunglass',
'sunglasses, dark glasses, shades',
'sunscreen, sunblock, sun blocker',
'suspension bridge',
'swab, swob, mop',
'sweatshirt',
'swimming trunks, bathing trunks',
'swing',
'switch, electric switch, electrical switch',
'syringe',
'table lamp',
'tank, army tank, armored combat vehicle, armoured combat vehicle',
'tape player',
'teapot',
'teddy, teddy bear',
'television, television system',
'tennis ball',
'thatch, thatched roof',
'theater curtain, theatre curtain',
'thimble',
'thresher, thrasher, threshing machine',
'throne',
'tile roof',
'toaster',
'tobacco shop, tobacconist shop, tobacconist',
'toilet seat',
'torch',
'totem pole',
'tow truck, tow car, wrecker',
'toyshop',
'tractor',
'trailer truck, tractor trailer, trucking rig, rig, articulated lorry, semi',
'tray',
'trench coat',
'tricycle, trike, velocipede',
'trimaran',
'tripod',
'triumphal arch',
'trolleybus, trolley coach, trackless trolley',
'trombone',
'tub, vat',
'turnstile',
'typewriter keyboard',
'umbrella',
'unicycle, monocycle',
'upright, upright piano',
'vacuum, vacuum cleaner',
'vase',
'vault',
'velvet',
'vending machine',
'vestment',
'viaduct',
'violin, fiddle',
'volleyball',
'waffle iron',
'wall clock',
'wallet, billfold, notecase, pocketbook',
'wardrobe, closet, press',
'warplane, military plane',
'washbasin, handbasin, washbowl, lavabo, wash-hand basin',
'washer, automatic washer, washing machine',
'water bottle',
'water jug',
'water tower',
'whiskey jug',
'whistle',
'wig',
'window screen',
'window shade',
'Windsor tie',
'wine bottle',
'wing',
'wok',
'wooden spoon',
'wool, woolen, woollen',
'worm fence, snake fence, snake-rail fence, Virginia fence',
'wreck',
'yawl',
'yurt',
'web site, website, internet site, site',
'comic book',
'crossword puzzle, crossword',
'street sign',
'traffic light, traffic signal, stoplight',
'book jacket, dust cover, dust jacket, dust wrapper',
'menu',
'plate',
'guacamole',
'consomme',
'hot pot, hotpot',
'trifle',
'ice cream, icecream',
'ice lolly, lolly, lollipop, popsicle',
'French loaf',
'bagel, beigel',
'pretzel',
'cheeseburger',
'hotdog, hot dog, red hot',
'mashed potato',
'head cabbage',
'broccoli',
'cauliflower',
'zucchini, courgette',
'spaghetti squash',
'acorn squash',
'butternut squash',
'cucumber, cuke',
'artichoke, globe artichoke',
'bell pepper',
'cardoon',
'mushroom',
'Granny Smith',
'strawberry',
'orange',
'lemon',
'fig',
'pineapple, ananas',
'banana',
'jackfruit, jak, jack',
'custard apple',
'pomegranate',
'hay',
'carbonara',
'chocolate sauce, chocolate syrup',
'dough',
'meat loaf, meatloaf',
'pizza, pizza pie',
'potpie',
'burrito',
'red wine',
'espresso',
'cup',
'eggnog',
'alp',
'bubble',
'cliff, drop, drop-off',
'coral reef',
'geyser',
'lakeside, lakeshore',
'promontory, headland, head, foreland',
'sandbar, sand bar',
'seashore, coast, seacoast, sea-coast',
'valley, vale',
'volcano',
'ballplayer, baseball player',
'groom, bridegroom',
'scuba diver',
'rapeseed',
'daisy',
"yellow lady's slipper, yellow lady-slipper, Cypripedium calceolus, Cypripedium parviflorum",
'corn',
'acorn',
'hip, rose hip, rosehip',
'buckeye, horse chestnut, conker',
'coral fungus',
'agaric',
'gyromitra',
'stinkhorn, carrion fungus',
'earthstar',
'hen-of-the-woods, hen of the woods, Polyporus frondosus, Grifola frondosa',
'bolete',
'ear, spike, capitulum',
'toilet tissue, toilet paper, bathroom tissue'
 }

In [7]:

# Load test data and make loaders.
nr_samples = 15

assets = np.load("assets/imagenet_test_set.npy", allow_pickle=True).item()
x_batch = assets["x_batch"][:nr_samples]
y_batch = assets["y_batch"][:nr_samples]
s_batch = assets["s_batch"].reshape(-1, 1, 224, 224)[:nr_samples]

# Plot some inputs!
nr_images = 5
fig, axes = plt.subplots(nrows=1, ncols=nr_images, figsize=(nr_images*3, int(nr_images*2)))
for i in range(nr_images):
    image = (np.moveaxis(quantus.denormalise(x_batch[i], mean=np.array([0.485, 0.456, 0.406]),
                                             std=np.array([0.229, 0.224, 0.225])), 0, -1) * 255).astype(np.uint8)
    axes[i].imshow(image, vmin=0.0, vmax=1.0, cmap="gray")
    axes[i].title.set_text(f"{CLASSES[y_batch[i]][:15]}")
    axes[i].axis("off")
plt.show()

3.2 Load models¶

Since the focus of this tutorial is not on model training but XAI evalution, we load pre-trained models with torchvision library.

In [8]:

# Pick your model!
models = torchvision.models.list_models(module=torchvision.models)
models

Out[8]:

['alexnet',
 'convnext_base',
 'convnext_large',
 'convnext_small',
 'convnext_tiny',
 'densenet121',
 'densenet161',
 'densenet169',
 'densenet201',
 'efficientnet_b0',
 'efficientnet_b1',
 'efficientnet_b2',
 'efficientnet_b3',
 'efficientnet_b4',
 'efficientnet_b5',
 'efficientnet_b6',
 'efficientnet_b7',
 'efficientnet_v2_l',
 'efficientnet_v2_m',
 'efficientnet_v2_s',
 'googlenet',
 'inception_v3',
 'maxvit_t',
 'mnasnet0_5',
 'mnasnet0_75',
 'mnasnet1_0',
 'mnasnet1_3',
 'mobilenet_v2',
 'mobilenet_v3_large',
 'mobilenet_v3_small',
 'regnet_x_16gf',
 'regnet_x_1_6gf',
 'regnet_x_32gf',
 'regnet_x_3_2gf',
 'regnet_x_400mf',
 'regnet_x_800mf',
 'regnet_x_8gf',
 'regnet_y_128gf',
 'regnet_y_16gf',
 'regnet_y_1_6gf',
 'regnet_y_32gf',
 'regnet_y_3_2gf',
 'regnet_y_400mf',
 'regnet_y_800mf',
 'regnet_y_8gf',
 'resnet101',
 'resnet152',
 'resnet18',
 'resnet34',
 'resnet50',
 'resnext101_32x8d',
 'resnext101_64x4d',
 'resnext50_32x4d',
 'shufflenet_v2_x0_5',
 'shufflenet_v2_x1_0',
 'shufflenet_v2_x1_5',
 'shufflenet_v2_x2_0',
 'squeezenet1_0',
 'squeezenet1_1',
 'swin_b',
 'swin_s',
 'swin_t',
 'swin_v2_b',
 'swin_v2_s',
 'swin_v2_t',
 'vgg11',
 'vgg11_bn',
 'vgg13',
 'vgg13_bn',
 'vgg16',
 'vgg16_bn',
 'vgg19',
 'vgg19_bn',
 'vit_b_16',
 'vit_b_32',
 'vit_h_14',
 'vit_l_16',
 'vit_l_32',
 'wide_resnet101_2',
 'wide_resnet50_2']

In [9]:

# Load pre-trained model of choice.
model = torchvision.models.resnet18(pretrained=True)

Downloading: "https://download.pytorch.org/models/resnet18-f37072fd.pth" to /root/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth
100%|██████████| 44.7M/44.7M [00:00<00:00, 111MB/s]

3.3 Load explanations¶

To gather more insight into how the model made its prediction, we can apply different explanation methods. There exist multiple ways to generate explanations for neural network models e.g., using captum, zennit and tf-explain libraries.

Quantus is compatible with PyTorch, offering 20+ XAI methods for that ML framework.

In [10]:

import quantus

# View the XAI methods available for PyTorch users.
quantus.AVAILABLE_XAI_METHODS_CAPTUM

Out[10]:

['GradientShap',
 'IntegratedGradients',
 'DeepLift',
 'DeepLiftShap',
 'InputXGradient',
 'Saliency',
 'FeatureAblation',
 'Deconvolution',
 'FeaturePermutation',
 'Lime',
 'KernelShap',
 'LRP',
 'Gradient',
 'Occlusion',
 'LayerGradCam',
 'GuidedGradCam',
 'LayerConductance',
 'LayerActivation',
 'InternalInfluence',
 'LayerGradientXActivation',
 'Control Var. Sobel Filter',
 'Control Var. Constant',
 'Control Var. Random Uniform']

In this example, we rely on the quantus.explain functionality (a simple wrapper around captum) however use whatever approach or library you'd like to create your explanations.

In [11]:

#@title 3.3.1 Plotting single method
%%capture
def plot_explanation(input_sample: np.array,
                     pred_name: str,
                     explanation: np.array,
                     img_size: int = 28,
                     normalise: bool = True,
                     denormalise: bool = False,
                     method: str = "Gradient"):
    """Plot an explanation for an input sample."""
    fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(8, 5))

    if denormalise:
        # ImageNet settings.
        image = (np.moveaxis(quantus.denormalise(input_sample,
                                             mean=np.array([0.485, 0.456, 0.406]),
                                             std=np.array([0.229, 0.224, 0.225])), 0, -1) * 255).astype(np.uint8)
    else:
        image = np.moveaxis(input_sample, 0, 2)

    if normalise:
        explanation = quantus.normalise_by_max(explanation)

    axes[0].imshow(image, vmin=0.0, vmax=1.0)
    axes[0].title.set_text(f"Class {pred_name}")
    axes[0].axis("off");
    attr_ = axes[1].imshow(explanation.reshape(img_size, img_size), cmap="seismic")
    fig.colorbar(attr_, fraction=0.05, pad=0.05);
    axes[1].title.set_text(method)
    axes[1].axis("off")

    plt.show()

def get_pred_name(y_pred_id: int, dataset: str = "imagenet"):
    if dataset == "imagenet":
        return CLASSES[y_pred_id]
    return medmnist.INFO[DATA_FLAG]['label'][str(y_pred_id)]

In [13]:

# Get base explanations.
a_grad = quantus.explain(model, x_batch, y_batch)

# Plot a random explanation sample!
index = np.random.randint(0, len(x_batch)-1)
input_sample = x_batch[index]
pred_name = get_pred_name(y_batch[index])
explanation = a_grad[index]
plot_explanation(input_sample, pred_name, explanation, img_size=224, denormalise=True, normalise=False)

In [14]:

# Inspect documentation for the explanation method.
quantus.explain

Out[14]:

quantus.functions.explanation_func.explain
def explain(model, inputs, targets, **kwargs) -> np.ndarray

/usr/local/lib/python3.10/dist-packages/quantus/functions/explanation_func.pyExplain inputs given a model, targets and an explanation method.
Expecting inputs to be shaped such as (batch_size, nr_channels, ...) or (batch_size, ..., nr_channels).

Parameters
----------
model: torch.nn.Module, tf.keras.Model
A model that is used for explanation.
inputs: np.ndarray
The inputs that ought to be explained.
targets: np.ndarray
The target lables that should be used in the explanation.
kwargs: optional
Keyword arguments. Pass as "explain_func_kwargs" dictionary when working with a metric class.
Pass as regular kwargs when using the stnad-alone function.

xai_lib: string, optional
XAI library: captum, tf-explain or zennit.
method: string, optional
XAI method (used with captum and tf-explain libraries).
attributor: string, optional
XAI method (used with zennit).
xai_lib_kwargs: dictionary, optional
Keyword arguments to be passed to the attribution function.
softmax: boolean, optional
Indicated whether softmax activation in the last layer shall be removed.
channel_first: boolean, optional
Indicates if the image dimensions are channel first, or channel last.
Inferred from the input shape if None.
reduce_axes: tuple
Indicates the indices of dimensions of the output explanation array to be summed. For example, an input
array of shape (8, 28, 28, 3) with keepdims=True and reduce_axes = (-1,) will return an array of shape
(8, 28, 28, -1). Passing "()" will keep the original dimensions.
keepdims: boolean
Indicated if the reduced axes shall be preserved (True) or removed (False).

Returns
-------
explanation: np.ndarray
Returns np.ndarray of same shape as inputs.

In [15]:

# Get Saliency explanations, specify via 'method'.
kwargs = {"method": "Saliency", "xai_lib": "captum"}
a_sal = quantus.explain(model, x_batch, y_batch, **kwargs)

a_sal.shape

Out[15]:

(15, 1, 224, 224)

In [16]:

# Prepare dictionary with explanation methods and hyperparameters.
xai_methods_with_kwargs ={
    #"Occlusion": {"window": (1, 28, 28)},
    "LayerGradCam": {"gc_layer": "list(model.named_modules())[61][1]",  "interpolate": (224, 224),},
    "Saliency": {},
    "GradientShap": {},
    "IntegratedGradients": {"n_steps": 5},
}
NORMALISE = False

# Populate explanation in the dictionary.
explanations = {}
for method, kwargs in xai_methods_with_kwargs.items():

    a_batch = quantus.explain(model=model,
                              inputs=x_batch,
                              targets=y_batch,
                              **{**{"method": method, "xai_lib": "captum"}, **kwargs})


    # Normalise for GradCAM.
    if NORMALISE:
        if a_batch.min() == 0:
            explanations[method] = a_batch/a_batch.max(axis=(1,2), keepdims=True)
        else:
            # If not normalised, normalize by hand to comparable values [0,1].
            explanations[method] = np.abs((a_batch - a_batch.min(axis=(1,2), keepdims=True))/(a_batch.max(axis=(1,2), keepdims=True) -a_batch.min(axis=(1,2), keepdims=True)))

    else:
        explanations[method] = a_batch

    print(f"{method} - {a_batch.shape}")

LayerGradCam - (15, 1, 224, 224)
Saliency - (15, 1, 224, 224)
GradientShap - (15, 1, 224, 224)
IntegratedGradients - (15, 1, 224, 224)

In [17]:

# Or define your own XAI method.

def your_own_random_explainer(model: torch.nn,
                              inputs: np.array,
                              targets: np.array,
                              **kwargs):
    # Dummy explanation.
    size =  kwargs.get("size", (15, 1, 224, 224))
    a_batch = np.random.random(size=size)
    return a_batch

In [18]:

#@title 3.3.2 Write own FusionGrad explainer
%%capture
import copy
import gc
import numpy as np
import torch
import quantus
from captum.attr import Saliency

import torch

def save_model_state(model):
    return {k: v.clone() for k, v in model.state_dict().items()}

def restore_model_state(model, state):
    model.load_state_dict(state)

def fusiongrad_explainer(model, inputs, targets, **kwargs) -> np.ndarray:
    """Implementation of FusionGrad by Bykov et al., 2022."""

    original_state = save_model_state(model)

    # PyTorch and general processing.
    device = kwargs.get("device", "cpu")
    img_size = kwargs.get("img_size", 224)
    nr_channels = kwargs.get("nr_channels", 3)

    # Post-processng attribution.
    abs = kwargs.get("abs", False)
    normalise = kwargs.get("normalise", False)
    normalise_func = kwargs.get("normalise_func", quantus.normalise_by_negative)

    # FusionGrad specific.
    posterior_mean = kwargs.get("posterior_mean", copy.deepcopy(model.to(device).state_dict()))
    mean, std = kwargs.get("mean, std", (1.0, 0.75))
    sg_mean, sg_std = kwargs.get("sg_mean, sg_std", (0.0, 0.25))
    n, m = kwargs.get("n, m", (10, 10))

    # Save the posterior mean, copy of the mode.
    #posterior_mean = copy.deepcopy(model.to(device).state_dict())
    original_parameters = model.state_dict()

    def _sample(model, original_parameters, std, distribution=None, noise_type="multiplicative"):
        """Implementation to sample a model."""

        # Creates a normal (also called Gaussian) distribution.
        distribution = torch.distributions.normal.Normal(loc=torch.as_tensor(mean, dtype=torch.float),
                                                     scale=torch.as_tensor(std, dtype=torch.float))
        # Load model params.
        model_copy = copy.deepcopy(model)
        model_copy.load_state_dict(original_parameters)

        # If std is not zero, loop over each layer and add Gaussian noise.
        if not std == 0.0:
            with torch.no_grad():
                for layer in model_copy.parameters():
                    if noise_type == "additive":
                        layer.add_(distribution.sample(layer.size()).to(layer.device))
                    elif noise_type == "multiplicative":
                        layer.mul_(distribution.sample(layer.size()).to(layer.device))
                    else:
                        print("Set NoiseGrad attribute 'noise_type' to either 'additive' or 'multiplicative' (str).")

        return model_copy

    # Set model in evaluate mode.
    model.to(device)
    model.eval()

    if not isinstance(inputs, torch.Tensor):
        inputs = (torch.Tensor(inputs).reshape(-1, nr_channels, img_size, img_size,).to(device))
    if not isinstance(targets, torch.Tensor):
        targets = torch.as_tensor(targets).long().to(device)

    assert (
        len(np.shape(inputs)) == 4
    ), "Inputs should be shaped (nr_samples, nr_channels, img_size, img_size) e.g., (1, 3, 224, 224)."

    if inputs.shape[0] > 1:
        attrs = torch.zeros((m, n, inputs.shape[0], img_size, img_size,))
    else:
        attrs = torch.zeros((m, n, img_size, img_size))

    for i in range(m):

        # Sample a model.
        model_copy = _sample(model=model, original_parameters=original_parameters, std=std, noise_type="multiplicative")

        for j in range(n):
            # Add noise to the inputs.
            inputs_noisy = inputs + torch.randn_like(inputs) * sg_std + sg_mean

            # Compute the Saliency explanation.
            attrs[i][j] = Saliency(model_copy).attribute(inputs_noisy, targets, abs=abs).sum(axis=1)
            attrs[i][j] = attrs[i][j].reshape(-1, img_size, img_size).cpu().data

    # Average over the samples.
    attrs = attrs.mean(axis=(0, 1))
    attrs = torch.unsqueeze(attrs, 1)

    gc.collect()
    torch.cuda.empty_cache()

    # Add normalisation.
    if normalise:
        attrs = normalise_func(attrs)

    if isinstance(attrs, torch.Tensor):
        if attrs.requires_grad:
            return attrs.cpu().detach().numpy()
        return attrs.cpu().numpy()

    restore_model_state(model, original_state)

    return attrs

In [19]:

# Compare with your own explainer function.
explanations["FusionGrad"] = fusiongrad_explainer(model=model.cpu(), inputs=x_batch, targets=y_batch, **{"m, n": (10, 10),
                                                                                                         "sg_mean, sg_std": (0.0, 0.25),
                                                                                                         "mean, std": (1.0, 0.8),
                                                                                                         "posterior_mean": copy.deepcopy(model.to(device).state_dict())})
explanations["Random"] = your_own_random_explainer(model=model, inputs=x_batch, targets=y_batch, **{"size": (len(x_batch), 1, 224, 224)})

In [20]:

#@title 3.3.3 Plotting multiple methods
%%capture

def plot_explanation_methods(explanations: dict,
                             x_batch: np.array,
                             y_batch: np.array,
                             method_names: list,
                             colours: list,
                             indices: list = [1, 10, 6],
                             img_size: int = 28) -> None:

    # Plotting configs.
    plt.rcParams['ytick.left'] = False
    plt.rcParams['ytick.labelleft'] = False
    plt.rcParams['xtick.bottom'] = False
    plt.rcParams['xtick.labelbottom'] = False

    # Plot explanations!
    index = 1
    ncols = 1 + len(explanations)

    for index in indices:
        fig, axes = plt.subplots(nrows=1, ncols=ncols, figsize=(15, int(ncols)*3))

        for i in range(ncols):

            if i == 0:

                pred_name = get_pred_name(y_pred_id=y_batch[index])

                if len(pred_name) > 30:
                    pred_name = pred_name[:15] + "\n" + pred_name[15:30] + "\n" + pred_name[30:]
                elif len(pred_name) > 15:
                    pred_name = pred_name[:15] + "\n" + pred_name[15:]

                with warnings.catch_warnings():
                    image = (np.moveaxis(quantus.denormalise(x_batch[index],
                                                             mean=np.array([0.485, 0.456, 0.406]),
                                                             std=np.array([0.229, 0.224, 0.225])), 0, -1) * 255).astype(np.uint8)
                    axes[0].imshow(image, vmin=0.0, vmax=1.0)
                    axes[0].set_title(f"{pred_name.title()}", fontsize=12)
                    axes[0].axis("off")
            else:

                axes[i].imshow(explanations[method_names[i-1]][index].reshape(img_size, img_size), cmap="seismic", vmin=-1.0, vmax=1.0)
                axes[i].set_title(f"{method_names[i-1]}", fontsize=12)

                # Frame configs.
                axes[i].xaxis.set_visible([])
                axes[i].yaxis.set_visible([])
                axes[i].spines["top"].set_color(colours[i-1])
                axes[i].spines["bottom"].set_color(colours[i-1])
                axes[i].spines["left"].set_color(colours[i-1])
                axes[i].spines["right"].set_color(colours[i-1])
                axes[i].spines["top"].set_linewidth(5)
                axes[i].spines["bottom"].set_linewidth(5)
                axes[i].spines["left"].set_linewidth(5)
                axes[i].spines["right"].set_linewidth(5)

        plt.show()

In [21]:

# Plot explanation methods!
method_names = list(explanations.keys())
colours = random_hex_codes = ['#' + ''.join(np.random.choice(list('0123456789ABCDEF'), size=6)) for _ in range(len(explanations))]
plot_explanation_methods(explanations, x_batch, y_batch, method_names, colours, img_size=224)

Clearly, there is an unintelligibility of visual comparision.

4) Introduction to Quantus¶

Quantus is a XAI Python open-source framework to evaluate the performance of neural network explanations.

More details can found in at the offical GitHub repository, Getting Started Guide or API documentation.

In [22]:

import quantus

We answer the following research question:

RQ: Without ground truth (or medical expertise), what explanation method to choose to help understand the black box model?

4.1 XAI metrics¶

In the following section, we demonstrate how to use Quantus to evaluate the different explanation methods under various explanation qualities — and their underlying metrics. In the following, we describe each of the categories briefly. The direction of the arrow indicates whether higher or lower values are considered better (exceptions within each category exist, so please carefully read the docstrings of each individual metric prior to usage and/or interpretation). For more complete description of the different properties, please see the official Github repository.

Faithfulness (↑) quantifies to what extent explanations follow the predictive behaviour of the model, asserting that more important features affect model decisions more strongly e.g., (Bach et al., 2015), (Rong, Leemann, et al., 2022) and (Dasgupta et al., 2022).

$$\Psi_\text{F}(\Phi, f, \mathbf{x}, \mathcal{P}, M) = |(f(x) -f(\mathcal{P}_{\mathbb{X}}(\mathbf{x}, M))|.$$

Robustness (↓) measures to what extent explanations are stable when subject to slight perturbations in the input, assuming that the model output approximately stayed the same e.g., (Alvarez-Melis et al., 2018), (Yeh et al., 2019) and ) (Agarwal, et. al., 2022).

$$\Psi_\text{RO}(\Phi, f, \mathbf{x}, \hat{y}, \mathcal{P}) = || \hat{\mathbf{e}}-\Phi(\mathcal{P}_{\mathbb{X}}(\mathbf{x}), f, \hat{y}; \lambda)|| \le \varepsilon.$$

Randomisation (↑, ↓) tests to what extent explanations deteriorate as the data labels or the model, e.g., its parameters are increasingly randomised (Adebayo et. al., 2018), (Sixt et al., 2020) and (Hedström et al., 2020).

$$\Psi_\text{RA}(\Phi, f, \mathbf{x}, \hat{y}, \mathcal{P}, \varepsilon) = || \hat{\mathbf{e}} - \Phi(\mathbf{x}, \mathcal{P}_{\mathbb{F}}(\theta), \hat{y}; \lambda) || \gg \varepsilon.$$

Localisation (↑) tests if the explainable evidence is centred around a region of interest, which may be defined around an object by a bounding box, a segmentation mask or a cell within a grid e.g., (Zhang et al., 2018), (Arras et al., 2021) and (Arias et al., 2022).

$$\Psi_\text{L}( \hat{\mathbf{e}}, \mathbf{s}^{gt}) = \frac{ \hat{\mathbf{e}}\cap \mathbf{s}^{gt}}{ \hat{\mathbf{e}}\cup \mathbf{s}^{gt}}$$

Complexity (↓) captures to what extent explanations are concise, i.e., that few features are used to explain a model prediction e.g., (Chalasani et al., 2020) and (Bhatt et al., 2020).

$$\Psi_\text{C} = \mathbb{E}_i\left[-\ln \left(\mathbb{P}_{\Phi}\right)\right]=-\sum_{i=1}^D \mathbb{P}_{\Phi}(i) \ln \left(\mathbb{P}_{\Phi}(i)\right) \\ \text{with} \quad \mathbb{P}_{\Phi}(i)=\frac{\left|\Phi_i(\mathbf{x}, f, \hat{y} ;\lambda)\right|}{\sum_{j \in[d]}\left|\Phi_j(\mathbf{x}, f, \hat{y} ;\lambda)\right|} ; \mathbb{P}_{{\Phi}}=\left\{\mathbb{P}_{{\Phi}}(1), \ldots, \mathbb{P}_{{\Phi}}(d)\right\}, $$

In [23]:

# In each category of explanation quality, let's view the available metrics.
for k, v in quantus.AVAILABLE_METRICS.items():
    print(k)
    for i in v:
        print(f"\t• {i}")

Faithfulness
	• Faithfulness Correlation
	• Faithfulness Estimate
	• Pixel-Flipping
	• Region Segmentation
	• Monotonicity-Arya
	• Monotonicity-Nguyen
	• Selectivity
	• SensitivityN
	• IROF
	• ROAD
	• Infidelity
	• Sufficiency
Robustness
	• Continuity Test
	• Local Lipschitz Estimate
	• Max-Sensitivity
	• Avg-Sensitivity
	• Consistency
	• Relative Input Stability
	• Relative Output Stability
	• Relative Representation Stability
Localisation
	• Pointing Game
	• Top-K Intersection
	• Relevance Mass Accuracy
	• Relevance Rank Accuracy
	• Attribution Localisation 
	• AUC
	• Focus
Complexity
	• Sparseness
	• Complexity
	• Effective Complexity
Randomisation
	• MPRT
	• Smooth MPRT
	• Efficient MPRT
	• Random Logit
Axiomatic
	• Completeness
	• NonSensitivity
	• InputInvariance

4.1.1 Measure complexity¶

We select one single metric within the complexity category of explanation quality. Sparseness (Chalasani et al., 2020) is quantified using the Gini Index applied to the vector of the absolute values of attributions $\hat{\mathbf{e}}$ of length $D$.

$$\Psi_\text{SP}(\hat{\mathbf{e}}) = \frac{\sum_{i=1}^{D}(2 i-D-1) \hat{\mathbf{e}}_{i}}{D \sum_{i=1}^{D} \hat{\mathbf{e}}_{i}}$$

In [25]:

# Let's try initialising one Complexity metric, called Sparseness.
quantus.PixelFlipping().get_params

Out[25]:

{'abs': False,
 'normalise': True,
 'return_aggregate': False,
 'aggregate_func': <function mean at 0x7870f2d29e30>,
 'normalise_func': <function quantus.functions.normalise_func.normalise_by_max(a: numpy.ndarray, normalise_axes: Optional[Sequence[int]] = None) -> numpy.ndarray>,
 'normalise_func_kwargs': {},
 '_disable_warnings': False,
 '_display_progressbar': False,
 'a_axes': None,
 'features_in_step': 1,
 'return_auc_per_sample': False,
 'perturb_func': functools.partial(<function baseline_replacement_by_indices at 0x786f805155a0>, perturb_baseline='black')}

In [24]:

# Let's try initialising one Complexity metric, called Sparseness.
quantus.Sparseness().get_params

Out[24]:

{'abs': True,
 'normalise': True,
 'return_aggregate': False,
 'aggregate_func': <function mean at 0x7870f2d29e30>,
 'normalise_func': <function quantus.functions.normalise_func.normalise_by_max(a: numpy.ndarray, normalise_axes: Optional[Sequence[int]] = None) -> numpy.ndarray>,
 'normalise_func_kwargs': {},
 '_disable_warnings': False,
 '_display_progressbar': False,
 'a_axes': None}

As a starter, we evaluate Saliency explanation (Mørch et al., 1995; Baehrens et al., 2010)

In [26]:

# Alternative 1. Evaluate the Saliency explanations in a one-liner - by calling the intialised metric.
quantus.Sparseness()(model=model, x_batch=x_batch, y_batch=y_batch, a_batch=explanations["Saliency"])

Out[26]:

[0.4388029590304715,
 0.47652868406704785,
 0.41440676591897163,
 0.46697679949295834,
 0.4414132058788879,
 0.43754174893746944,
 0.45429935647536707,
 0.49777907790547427,
 0.5283665277255211,
 0.4428323014199678,
 0.4146280311647153,
 0.5192187669945634,
 0.49564300585363763,
 0.4010246933427686,
 0.45522917726154716]

In [27]:

# Change some hyperparameters, get an aggregate score over several test samples.
quantus.Sparseness(return_aggregate=True, disable_warnings=True)(model=model, x_batch=x_batch, y_batch=y_batch, a_batch=explanations["Saliency"])

Out[27]:

[0.4589794067646247]

We evaluate GradientShap explanation (Lundberg and Lee, 2017).

In [28]:

# Change the explanation method to evaluate Sparseness on GradientShap.
quantus.Sparseness(return_aggregate=True, disable_warnings=True)(model=model, x_batch=x_batch, y_batch=y_batch, a_batch=explanations["GradientShap"])

Out[28]:

[0.595142746343291]

We evaluate FusionGrad FusionGrad (Bykov et al., 2021 explanation.

In [29]:

# Change the explanation method to evaluate Sparseness on FusionGrad.
quantus.Sparseness(return_aggregate=True, disable_warnings=True)(model=model, x_batch=x_batch, y_batch=y_batch, a_batch=explanations["FusionGrad"])

Out[29]:

[0.48891676021172176]

In [30]:

# Score all methods iteratively.
for method, attr in explanations.items():
    metric = quantus.Sparseness(return_aggregate=False, disable_warnings=True)
    scores = metric(model=model, x_batch=x_batch, y_batch=y_batch, a_batch=attr)
    print(f" {method} - {np.mean(scores):.2f} ({np.std(scores):.2f})")

 LayerGradCam - 0.31 (0.06)
 Saliency - 0.46 (0.04)
 GradientShap - 0.60 (0.04)
 IntegratedGradients - 0.59 (0.04)
 FusionGrad - 0.49 (0.01)
 Random - 0.33 (0.00)

To structure the analysis a bit futher, you can leverage the built-in functionality of quantus.evaluate().

4.2 Large-scale evaluation¶

In the following, we use Quantus to quantiatively assess the different explanation methods on various evaluation criteria.

In [31]:

# Initialise the Quantus evaluation metrics.
metrics = {
    "Robustness": quantus.MaxSensitivity(
        nr_samples=10,
        lower_bound=0.2,
        norm_numerator=quantus.norm_func.fro_norm,
        norm_denominator=quantus.norm_func.fro_norm,
        perturb_func=quantus.perturb_func.uniform_noise,
        similarity_func=quantus.similarity_func.difference,
        abs=False,
        normalise=True,
        normalise_func=quantus.normalise_by_max,
        aggregate_func=np.mean,
        return_aggregate=True,
        disable_warnings=True,
    ),
    "Faithfulness": quantus.FaithfulnessCorrelation(
        nr_runs=10,
        subset_size=224,
        perturb_baseline="black",
        perturb_func=quantus.baseline_replacement_by_indices,
        similarity_func=quantus.similarity_func.correlation_pearson,
        abs=True,
        normalise=True,
        normalise_func=quantus.normalise_by_max,
        aggregate_func=np.mean,
        return_aggregate=True,
        disable_warnings=True,
    ),
    "Localisation": quantus.RelevanceRankAccuracy(
        abs=False,
        normalise=True,
        normalise_func=quantus.normalise_by_max,
        aggregate_func=np.mean,
        return_aggregate=True,
        disable_warnings=True,
    ),
    "Complexity": quantus.Sparseness(
        abs=True,
        normalise=True,
        normalise_func=quantus.normalise_by_max,
        aggregate_func=np.mean,
        return_aggregate=True,
        disable_warnings=True,
    ),
    "Sensitivity": quantus.EfficientMPRT(
        similarity_func=quantus.similarity_func.correlation_spearman,
        abs=False,
        normalise=True,
        normalise_func=quantus.normalise_by_max,
        aggregate_func=np.mean,
        return_aggregate=True,
        disable_warnings=True,
    ),
}

In [32]:

quantus.evaluate

Out[32]:

quantus.evaluation.evaluate
def evaluate(metrics: Dict, xai_methods: Union[Dict[str, Callable], Dict[str, Dict], Dict[str, np.ndarray]], model: ModelInterface, x_batch: np.ndarray, y_batch: np.ndarray, s_batch: Union[np.ndarray, None]=None, agg_func: Callable=lambda x: x, explain_func_kwargs: Optional[dict]=None, call_kwargs: Union[Dict, Dict[str, Dict]]=None, return_as_df: Optional[bool]=None, verbose: Optional[bool]=None, progress: Optional[bool]=None, *args, **kwargs) -> Optional[dict]

/usr/local/lib/python3.10/dist-packages/quantus/evaluation.pyEvaluate different explanation methods using specified metrics.

Parameters
----------
metrics : dict
    A dictionary of initialized evaluation metrics. See quantus.AVAILABLE_METRICS.
    Example: {'Robustness': quantus.MaxSensitivity(), 'Faithfulness': quantus.PixelFlipping()}

xai_methods : dict
    A dictionary specifying the explanation methods to evaluate, which can be structured in three ways:

    1) Dict[str, Dict] for built-in Quantus methods (using quantus.explain):

        Example:
        xai_methods = {
            'IntegratedGradients': {
                'n_steps': 10,
                'xai_lib': 'captum'
            },
            'Saliency': {
                'xai_lib': 'captum'
            }
        }

        - See quantus.AVAILABLE_XAI_METHODS_CAPTUM for supported captum methods.
        - See quantus.AVAILABLE_XAI_METHODS_TF for supported tensorflow methods.
        - See https://github.com/chr5tphr/zennit for supported zennit methods.
        - Read more about the explanation function arguments here:
          <https://quantus.readthedocs.io/en/latest/docs_api/quantus.functions.explanation_func.html#quantus.functions.explanation_func.explain>

    2) Dict[str, Callable] for custom methods:

        Example:
        xai_methods = {
            'custom_own_xai_method': custom_explain_function
        }
        or
        ai_methods = {"InputXGradient": {
    "explain_func": quantus.explain,
    "explain_func_kwargs": {},
}}

        - Here, you can provide your own callable that mirrors the input and outputs of the quantus.explain() method.

    3) Dict[str, np.ndarray] for pre-calculated attributions:

        Example:
        xai_methods = {
            'LIME': precomputed_numpy_lime_attributions,
            'GradientShap': precomputed_numpy_shap_attributions
        }

        - Note that some Quantus metrics, e.g., quantus.MaxSensitivity() within the robustness
        category, includes "re-explaning" the input and output pair as a part of the evaluation metric logic.
        If you include such metrics in the quantus.evaluate(), this option will not be possible.

    It is also possible to pass a combination of the above.

    >>> xai_methods = {
    >>>     'IntegratedGradients': {
    >>>         'n_steps': 10,
    >>>         'xai_lib': 'captum'
    >>>     },
    >>>     'Saliency': {
    >>>         'xai_lib': 'captum'
    >>>     },
    >>>     'custom_own_xai_method': custom_explain_function,
    >>>     'LIME': precomputed_numpy_lime_attributions,
    >>>     'GradientShap': precomputed_numpy_shap_attributions
    >>> }

model: Union[torch.nn.Module, tf.keras.Model]
    A torch or tensorflow model that is subject to explanation.

x_batch: np.ndarray
    A np.ndarray containing the input data to be explained.

y_batch: np.ndarray
    A np.ndarray containing the output labels corresponding to x_batch.

s_batch: np.ndarray, optional
    A np.ndarray containing segmentation masks that match the input.

agg_func: Callable
    Indicates how to aggregate scores, e.g., pass np.mean.

explain_func_kwargs: dict, optional
    Keyword arguments to be passed to explain_func on call. Pass None if using Dict[str, Dict] type for xai_methods.

call_kwargs: Dict[str, Dict]
    Keyword arguments for the call of the metrics. Keys are names for argument sets, and values are argument dictionaries.

verbose: optional, bool
    Indicates whether to print evaluation progress.

progress: optional, bool
    Deprecated. Indicates whether to print evaluation progress. Use verbose instead.

return_as_df: optional, bool
    Indicates whether to return the results as a pd.DataFrame. Only works if call_kwargs is not passed.

args: optional
    Deprecated arguments for the call.

kwargs: optional
    Deprecated keyword arguments for the call of the metrics.

Returns
-------
results: dict
    A dictionary with the evaluation results.

In [33]:

# Re-define XAI methods to score.
xai_methods_with_kwargs ={
    #"Occlusion": {"window": (1, 28, 28)},
    "LayerGradCam": {"gc_layer": "list(model.named_modules())[61][1]",  "interpolate": (224, 224),},
    "Saliency": {},
    "GradientShap": {},
    "IntegratedGradients": {"n_steps": 5},
    "FusionGrad": fusiongrad_explainer,
    "Random": your_own_random_explainer,
}

In [34]:

# Run full quantification analysis!
load_obj = True

if not load_obj:
    # Evaluate XAI methods.
    results = quantus.evaluate(metrics=metrics,
                               xai_methods=xai_methods_with_kwargs,
                               model=model.cpu(),
                               x_batch=x_batch,
                               y_batch=y_batch,
                               s_batch=s_batch,
                               agg_func=np.mean,
                               explain_func_kwargs=None,
                               call_kwargs=None,
                               return_as_df=False,
                               verbose=True)
else:
    # Retrieve stored data.
    results = {
    "LayerGradCam": {
        "Robustness": 6.385364405314127,
        "Faithfulness": 0.012830701450853242,
        "Localisation": 0.5945537119866623,
        "Complexity": 0.534340942620482,
        "Sensitivity": -0.011923145576686991,
    },
    "Saliency": {
        "Robustness": 0.8709124883015951,
        "Faithfulness": 0.11204023012353007,
        "Localisation": 0.5945537119866623,
        "Complexity": 0.534340942620482,
        "Sensitivity": 0.10401733999005232,
    },
    "GradientShap": {
        "Robustness": 1.8177134116490683,
        "Faithfulness": 0.12337100760315674,
        "Localisation": 0.5945537119866623,
        "Complexity": 0.534340942620482,
        "Sensitivity": 0.03112477971924486,
    },
    "IntegratedGradients": {
        "Robustness": 1.1781942049662273,
        "Faithfulness": 0.1820758061666603,
        "Localisation": 0.5945537119866623,
        "Complexity": 0.534340942620482,
        "Sensitivity": 0.017489182288717948,
    },
    "FusionGrad": {
        "Robustness": 1.395110293229421,
        "Faithfulness": 0.1478673954202166,
        "Localisation": 0.5614189583462068,
        "Complexity": 0.4727508018004431,
        "Sensitivity": -0.005249520387875365,
    },
    "Random": {
        "Robustness": 0.7098723664216617,
        "Faithfulness": 0.13276692371798132,
        "Localisation": 0.5101552561594804,
        "Complexity": 0.3331452921782692,
        "Sensitivity": -4.37913867605053e-06,
    },
}

results

Out[34]:

{'LayerGradCam': {'Robustness': 6.385364405314127,
  'Faithfulness': 0.012830701450853242,
  'Localisation': 0.5945537119866623,
  'Complexity': 0.534340942620482,
  'Sensitivity': -0.011923145576686991},
 'Saliency': {'Robustness': 0.8709124883015951,
  'Faithfulness': 0.11204023012353007,
  'Localisation': 0.5945537119866623,
  'Complexity': 0.534340942620482,
  'Sensitivity': 0.10401733999005232},
 'GradientShap': {'Robustness': 1.8177134116490683,
  'Faithfulness': 0.12337100760315674,
  'Localisation': 0.5945537119866623,
  'Complexity': 0.534340942620482,
  'Sensitivity': 0.03112477971924486},
 'IntegratedGradients': {'Robustness': 1.1781942049662273,
  'Faithfulness': 0.1820758061666603,
  'Localisation': 0.5945537119866623,
  'Complexity': 0.534340942620482,
  'Sensitivity': 0.017489182288717948},
 'FusionGrad': {'Robustness': 1.395110293229421,
  'Faithfulness': 0.1478673954202166,
  'Localisation': 0.5614189583462068,
  'Complexity': 0.4727508018004431,
  'Sensitivity': -0.005249520387875365},
 'Random': {'Robustness': 0.7098723664216617,
  'Faithfulness': 0.13276692371798132,
  'Localisation': 0.5101552561594804,
  'Complexity': 0.3331452921782692,
  'Sensitivity': -4.37913867605053e-06}}

Postprocessing of scores, computing how the different explanation methods rank across criteria.

In [35]:

# Postprocessing of scores: to get how the different explanation methods rank across criteria.
results_agg = {}
for method in xai_methods_with_kwargs:
    results_agg[method] = {}
    for metric, metric_func in metrics.items():
        results_agg[method][metric] = np.mean(results[method][metric])

df = pd.DataFrame.from_dict(results_agg)
df = df.T.abs()
df

Out[35]:

	Robustness	Faithfulness	Localisation	Complexity	Sensitivity
LayerGradCam	6.385364	0.012831	0.594554	0.534341	0.011923
Saliency	0.870912	0.112040	0.594554	0.534341	0.104017
GradientShap	1.817713	0.123371	0.594554	0.534341	0.031125
IntegratedGradients	1.178194	0.182076	0.594554	0.534341	0.017489
FusionGrad	1.395110	0.147867	0.561419	0.472751	0.005250
Random	0.709872	0.132767	0.510155	0.333145	0.000004

To compare the different XAI methods, we normalise the metric scores between $[0, 1]$ and rank the scores from lowest to highest (i.e. the highest rank corresponds to best performance).

In [41]:

# Take inverse ranking for Robustness, since lower is better.
df_normalised = df.loc[:, ~df.columns.isin(['Robustness'])].apply(lambda x: x / x.max())
df_normalised["Robustness"] = df["Robustness"].min()/df["Robustness"].values
df_normalised_rank = df_normalised.rank()
df_normalised_rank

Out[41]:

	Faithfulness	Localisation	Complexity	Sensitivity	Robustness
LayerGradCam	1.0	4.5	4.5	3.0	1.0
Saliency	2.0	4.5	4.5	6.0	5.0
GradientShap	3.0	4.5	4.5	5.0	2.0
IntegratedGradients	6.0	4.5	4.5	4.0	4.0
FusionGrad	5.0	2.0	2.0	2.0	3.0
Random	4.0	1.0	1.0	1.0	6.0

In [38]:

#@title 4.2.1 Create spyder plot
%%capture

# Plotting specifics.
from matplotlib.patches import Circle, RegularPolygon
from matplotlib.path import Path
from matplotlib.projections.polar import PolarAxes
from matplotlib.projections import register_projection
from matplotlib.spines import Spine
from matplotlib.transforms import Affine2D

# Plotting configs.
sns.set(font_scale=1.5)
plt.style.use('seaborn-white')
plt.rcParams['ytick.labelleft'] = True
plt.rcParams['xtick.labelbottom'] = True

include_titles = True
include_legend = True

# Source code: https://matplotlib.org/stable/gallery/specialty_plots/radar_chart.html.

def spyder_plot(num_vars, frame='circle'):
    """Create a radar chart with `num_vars` axes.

    This function creates a RadarAxes projection and registers it.

    Parameters
    ----------
    num_vars : int
        Number of variables for radar chart.
    frame : {'circle' | 'polygon'}
        Shape of frame surrounding axes.
    """
    # calculate evenly-spaced axis angles
    theta = np.linspace(0, 2*np.pi, num_vars, endpoint=False)

    class RadarAxes(PolarAxes):

        name = 'radar'

        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            # rotate plot such that the first axis is at the top
            self.set_theta_zero_location('N')

        def fill(self, *args, closed=True, **kwargs):
            """Override fill so that line is closed by default."""
            return super().fill(closed=closed, *args, **kwargs)

        def plot(self, *args, **kwargs):
            """Override plot so that line is closed by default."""
            lines = super().plot(*args, **kwargs)
            for line in lines:
                self._close_line(line)

        def _close_line(self, line):
            x, y = line.get_data()
            # FIXME: markers at x[0], y[0] get doubled-up
            if x[0] != x[-1]:
                x = np.concatenate((x, [x[0]]))
                y = np.concatenate((y, [y[0]]))
                line.set_data(x, y)

        def set_varlabels(self, labels, angles=None):
            self.set_thetagrids(angles=np.degrees(theta), labels=labels)

        def _gen_axes_patch(self):
            # The Axes patch must be centered at (0.5, 0.5) and of radius 0.5
            # in axes coordinates.
            if frame == 'circle':
                return Circle((0.5, 0.5), 0.5)
            elif frame == 'polygon':
                return RegularPolygon((0.5, 0.5), num_vars,
                                      radius=.5, edgecolor="k")
            else:
                raise ValueError("unknown value for 'frame': %s" % frame)

        def draw(self, renderer):
            """ Draw. If frame is polygon, make gridlines polygon-shaped."""
            if frame == 'polygon':
                gridlines = self.yaxis.get_gridlines()
                for gl in gridlines:
                    gl.get_path()._interpolation_steps = num_vars
            super().draw(renderer)


        def _gen_axes_spines(self):
            if frame == 'circle':
                return super()._gen_axes_spines()
            elif frame == 'polygon':
                # spine_type must be 'left'/'right'/'top'/'bottom'/'circle'.
                spine = Spine(axes=self,
                              spine_type='circle',
                              path=Path.unit_regular_polygon(num_vars))
                # unit_regular_polygon gives a polygon of radius 1 centered at
                # (0, 0) but we want a polygon of radius 0.5 centered at (0.5,
                # 0.5) in axes coordinates.
                spine.set_transform(Affine2D().scale(.5).translate(.5, .5)
                                    + self.transAxes)

                return {'polar': spine}
            else:
                raise ValueError("unknown value for 'frame': %s" % frame)

    register_projection(RadarAxes)

    return theta

XAI evaluation can help researchers establish appropriate explanation methods for a specific tasks. These performance measures can help validate network models and prediction as well as insights inferred from explanations.

In [44]:

# Make spyder graph!
data = [df_normalised_rank.columns.values, (df_normalised_rank.to_numpy())]
theta = spyder_plot(len(data[0]), frame='polygon')
spoke_labels = data.pop(0)

fig, ax = plt.subplots(figsize=(9, 9), subplot_kw=dict(projection='radar'))
fig.subplots_adjust(top=0.85, bottom=0.05)
for i, (d, method) in enumerate(zip(data[0], xai_methods_with_kwargs)):
    line = ax.plot(theta, d, label=method, color=colours[i], linewidth=5.0)
    ax.fill(theta, d, alpha=0.15)

# Set lables.
if include_titles:
    ax.set_varlabels(labels=['Faithfulness',
                             'Localisation',
                             'Complexity',
                             'Sensitivity',
                             'Robustness']) #
else:
    ax.set_varlabels(labels=[])

ax.set_rgrids(np.arange(0, df_normalised_rank.values.max() + 0.5), labels=[])

# Set a title.
ax.set_title("Quantus: Summary of Quantification",  position=(0.5, 1.1), ha='center', fontsize=15)

# Put a legend to the right of the current axis.
if include_legend:
    ax.legend(loc='upper left', bbox_to_anchor=(1, 0.5))

plt.tight_layout()

No clear winner. Many explanations score poorly in absolute terms.

Limitations. XAI evaluation faces certain limitations due to the absence of a reliable ground-truth, which means the evaluation metrics provided can only assess crucial properties that a valid explanation must possess, and cannot provide a complete validation. While the evaluation of XAI methods is a rapidly evolving field, the metrics offered by the Quantus library have certain limitations, such as relying on perturbing the input which may lead to the creation of out-of-distribution inputs. It should be noted that evaluating explanation methods using quantification analysis does not guarantee the theoretical soundness or statistical validity of the methods. Therefore, when using the Quantus library for XAI method selection, it is essential to supplement the results with theoretical considerations.

5) Introduction to MetaQuantus¶

MetaQuantus is an open-source, development tool for XAI researchers and Machine Learning (ML) practitioners to verify and benchmark newly constructed metrics (i.e., ``quality estimators''). It includes:

A series of pre-built tests such as ModelPerturbationTest and InputPertubrationTest that can be applied to various metrics
Supporting source code such as for plotting and analysis
Various tutorials e.g., Getting-Started-with-MetaQuantus and Reproduce-Paper-Experiments

More details can found in at the offical GitHub repository.

In [45]:

import metaquantus

We answer the following research question:

RQ: Without ground truth, how can we identify a reliable estimator of explanation quality?

In [46]:

# Re-load data back.
assets = np.load("assets/imagenet_test_set.npy", allow_pickle=True).item()
x_batch = assets["x_batch"][:nr_samples]
y_batch = assets["y_batch"][:nr_samples]
s_batch = assets["s_batch"].reshape(-1, 1, 224, 224)[:nr_samples]

# Load pre-trained model of choice.
model = torchvision.models.resnet18(pretrained=True)

5.1 Load estimators¶

In [47]:

# Define a set of estimators using Quantus, a broader set of Localisation metrics.

estimators_localisation = {
    "Localisation": {
            "Pointing-Game": {
                "init":
                quantus.PointingGame(
                    abs=False,
                    normalise=True,
                    normalise_func=quantus.normalise_func.normalise_by_max,
                    normalise_func_kwargs={},
                    return_aggregate=False,
                    aggregate_func=np.mean,
                    disable_warnings=True,
             ), "score_direction": "higher"},
             "Top-K Intersection": {
                 "init":
                     quantus.TopKIntersection(
                k=10,
                abs=False,
                normalise=True,
                normalise_func=quantus.normalise_func.normalise_by_max,
                normalise_func_kwargs={},
                return_aggregate=False,
                aggregate_func=np.mean,
                disable_warnings=True,
             ), "score_direction": "higher"},
             "Relevance Rank Accuracy": {
                 "init":
                     quantus.RelevanceRankAccuracy(
                abs=False,
                normalise=True,
                normalise_func=quantus.normalise_func.normalise_by_max,
                normalise_func_kwargs={},
                return_aggregate=False,
                aggregate_func=np.mean,
                disable_warnings=True,
             ), "score_direction": "higher"},
            "Relevance Mass Accuracy": {
                "init":
                quantus.RelevanceMassAccuracy(
                    abs=False,
                    normalise=True,
                    normalise_func=quantus.normalise_func.normalise_by_max,
                    normalise_func_kwargs={},
                    return_aggregate=False,
                    aggregate_func=np.mean,
                    disable_warnings=True,
                ),
                "score_direction": "higher",
            },
        }
    }

5.2 Define test suite¶

We define the Input Perturbation Test and Model Perturbation Test in order to evaluate the estimators.

In [49]:

# Define test suite.
test_suite = {
        "Model Resilience Test": metaquantus.ModelPerturbationTest(
            **{
                "noise_type": "multiplicative",
                "mean": 1.0,
                "std": 0.001,
                "type": "Resilience",
            }
        ),
        "Model Adversary Test": metaquantus.ModelPerturbationTest(
            **{
                "noise_type": "multiplicative",
                "mean": 1.0,
                "std": 0.5,
                "type": "Adversary",
            }
        ),
        "Input Resilience Test": metaquantus.InputPerturbationTest(
            **{
                "noise": 0.001,
                "type": "Resilience",
            }
        ),
        "Input Adversary Test": metaquantus.InputPerturbationTest(
            **{
                "noise": 3.0,
                "type": "Adversary",
            }
        ),
    }

5.3 Meta-evaluate methods¶

In [50]:

load_obj = True

if not load_obj:

    # Set configs.
    iters = 5
    K = 10

    # Define the meta-evaluation exercise.
    meta_evaluator = metaquantus.MetaEvaluation(
        test_suite=test_suite,
        xai_methods={"Saliency": {}, "GradientShap": {}},
        iterations=iters,
        nr_perturbations=K,
        write_to_file=False,
    )

    # Collect the settings for the dataset.
    dataset_settings = {}
    dataset_settings["ImageNet"] = {
        "x_batch": x_batch,
        "y_batch": y_batch,
        "s_batch": s_batch,
        "models": {
            "ResNet18": model,
        },
        "gc_layers": {
            "ResNet18": "list(model.named_modules())[61][1]",
        },
        "estimator_kwargs": {
            "num_classes": 1000,
            "img_size": 224,
            "features": 224 * 4,
            "percentage": 0.1,
            "nr_channels": 3,
            "patch_size": 224 * 2,
            "perturb_baseline": "uniform",
        },
    }

    # Benchmark localisation metrics, using the intialised meta-evaluator.
    benchmark = metaquantus.MetaEvaluationBenchmarking(
        master=meta_evaluator,
        estimators=estimators_localisation,
        experimental_settings=dataset_settings,
        write_to_file=False,
        keep_results=True,
        channel_first=True,
        softmax=False,
        device=device,
    )()

else:
    !gdown https://drive.google.com/drive/folders/1uOjuMtbNkvLPqXrbozVjsDNeCxZ6peqs --folder --quiet
    !ls

# Load the benchmarking data for each model.
dataset_name = "ImageNet"
f_loc = [f.split("transformer_data/")[1] for f in glob.glob("transformer_data/*") if "localisation" in f]
batches = len(f_loc)

benchmarks_loc = {}
for model in ["ResNet18"]:
    benchmarks_loc[model] = {}
    for batch, f in enumerate(f_loc):
        benchmarks_loc[model][batch] = metaquantus.load_obj("transformer_data/", fname=f"{f}", use_json=True)[dataset_name][model]

assets	sample_data  transformer_data

In [51]:

#@title 5.3.1 Plotting functionality
%%capture
from typing import Dict

def plot_multiple_models_estimator_area(
        benchmarks: Dict,
        estimators: Dict,
        dataset_name: str,
        colours: Dict,
        save: bool,
        path: str,
        average_over: list = ["Model", "Input"],
        **kwargs
) -> None:
    """
    Plot the outcome of the benchmarking exercise for different models.

    Parameters
    ----------
    benchmark: dict of dicts
        The benchmarking data, keys are the model names.
    estimators: dict
        The estimators used in the experiment.
    dataset_name: str
        The name of the dataset.
    colours: dict
        Dictionary of colours, based on the metrics.
    save: boolean
        Indicates if plots should be saved.
    path: str
        The path for saving the plot.
    average_over: list
        A list of spaces to average over.
    kwargs: dict
        A dict with plotting kwargs.

    Returns
    -------
    None
    """
    n_rows = kwargs.get("n_rows", 2)
    n_cols = kwargs.get("n_cols", 5)
    batches = kwargs.get("batches", 5)
    figsize = kwargs.get("figsize", (20, 8))

    fig, axs = plt.subplots(n_rows, n_cols, sharex=True, figsize=figsize)

    models = list(benchmarks.keys())
    metrics = list(estimators.values())[0]
    estimator_category = list(estimators.keys())[0]

    for mx1, model_name in enumerate(models):
        for ex1, estimator_name in enumerate(metrics):
            mc_scores = []
            for px, perturbation_type in enumerate(["Input", "Model"]):
                scores = {"IAC_NR": [], "IAC_AR": [], "IEC_NR": [], "IEC_AR": []}
                for batch in range(batches):
                    # Collect scores.
                    scores["IAC_NR"].append(np.array(
                        benchmarks[model_name][batch][estimator_category][estimator_name][
                            "results_consistency_scores"
                        ][perturbation_type]["intra_scores_res"]
                    ))
                    scores["IAC_AR"].append(np.array(
                        benchmarks[model_name][batch][estimator_category][estimator_name][
                            "results_consistency_scores"
                        ][perturbation_type]["intra_scores_adv"]
                    ))
                    scores["IEC_NR"].append(np.array(
                        benchmarks[model_name][batch][estimator_category][estimator_name][
                            "results_consistency_scores"
                        ][perturbation_type]["inter_scores_res"]
                    ))
                    scores["IEC_AR"].append(np.array(
                        benchmarks[model_name][batch][estimator_category][estimator_name][
                            "results_consistency_scores"
                        ][perturbation_type]["inter_scores_adv"]
                    ))

                for k, v in scores.items():
                    scores[k] = np.array(scores[k]).flatten()

                # Set values for m* and the actual values by the estimator.
                X_gt = [-1, 0, 1, 0]
                Y_gt = [0, 1, 0, -1]
                X_area = [-scores["IAC_AR"].mean(), 0, scores["IEC_AR"].mean(), 0]
                Y_area = [0, scores["IAC_NR"].mean(), 0, -scores["IEC_NR"].mean()]

                # Set the spaces to average the MC value over.
                if perturbation_type in average_over:
                    mc_score = np.mean(
                        [
                            scores["IAC_NR"].mean(),
                            scores["IEC_NR"].mean(),
                            scores["IAC_AR"].mean(),
                            scores["IEC_AR"].mean(),
                        ]
                    )
                    mc_scores.append(mc_score)

                if perturbation_type == "Input":
                    axs[ex1].fill(
                        X_area,
                        Y_area,
                        color=colours[estimator_name],
                        alpha=0.75,
                        label=perturbation_type,
                        edgecolor="black",
                    )
                else:
                    axs[ex1].fill(
                        X_area,
                        Y_area,
                        color=colours[estimator_name],
                        alpha=0.5,
                        label=perturbation_type,
                        hatch="/",
                        edgecolor="black",
                    )

                # Plot m*.
                if px == 1:
                    axs[ex1].fill(
                        X_gt, Y_gt, color="black", alpha=0.075, label="m*"
                    )

                # Annotate the labels.
                axs[ex1].annotate("${IAC}_{AR}$", (-1, 0), fontsize=12)
                axs[ex1].annotate("${IAC}_{NR}$", (-0.2, 0.8), fontsize=12)
                axs[ex1].annotate("${IEC}_{AR}$", (0.7, 0), fontsize=12)
                axs[ex1].annotate("${IEC}_{NR}$", (-0.2, -0.8), fontsize=12)

            # Labels.
            axs[ex1].set_xticklabels(
                ["", "1", "0.5", "0", "0.5", "1"], fontsize=14
            )
            axs[ex1].set_yticklabels(
                ["", "1", "", "0.5", "", "0", "", "0.5", "", "1", ""], fontsize=14
            )

            if ex1 == 0:
                axs[ex1].set_ylabel(model_name, fontsize=14)

            if estimator_name == "Model Parameter Randomisation Test":
                estimator_name = "Model Parameter Random."

            # Title and grids.
            axs[ex1].set_title(
                f"{estimator_name} ({np.array(mc_scores).flatten().mean():.4f})",
                fontsize=15,
            )
            axs[ex1].grid()
            axs[ex1].legend(loc="upper left")
            plt.grid()

    plt.tight_layout()
    if save:
        plt.savefig(path + "plots/" + f"full_area_graph_{estimator_category}_{dataset_name}_multiple_models.png", dpi=500)
    plt.show()

In [52]:

# Plotting settings.
estimators = {"Localisation": ["Pointing-Game", "Top-K Intersection", "Relevance Mass Accuracy", "Relevance Rank Accuracy"],}
colours = {
    'Pointing-Game': "#b66a50",
    'Top-K Intersection': "#9dbcd4",
    'Relevance Mass Accuracy': "#7f7053",
    'Relevance Rank Accuracy': "#8fb67b",
 }
kwargs = {"n_rows": 1, "n_cols": len(list(estimators.values())[0]), "batches": batches}
kwargs["figsize"] = (kwargs["n_cols"]*4, kwargs["n_rows"]*4)

# Plot!
plot_multiple_models_estimator_area(benchmarks=benchmarks_loc, estimators=estimators, dataset_name=dataset_name, colours=colours, save=False, path="", **kwargs);

6) Parameter Sensitivity Analysis¶

We will investigate how much different parameters influence the evaluation outcome, i.e., how different explanations methods rank.

We use Faithfulness Correlation by Bhatt et al., 2020 for this example.

First, we need to re-load a dataset and model.

In [53]:

import quantus

We answer the following research question:

RQ: Without ground truth of evaluation, what parameterisation results in the most reliable estimate of explanation quality?

6.1 Load vision dataset¶

Fanastic source of toy-dataset in the medical domain MedMNIST.

In [54]:

# Load vision medical dataset.

import medmnist

train_set, test_set = medmnist.DermaMNIST(split="train", download=True), medmnist.DermaMNIST(split="test", download=True)
print(f"\n{train_set}")

Downloading https://zenodo.org/records/10519652/files/dermamnist.npz?download=1 to /root/.medmnist/dermamnist.npz

100%|██████████| 19725078/19725078 [00:45<00:00, 436630.96it/s]

Using downloaded and verified file: /root/.medmnist/dermamnist.npz

Dataset DermaMNIST of size 28 (dermamnist)
    Number of datapoints: 7007
    Root location: /root/.medmnist
    Split: train
    Task: multi-class
    Number of channels: 3
    Meaning of labels: {'0': 'actinic keratoses and intraepithelial carcinoma', '1': 'basal cell carcinoma', '2': 'benign keratosis-like lesions', '3': 'dermatofibroma', '4': 'melanoma', '5': 'melanocytic nevi', '6': 'vascular lesions'}
    Number of samples: {'train': 7007, 'val': 1003, 'test': 2005}
    Description: The DermaMNIST is based on the HAM10000, a large collection of multi-source dermatoscopic images of common pigmented skin lesions. The dataset consists of 10,015 dermatoscopic images categorized as 7 different diseases, formulized as a multi-class classification task. We split the images into training, validation and test set with a ratio of 7:1:2. The source images of 3×600×450 are resized into 3×28×28.
    License: CC BY-NC 4.0

In [55]:

# Visualise some data.
train_set.montage(length=5)

Out[55]:

In [56]:

# Inspect shapes.
x, y = train_set[0]
np.shape(x), np.shape(y)

Out[56]:

((28, 28, 3), (1,))

6.2 Load neural network¶

Adapted training methdology as provided in MEDMNIST tutorial.

In [57]:

# @title 6.2.1 Supporting code
%%capture

import torchvision.transforms as transforms
import torch.utils.data as data


# Define a simple black-box CNN model.
class BlackBoxModel(torch.nn.Module):
    def __init__(self, in_channels, num_classes):
        super(BlackBoxModel, self).__init__()

        self.layer1 = torch.nn.Sequential(
            torch.nn.Conv2d(in_channels, 16, kernel_size=3),
            torch.nn.BatchNorm2d(16),
            torch.nn.ReLU())

        self.layer2 = torch.nn.Sequential(
            torch.nn.Conv2d(16, 16, kernel_size=3),
            torch.nn.BatchNorm2d(16),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(kernel_size=2, stride=2))

        self.layer3 = torch.nn.Sequential(
            torch.nn.Conv2d(16, 64, kernel_size=3),
            torch.nn.BatchNorm2d(64),
            torch.nn.ReLU())

        self.layer4 = torch.nn.Sequential(
            torch.nn.Conv2d(64, 64, kernel_size=3),
            torch.nn.BatchNorm2d(64),
            torch.nn.ReLU())

        self.layer5 = torch.nn.Sequential(
            torch.nn.Conv2d(64, 64, kernel_size=3, padding=1),
            torch.nn.BatchNorm2d(64),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(kernel_size=2, stride=2))

        self.fc = torch.nn.Sequential(
            torch.nn.Linear(64 * 4 * 4, 128),
            torch.nn.ReLU(),
            torch.nn.Linear(128, 128),
            torch.nn.ReLU(),
            torch.nn.Linear(128, num_classes))

    def forward(self, x):
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        x = self.layer5(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        return x

def train_model(model, optimizer, criterion):

    # Classic training with torch; forward and backward and optimize.
    for epoch in range(EPOCHS):
        train_correct = 0
        train_total = 0
        test_correct = 0
        test_total = 0

        model.train()
        for inputs, targets in tqdm.tqdm(train_loader):
            optimizer.zero_grad()
            outputs = model(inputs)

            if TASK == 'multi-label, binary-class':
                targets = targets.to(torch.float32)
                loss = criterion(outputs, targets)
            else:
                targets = targets.squeeze().long()
                loss = criterion(outputs, targets)

            loss.backward()
            optimizer.step()

    return model

def evaluate_model(split: str) -> None:
    model.eval()
    y_true = torch.tensor([])
    y_score = torch.tensor([])

    data_loader = train_loader_at_eval if split == 'train' else test_loader

    with torch.no_grad():
        for inputs, targets in data_loader:
            outputs = model(inputs)

            if TASK == 'multi-label, binary-class':
                targets = targets.to(torch.float32)
                outputs = outputs.softmax(dim=-1)
            else:
                targets = targets.squeeze().long()
                outputs = outputs.softmax(dim=-1)
                targets = targets.float().resize_(len(targets), 1)

            y_true = torch.cat((y_true, targets), 0)
            y_score = torch.cat((y_score, outputs), 0)

        y_true = y_true.numpy()
        y_score = y_score.detach().numpy()

        evaluator = medmnist.Evaluator(DATA_FLAG, split)
        metrics = evaluator.evaluate(y_score)

        print('%s Model performance  AUC: %.3f  ACC: %.3f' % (split, *metrics))

In [58]:

# Hyperparams for dataset.
DATA_FLAG = 'dermamnist'
EPOCHS = 10
BATCH_SIZE = 64
TASK = medmnist.INFO[DATA_FLAG]['task']
data_class = getattr(medmnist, medmnist.INFO[DATA_FLAG]['python_class'])

# Preprocessing.
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[.5], std=[.5])
])

# Load the data info.
train_dataset = data_class(split='train', transform=transform, download=True)
test_dataset = data_class(split='test', transform=transform, download=True)
pil_dataset = data_class(split='train', download=True)

# Load into dataloader.
train_loader = data.DataLoader(dataset=train_dataset, batch_size=BATCH_SIZE, shuffle=True)
train_loader_at_eval = data.DataLoader(dataset=train_dataset, batch_size=2*BATCH_SIZE, shuffle=False)
test_loader = data.DataLoader(dataset=test_dataset, batch_size=2*BATCH_SIZE, shuffle=False)

# Hyperparams for model.
LR = 0.001
N_CHANNELS = medmnist.INFO[DATA_FLAG]['n_channels']
N_CLASSES = len(medmnist.INFO[DATA_FLAG]['label'])

# Load model and optimiser.
network = BlackBoxModel(in_channels=N_CHANNELS, num_classes=N_CLASSES)

# Define optimizer and loss function.
optimizer = torch.optim.SGD(network.parameters(), lr=LR, momentum=0.9)
if TASK == "multi-label, binary-class":
    criterion = torch.nn.BCEWithLogitsLoss()
else:
    criterion = torch.nn.CrossEntropyLoss()

# Train the model.
model = train_model(network, optimizer, criterion)

# Evaluate the model.
evaluate_model('train')
evaluate_model('test')

Using downloaded and verified file: /root/.medmnist/dermamnist.npz
Using downloaded and verified file: /root/.medmnist/dermamnist.npz
Using downloaded and verified file: /root/.medmnist/dermamnist.npz

100%|██████████| 110/110 [00:04<00:00, 22.75it/s]
100%|██████████| 110/110 [00:04<00:00, 23.07it/s]
100%|██████████| 110/110 [00:05<00:00, 21.04it/s]
100%|██████████| 110/110 [00:04<00:00, 23.28it/s]
100%|██████████| 110/110 [00:05<00:00, 21.10it/s]
100%|██████████| 110/110 [00:04<00:00, 23.17it/s]
100%|██████████| 110/110 [00:04<00:00, 22.44it/s]
100%|██████████| 110/110 [00:05<00:00, 21.68it/s]
100%|██████████| 110/110 [00:04<00:00, 23.06it/s]
100%|██████████| 110/110 [00:05<00:00, 21.25it/s]

train Model performance  AUC: 0.911  ACC: 0.740
test Model performance  AUC: 0.892  ACC: 0.721

In [59]:

# Prepare a test batch.
nr_samples = 100

x_batch = []
y_batch = []
for i in range(nr_samples):
    x, y = test_dataset[i]
    x_batch.append(np.array(x))
    y_batch.append(np.array(y))

x_batch = np.array(x_batch)
y_batch = np.array(y_batch).reshape(-1)
print(x_batch.shape, y_batch.shape)

(100, 3, 28, 28) (100,)

6.3 Analyse evaluation metric¶

Adapted training methdology as provided in MEDMNIST tutorial.

In [60]:

# Let's list the default parameters of the metric.
quantus.FaithfulnessCorrelation().get_params

Out[60]:

{'abs': False,
 'normalise': True,
 'return_aggregate': True,
 'aggregate_func': <function mean at 0x7870f2d29e30>,
 'normalise_func': <function quantus.functions.normalise_func.normalise_by_max(a: numpy.ndarray, normalise_axes: Optional[Sequence[int]] = None) -> numpy.ndarray>,
 'normalise_func_kwargs': {},
 '_disable_warnings': False,
 '_display_progressbar': False,
 'a_axes': None,
 'similarity_func': <function quantus.functions.similarity_func.correlation_pearson(a: <built-in function array>, b: <built-in function array>, **kwargs) -> float>,
 'nr_runs': 100,
 'subset_size': 224,
 'perturb_func': functools.partial(<function baseline_replacement_by_indices at 0x786f805155a0>, perturb_baseline='black')}

In [62]:

# Define some parameter settings to evaluate.
baseline_strategies = ["mean", "uniform", "black", "white"]
subset_sizes = np.array([1,  10, 25])
sim_funcs = {"pearson": quantus.correlation_pearson, "spearman": quantus.correlation_spearman}

result = {
    "Faithfulness score": [],
    "Method": [],
    "Similarity function": [],
    "Baseline strategy": [],
    "Subset size": [],
}
xai_methods_with_kwargs = {"Saliency": {}, "IntegratedGradients": {"n_steps": 10}, "GradientShap": {}}

In [63]:

# Score explanations!
for b in baseline_strategies:
    for s in subset_sizes:
        for method, kwargs in xai_methods_with_kwargs.items():
            for sim, sim_func in sim_funcs.items():
                metric = quantus.FaithfulnessCorrelation(abs=False,
                                                         normalise=True,
                                                         return_aggregate=True,
                                                         disable_warnings=True,
                                                         aggregate_func=np.mean,
                                                         normalise_func=quantus.normalise_by_negative,
                                                         nr_runs=10,
                                                         perturb_baseline=b,
                                                         perturb_func=quantus.baseline_replacement_by_indices,
                                                         similarity_func=sim_func,
                                                         subset_size=s)
                score = metric(model=model.cuda(),
                               x_batch=x_batch,
                               y_batch=y_batch,
                               a_batch=None,
                               explain_func=quantus.explain,
                               explain_func_kwargs=kwargs,
                               device=device)
                result["Method"].append(method)
                result["Baseline strategy"].append(b.capitalize())
                result["Subset size"].append(s)
                result["Faithfulness score"].append(score[0])
                result["Similarity function"].append(sim)

df = pd.DataFrame(result)
df.head()

Out[63]:

	Faithfulness score	Method	Similarity function	Baseline strategy	Subset size
0	-0.035094	Saliency	pearson	Mean	1
1	0.003210	Saliency	spearman	Mean	1
2	-0.098253	IntegratedGradients	pearson	Mean	1
3	-0.022625	IntegratedGradients	spearman	Mean	1
4	-0.142600	GradientShap	pearson	Mean	1

In [64]:

# Group by the ranking.
df["Rank"] = df.groupby(['Baseline strategy', 'Subset size', 'Similarity function'])["Faithfulness score"].rank()

# Smaller adjustments.
df = df.loc[:, ~df.columns.str.contains('^Unnamed')]
df.columns = map(lambda x: str(x).capitalize(), df.columns)
df.head(10)

Out[64]:

	Faithfulness score	Method	Similarity function	Baseline strategy	Subset size	Rank
0	-0.035094	Saliency	pearson	Mean	1	3.0
1	0.003210	Saliency	spearman	Mean	1	3.0
2	-0.098253	IntegratedGradients	pearson	Mean	1	2.0
3	-0.022625	IntegratedGradients	spearman	Mean	1	1.0
4	-0.142600	GradientShap	pearson	Mean	1	1.0
5	-0.002215	GradientShap	spearman	Mean	1	2.0
6	-0.115930	Saliency	pearson	Mean	10	1.0
7	-0.121939	Saliency	spearman	Mean	10	1.0
8	-0.103789	IntegratedGradients	pearson	Mean	10	2.0
9	-0.063273	IntegratedGradients	spearman	Mean	10	3.0

How does this relate back to our (naive) intution about how XAI methods should rank (consistently)?

In [65]:

# Group by rank and calculate percentage.
df_view = df.groupby(["Method"])["Rank"].value_counts(normalize=True).mul(100).reset_index(name='Percentage').round(2)

# Manually adding rows for 'Method A', 'Method B', and 'Method C'.
additional_rows = pd.DataFrame({
    'Method': ['Method A', 'Method B', 'Method C'],
    'Rank': [1.0, 2.0, 3.0],
    'Percentage': [100, 100, 100]
})

# Use pd.concat to append the additional rows.
df_view = pd.concat([df_view, additional_rows], ignore_index=True)

# Preparing the ordered DataFrame.
df_view_ordered = pd.DataFrame({
    'Method': ['Method A', 'Method B', 'Method C'],
    'Rank': [1.0, 2.0, 3.0],
    'Percentage': [100, 100, 100]
})

# Append the other methods based on the existing df_view DataFrame.
for method in xai_methods_with_kwargs:
    df_view_ordered = pd.concat([df_view_ordered, df_view.loc[df_view["Method"] == method]], ignore_index=True)
df_view_ordered

Out[65]:

	Method	Rank	Percentage
0	Method A	1.0	100.00
1	Method B	2.0	100.00
2	Method C	3.0	100.00
3	Saliency	1.0	41.67
4	Saliency	3.0	37.50
5	Saliency	2.0	20.83
6	IntegratedGradients	2.0	41.67
7	IntegratedGradients	3.0	33.33
8	IntegratedGradients	1.0	25.00
9	GradientShap	2.0	37.50
10	GradientShap	1.0	33.33
11	GradientShap	3.0	29.17

In [66]:

# Ensure default matplotlib settings are reset, affecting label visibility.
plt.rcdefaults()

fig, ax = plt.subplots(figsize=(6.5,5))

# Plot results!
ax = sns.histplot(x='Method', hue='Rank', weights='Percentage', multiple='stack', data=df_view_ordered, shrink=0.6, palette="colorblind", legend=False)
ax.spines["right"].set_visible(False)
ax.spines['top'].set_visible(False)
ax.tick_params(axis='both', which='major', labelsize=16)
ax.set_ylabel('Frequency of rank', fontsize=15)
ax.set_xlabel('            Expectation             vs             Reality            ', fontsize=15)
ax.set_xticklabels(["A", "B", "C", "SAL", "GS", "IG"], fontsize=12)
ax.yaxis.set_major_formatter(matplotlib.ticker.PercentFormatter())
plt.legend(loc='upper center', bbox_to_anchor=(0.5, 1.1), ncol=4, fancybox=True, shadow=False, labels=['1st', "2nd", "3rd", "4th"])
plt.axvline(x=2.5, ymax=0.95, color='black', linestyle='-')
plt.tight_layout()

Contrary to intution where ranking is consistent over different metric parameterisations, the ranking significantly differ in the different experimental settings.

In [67]:

import torch
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

7) Bonus! Quantus with Distilbert¶

With a minimal example, we want to demonstrate how to use Quantus with NLP data.

For this purpose we will use huggingface libraries: transformers and datasets.

Another demonstration with Quantus and NLP data can be found here.

In [68]:

import quantus

We answer the following research question:

RQ: How to estimate explanation quality for explanations of text classifiers?

7.1 Load data¶

We have use the SST2 dataset, read more here: https://huggingface.co/datasets/sst2).

In [69]:

from datasets import load_dataset

BATCH_SIZE = 10
dataset = load_dataset("sst2", split="test")
x_batch = dataset['sentence'][BATCH_SIZE:BATCH_SIZE+BATCH_SIZE]
x_batch

Downloading readme:   0%|          | 0.00/5.27k [00:00<?, ?B/s]

Downloading data:   0%|          | 0.00/3.11M [00:00<?, ?B/s]

Downloading data:   0%|          | 0.00/72.8k [00:00<?, ?B/s]

Downloading data:   0%|          | 0.00/148k [00:00<?, ?B/s]

Generating train split:   0%|          | 0/67349 [00:00<?, ? examples/s]

Generating validation split:   0%|          | 0/872 [00:00<?, ? examples/s]

Generating test split:   0%|          | 0/1821 [00:00<?, ? examples/s]

Out[69]:

["it 's also heavy-handed and devotes too much time to bigoted views .",
 'it helps that lil bow wow ... tones down his pint-sized gangsta act to play someone who resembles a real kid .',
 'watching the film is like reading a times portrait of grief that keeps shifting focus to the journalist who wrote it .',
 "moore 's performance impresses almost as much as her work with haynes in 1995 's safe .",
 'reinforces the talents of screenwriter charlie kaufman , creator of adaptation and being john malkovich .',
 'now trimmed by about 20 minutes , this lavish three-year-old production has enough grandeur and scale to satisfy as grown-up escapism .',
 'a journey through memory , a celebration of living , and a sobering rumination on fatality , classism , and ignorance .',
 'a remarkable 179-minute meditation on the nature of revolution .',
 'waydowntown is by no means a perfect film , but its boasts a huge charm factor and smacks of originality .',
 "it 's just incredibly dull ."]

7.2 Load models and tokenizers¶

We load pre-trained models and tokenizers with transformers library.

In [70]:

from transformers import AutoModelForSequenceClassification, AutoTokenizer, set_seed

set_seed(42)

# Load the model and tokenizer.
MODEL_NAME = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME).cuda()
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

config.json:   0%|          | 0.00/629 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/268M [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/48.0 [00:00<?, ?B/s]

vocab.txt:   0%|          | 0.00/232k [00:00<?, ?B/s]

In [72]:

# Load an example.
REFERENCE_TEXT = "The quick brown fox jumps over the lazy dog"
tokenizer(REFERENCE_TEXT, return_tensors="pt")

Out[72]:

{'input_ids': tensor([[  101,  1996,  4248,  2829,  4419, 14523,  2058,  1996, 13971,  3899,
           102]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])}

In [73]:

# @title 7.2.1 Supporting code
%%capture

from torch.utils.data import DataLoader

def preprocess_function(dataset):
    # Tokenize the text.
    return tokenizer(
        dataset["sentence"],
        padding=True,
        truncation=True,
        # max_length=100,
        return_tensors="pt",
    )


# Apply the tokenization to the entire dataset and convert format to PyTorch tensors.
processed_dataset = dataset.map(preprocess_function, batched=True)
processed_dataset.set_format(
    type="torch",
    columns=[
        "input_ids",
        "attention_mask",
        "label",
    ],
)
# Save in data loader.
data_loader = DataLoader(processed_dataset, batch_size=BATCH_SIZE)

for b_ix, batch in enumerate(data_loader):

    # Extract input tensors from the current batch.
    inputs = {
        k: v.to(torch.long).to(device)
        for k, v in batch.items()
        if k in ["input_ids", "attention_mask"]
    }

    # Perform model inference.
    with torch.no_grad():
        outputs = model(**inputs)
    logits = outputs.logits
    predictions = torch.argmax(logits, dim=1)
    print(predictions)

    if b_ix == 0:
        break

# x_batch = inputs
y_batch = predictions

7.3 Load explanations¶

We create some simple explanation functions below, using captum as a basis (here).

In [74]:

# @title 7.3.1 Supporting code
%%capture

from captum.attr import LayerIntegratedGradients, IntegratedGradients
from IPython.display import display, HTML

def explain_with_layer_ig(model, inputs, targets, **kwargs):
    """Explain with Layer Integrated Gradients."""
    model.eval()
    model.zero_grad()

    layer = kwargs.get("layer")
    tokenizer = kwargs.get("tokenizer")
    ref_token_id = tokenizer.pad_token_id

    def predict(input_ids, attention_mask=None):
        # Special predict func.
        outputs = model(input_ids.to(torch.int), attention_mask=attention_mask)
        return outputs.logits.max(1).values

    explanations = []
    explanations = np.empty(len(x_batch), dtype=object)
    for i, text in enumerate(inputs):

        def construct_input_ref_pair(text, ref_token_id):
            # Construct reference token ids.
            inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)
            input_ids = inputs["input_ids"]
            attention_mask = inputs["attention_mask"]
            ref_input_ids = torch.zeros_like(input_ids)
            ref_input_ids[:] = ref_token_id
            return input_ids, ref_input_ids, attention_mask

        # Construct the necessary pairs.
        input_ids, ref_input_ids, attention_mask = construct_input_ref_pair(
            text=text, ref_token_id=ref_token_id
        )

        # Explain with IG.
        lig = LayerIntegratedGradients(predict, eval(layer))
        explanation = lig.attribute(
            inputs=input_ids.to(device),
            #inputs=(input_ids.to(device), attention_mask.to(device)),
            baselines=ref_input_ids.cuda(),
            return_convergence_delta=False,
            additional_forward_args=(attention_mask.to(device),)
        )

        # Sum over the layers.
        if len(explanation.shape) > 2:
            explanation = explanation.sum(dim=2)

        explanations[i] = explanation.squeeze().cpu().numpy()

    return explanations

def colorize_words(text, scores, normalise: bool = False):
    """
    Returns an HTML string with words colorized based on their explanation scores.
    """
    if normalise:
        scores = (scores - np.min(scores)) / (np.max(scores) - np.min(scores))
    cmap = plt.get_cmap("Reds")

    # Get tokens.
    tokens = tokenizer.tokenize(text)

    html_string = "<div style='font-family: Arial;'>"
    for token, score in zip(tokens, scores):
        color = cmap(score)
        hex_color = matplotlib.colors.rgb2hex(color[:3])
        html_string += f"<span style='background-color: {hex_color};'>{token} </span>"
    html_string += "</div>"
    return html_string


def get_label(pred):
    if pred == 0:
        return "Negative"
    return "Positive"

In [75]:

# Generate explanations.
explanations = explain_with_layer_ig(
    model=model,
    inputs=x_batch,
    targets=y_batch,
    **{
        "attention_mask": None,
        "tokenizer": tokenizer,
        "layer": "model.distilbert.embeddings",
    },
)
# Plot!
for text, label, explanation in zip(x_batch, y_batch, explanations):
    html_string = colorize_words(text, explanation, )
    print(f"\nSentiment Prediction = {get_label(label)}")
    display(HTML(html_string))

Sentiment Prediction = Negative

it ' s also heavy - handed and devote ##s too much time to big ##oted views .

Sentiment Prediction = Negative

it helps that lil bow wow . . . tones down his pin ##t - sized gangs ##ta act to play someone who resembles a real kid .

Sentiment Prediction = Positive

watching the film is like reading a times portrait of grief that keeps shifting focus to the journalist who wrote it .

Sentiment Prediction = Positive

moore ' s performance impress ##es almost as much as her work with haynes in 1995 ' s safe .

Sentiment Prediction = Positive

reinforce ##s the talents of screenwriter charlie kaufman , creator of adaptation and being john mal ##kovich .

Sentiment Prediction = Positive

now trimmed by about 20 minutes , this lavish three - year - old production has enough grande ##ur and scale to satisfy as grown - up es ##cap ##ism .

Sentiment Prediction = Negative

a journey through memory , a celebration of living , and a sober ##ing rum ##ination on fatal ##ity , class ##ism , and ignorance .

Sentiment Prediction = Positive

a remarkable 179 - minute meditation on the nature of revolution .

Sentiment Prediction = Positive

way ##down ##town is by no means a perfect film , but its boasts a huge charm factor and smack ##s of original ##ity .

Sentiment Prediction = Negative

it ' s just incredibly dull .

7.4 Evaluate with Quantus¶

For example, we evaluate the complexity of the LayerIntegratedGradients of the first embedding layer model.distilbert.embeddings.

In [76]:

# Pick a metric.
metric = quantus.Sparseness()

In [78]:

scores = []
for expl, input, y, in zip(explanations, x_batch, y_batch):
    score = metric(model, x_batch=np.expand_dims(np.zeros_like(expl), axis=0), y_batch=[y.item()], a_batch=np.expand_dims(expl, axis=0))
    scores.append(score[0])
print(f"Complexity score of LayerIntegratedGradient explanation: {np.mean(scores):.2f} ({np.std(scores):.2f})")

Complexity score of LayerIntegratedGradient explanation: 0.47 (0.07)

6) References¶

Quantus: An Explainable AI Toolkit for Responsible Evaluation of Neural Network Explanations and Beyond by Hedström et al., 2022
[The Meta-Evaluation Problem in Explainable AI:

Identifying Reliable Estimators with MetaQuantus](https://arxiv.org/abs/2302.07265) by Hedström et al., 2023

NoiseGrad — Enhancing Explanations by Introducing Stochasticity to Model Weights by Bykov et al, 2022

In [ ]:

Quantus x INVICTA Keynote Series — Explainable AI¶

Table of Contents¶

1) Overview¶

2) Installation prerequisites¶

3) Preliminaries¶

3.1 Load data¶

3.2 Load models¶

3.3 Load explanations¶

4) Introduction to Quantus¶

4.1 XAI metrics¶

4.1.1 Measure complexity¶

4.2 Large-scale evaluation¶

5) Introduction to MetaQuantus¶

5.1 Load estimators¶

5.2 Define test suite¶

5.3 Meta-evaluate methods¶

6) Parameter Sensitivity Analysis¶

6.1 Load vision dataset¶

6.2 Load neural network¶

6.3 Analyse evaluation metric¶

7) Bonus! Quantus with Distilbert¶

7.1 Load data¶

7.2 Load models and tokenizers¶

7.3 Load explanations¶

7.4 Evaluate with Quantus¶

6) References¶