#!/usr/bin/env python
# coding: utf-8

# In[1]:


get_ipython().run_line_magic('reload_ext', 'autoreload')
get_ipython().run_line_magic('autoreload', '2')
get_ipython().run_line_magic('matplotlib', 'inline')
import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0"


# # Explainable AI in *ktrain*
#
# Deep neural networks are sometimes called "black boxes", as it is not always clear how such models use data to arrive at a decision or prediction. **Explainable AI** (XAI) encompasses methods and techniques that help us understand how an AI model reaches particular conclusions. Although XAI is an [open and actively researched problem](https://www.darpa.mil/program/explainable-artificial-intelligence), some existing methods can be practically applied today.
#
# In previous tutorials, we saw that, for both image classification and text classification, we can invoke the `ktrain.get_predictor` function to obtain a `Predictor` object and make predictions on new raw data. For instance, with text data, one can easily make predictions from the raw, unprocessed text of a document as follows:
#
# ```
# predictor = ktrain.get_predictor(learner.model, preproc=preproc)
# predictor.predict(document_text)
# ```
#
# In this notebook, we show how to invoke the `explain` method of `Predictor` objects to help understand how those predictions were made. This is particularly useful for understanding **misclassifications**. We start with image classification.

# ## Explaining Image Classification

# Let's begin by using *ktrain* to train an image classifier for a single epoch on the publicly available [Kaggle Dogs vs. Cats dataset](https://www.kaggle.com/c/dogs-vs-cats), as we did in the [tutorial 3 notebook](https://github.com/amaiya/ktrain/blob/master/tutorials/tutorial-03-image-classification.ipynb).

# In[2]:


# imports
import ktrain
from ktrain import vision as vis

# STEP 1: load data
(train_data, val_data, preproc) = vis.images_from_folder(
                                      datadir='data/dogscats',
                                      data_aug=vis.get_data_aug(horizontal_flip=True),
                                      train_test_names=['train', 'valid'],
                                      target_size=(224, 224), color_mode='rgb')

# STEP 2: define model
model = vis.image_classifier('pretrained_resnet50', train_data, val_data, freeze_layers=15)
learner = ktrain.get_learner(model=model, train_data=train_data, val_data=val_data,
                             workers=8, use_multiprocessing=False, batch_size=64)

# STEP 3: train
learner.fit_onecycle(1e-4, 1)


# The `view_top_losses` method in *ktrain* identifies the examples in the validation set that were misclassified and sorts them by loss, so that examples at the top of the list are the most severely misclassified. Let's invoke `view_top_losses` to view the single most misclassified image.

# In[5]:


learner.view_top_losses(n=1, preproc=preproc)


# The most misclassified image depicts both a dog and a cat but is labeled as belonging only to the **cats** category. Our classifier predicts **dogs** for this image, probably because it is focusing mostly on the dog. We can verify this by invoking the `explain` method.

# In[6]:


predictor = ktrain.get_predictor(learner.model, preproc)
predictor.explain('data/dogscats/valid/cats/cat.5583.jpg')


# The `explain` method displays the image and highlights the area on which the classifier focuses to make its prediction. As expected, our model is focusing on the dog in this picture when predicting the label **dogs**.
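# Under the hood, saliency maps like the one above are produced with the Grad-CAM technique (discussed further below), which weights the activations of a convolutional layer by the gradient of the predicted class score. As a rough illustration of the underlying idea, here is a minimal sketch in plain TensorFlow. It is not part of the *ktrain* API: the model, layer name, and image are placeholder assumptions for a ResNet50-style Keras classifier.

# In[ ]:


# Illustrative Grad-CAM sketch (assumptions: a Keras ResNet50-style classifier named
# `model` and a last convolutional layer called 'conv5_block3_out'; both are placeholders).
import tensorflow as tf

def gradcam_heatmap(model, image, conv_layer_name, class_index=None):
    """Return a Grad-CAM heatmap for `image`, a preprocessed batch of shape (1, H, W, 3)."""
    # Build a model that maps the input image to (conv activations, predictions)
    conv_layer = model.get_layer(conv_layer_name)
    grad_model = tf.keras.Model(model.inputs, [conv_layer.output, model.output])

    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(image)
        if class_index is None:
            class_index = tf.argmax(preds[0])
        class_score = preds[:, class_index]

    # Gradient of the class score with respect to the conv feature maps
    grads = tape.gradient(class_score, conv_out)
    # Channel importance weights: global-average-pool the gradients
    weights = tf.reduce_mean(grads, axis=(0, 1, 2))
    # Weighted sum of feature maps, followed by ReLU and normalization to [0, 1]
    heatmap = tf.nn.relu(tf.reduce_sum(conv_out[0] * weights, axis=-1))
    heatmap /= (tf.reduce_max(heatmap) + 1e-8)
    return heatmap.numpy()

# Example usage (hypothetical): heatmap = gradcam_heatmap(model, img_batch, 'conv5_block3_out')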
# In this case, **Explainable AI** was not really necessary to understand why our classifier generated its prediction. However, such techniques can be helpful in other cases where the decision-making process is less evident.
#
# Consider this image in the validation set, for example:

# In[7]:


vis.show_image('data/dogscats/valid/cats/cat.92.jpg')


# The image (i.e., *cat.92.jpg*) is interesting in that there are no real dogs or cats in the picture. Nevertheless, the image has a ground truth label of **cats**, perhaps because there is a stuffed cat in the photo. There is, however, also a cartoon dog in the background.
#
# Our classifier is understandably confused and placed the picture in the **dogs** category.

# In[8]:


predictor.predict_filename('data/dogscats/valid/cats/cat.92.jpg')


# But **why** has our classifier predicted **dogs** for this photo? Is it mistaking the stuffed cat for a dog, or is it focusing on the cartoon dog in the background? We can find the answer by invoking the `explain` method.

# In[9]:


predictor.explain('data/dogscats/valid/cats/cat.92.jpg')


# It is clear from the visualization that our classifier is focusing on the *stuffed cat* and mistaking it for a dog. Moreover, it appears to be particularly focused on the ears of the stuffed animal, which do look somewhat like dog ears.
#
# These visualizations are based on the [Grad-CAM technique](https://arxiv.org/abs/1610.02391) and are supported in *ktrain* via the [eli5](https://github.com/TeamHG-Memex/eli5) library.

# ## Explaining Text Classification
#
# We can also visualize which words **text classifiers** focus on when making predictions. As before, let's use *ktrain* to quickly build a text classifier that classifies [IMDb movie reviews](https://ai.stanford.edu/~amaas/data/sentiment/) as positive or negative.

# In[10]:


# imports
import ktrain
from ktrain import text

# STEP 1: load and preprocess text data
(x_train, y_train), (x_test, y_test), preproc = text.texts_from_folder('data/aclImdb',
                                                                       max_features=20000, maxlen=400,
                                                                       ngram_range=1,
                                                                       train_test_names=['train', 'test'],
                                                                       classes=['pos', 'neg'])

# STEP 2: define a Keras text classification model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Embedding, GlobalAveragePooling1D
model = Sequential()
model.add(Embedding(20000 + 1, 50, input_length=400))  # add 1 for padding token
model.add(GlobalAveragePooling1D())
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
learner = ktrain.get_learner(model, train_data=(x_train, y_train), val_data=(x_test, y_test))

# STEP 3: train
learner.autofit(0.005, 2)


# As we did above, we invoke `view_top_losses` to view the most misclassified review in the validation set.

# In[11]:


learner.view_top_losses(n=1, preproc=preproc)


# The review is evidently about a movie starring Mickey Rourke, but it is difficult to follow. This is partly because the displayed text is post-processed: punctuation and rare words are omitted (including the actual title of the movie!). Let's use the ID of this document (i.e., id=8244, as shown above) to retrieve the original text and supply that as input to `predictor.explain`.
# In[10]:


from sklearn.datasets import load_files
test_b = load_files(os.path.join('data/aclImdb', 'test'), shuffle=False, categories=['neg', 'pos'])
doc = test_b.data[8244].decode('utf-8')


# In[11]:


predictor = ktrain.get_predictor(learner.model, preproc)


# **NOTE:** The output of `explain` below does not render correctly through GitHub's notebook viewer (colors are not shown). View the notebook directly in Jupyter for proper display.

# In[12]:


predictor.explain(doc)


# The visualization is generated using a technique called [LIME](https://arxiv.org/abs/1602.04938). The input is randomly perturbed to examine how the prediction changes, and this information is used to infer the relative importance of different words to the final prediction via a linear interpretable model. The GREEN words contribute to the incorrect classification, while the RED (and PINK) words detract from it. (The shade of each color denotes the strength or size of the corresponding coefficient in the inferred linear model.) By examining the GREEN words, we can see that the review is overall positive despite having a ground truth label of negative. In fact, the last line recommends the movie as a rental. The review is also odd in that it spends a lot of time praising past performances of the actors in previous movies. The PINK words show that there is not much negative language that would lead us to conclude this is a negative review. For these reasons, we can probably forgive our simple model for misclassifying this particular review as positive, as many humans probably would, too.

# ## Explaining Tabular Models
#
# Let's train a model to predict **Survival** using [Kaggle's Titanic dataset](https://www.kaggle.com/c/titanic).
#
# After training the model, we will **explain** the model's prediction for a specific example.

# In[2]:


import pandas as pd
import numpy as np

train_df = pd.read_csv('data/titanic/train.csv', index_col=0)
train_df = train_df.drop('Name', axis=1)
train_df = train_df.drop('Ticket', axis=1)
train_df = train_df.drop('Cabin', axis=1)

np.random.seed(42)
p = 0.1  # 10% for test set
prop = 1 - p
df = train_df.copy()
msk = np.random.rand(len(df)) < prop
train_df = df[msk]
test_df = df[~msk]

import ktrain
from ktrain import tabular
trn, val, preproc = tabular.tabular_from_df(train_df, label_columns=['Survived'], random_state=42)
model = tabular.tabular_classifier('mlp', trn)
learner = ktrain.get_learner(model, train_data=trn, val_data=val, batch_size=32)
learner.fit_onecycle(1e-3, 25)


# In[3]:


predictor = ktrain.get_predictor(learner.model, preproc)
preds = predictor.predict(test_df, return_proba=True)
df = test_df.copy()[[c for c in test_df.columns.values if c != 'Survived']]
df['Survived'] = test_df['Survived']
df['predicted_Survived'] = np.argmax(preds, axis=1)
df.head()


# Let's explain the model's prediction for one of the male passengers in the test set (`row_index=35` below) using the [shap](https://github.com/slundberg/shap) library.

# In[4]:


predictor.explain(test_df, row_index=35, class_id=1)


# From the visualization above, we can see that his first-class status (`Pclass=1`) and his higher-than-average fare (suggesting that he is wealthy) push the model towards predicting **Survived**. At the same time, the fact that he is male pushes the model towards predicting **NOT Survived**. For these reasons, this is a borderline and uncertain prediction.
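# The contribution plot above is rendered via the shap library. As a rough illustration of how such explanations can be produced outside of *ktrain*, here is a minimal, hedged sketch using shap's model-agnostic `KernelExplainer` on a generic prediction function. The names `predict_fn`, `X_background`, and `X_explain` are placeholders (already-encoded numpy arrays and a probability function), not part of the *ktrain* API, and the exact return format of `shap_values` can vary by shap version.

# In[ ]:


# Illustrative SHAP sketch (assumptions: `predict_fn` maps a 2-D numpy array of encoded
# feature rows to class probabilities; `X_background` and `X_explain` are small numpy
# arrays of such rows; all names are placeholders).
import shap

def explain_rows(predict_fn, X_background, X_explain, feature_names):
    shap.initjs()  # enable JavaScript rendering of force plots in the notebook
    # KernelExplainer approximates SHAP values for an arbitrary black-box function
    explainer = shap.KernelExplainer(predict_fn, X_background)
    # One contribution per feature per row; for multi-class output this is typically
    # a list with one array per class (may differ across shap versions)
    shap_values = explainer.shap_values(X_explain, nsamples=100)
    # Force plot for class 1 (e.g., "Survived") of the first explained row
    return shap.force_plot(explainer.expected_value[1],
                           shap_values[1][0, :],
                           features=X_explain[0, :],
                           feature_names=feature_names)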
#
# For more information, see the [tutorial notebook on tabular models](https://nbviewer.jupyter.org/github/amaiya/ktrain/blob/master/tutorials/tutorial-08-tabular_classification_and_regression.ipynb).

# In[ ]:
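# As a closing aside, the word-level explanation shown earlier for the IMDb review is based on LIME, and a similar explanation can be produced directly with the lime library. Below is a minimal, hedged sketch; `predict_texts` is a placeholder for any function that takes a list of raw documents and returns an array of class probabilities, and the class names are assumptions rather than part of the *ktrain* API.

# In[ ]:


# Illustrative LIME sketch (assumption: `predict_texts` takes a list of raw document
# strings and returns a 2-D numpy array of class probabilities; it is a placeholder).
from lime.lime_text import LimeTextExplainer

def explain_review(doc_text, predict_texts, class_names=('neg', 'pos')):
    explainer = LimeTextExplainer(class_names=list(class_names))
    # LIME perturbs the document (by removing words), queries predict_texts on the
    # perturbed copies, and fits a weighted linear model to the responses
    explanation = explainer.explain_instance(doc_text, predict_texts, num_features=10)
    # Word weights for label 1 (i.e., class_names[1]); positive weights push the
    # prediction towards that class, negative weights push away from it
    return explanation.as_list()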