Assess predictions on binary text classification blbooksgenre data with a Hugging Face Transformers model

This notebook demonstrates how to use the responsibleai API to assess a Hugging Face Transformers text classification model trained on the blbooksgenre dataset (see https://huggingface.co/datasets/blbooksgenre for more information about the dataset). It walks through the API calls needed to create a widget with model analysis insights, then guides a visual analysis of the model.

Launch Responsible AI Toolbox
- Load Model and Data
- Create Model and Data Insights

Launch Responsible AI Toolbox

The following section examines the code necessary to create datasets and a model. It then generates insights using the responsibleai API that can be visually analyzed.

Load Model and Data

The following section can be skipped. It loads a dataset and downloads an already trained model for illustrative purposes.

First, import all necessary dependencies.

In [ ]:

import datasets
import pandas as pd
import zipfile
from sklearn.model_selection import train_test_split
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          pipeline)

from raiutils.common.retries import retry_function

from urllib.request import urlretrieve

Next, load the blbooksgenre dataset from Hugging Face Datasets.

In [ ]:

NUM_TEST_SAMPLES = 20

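def load_dataset(split):
    # The config name is spelled "classifiction" in the dataset itself
    config_kwargs = {"name": "title_genre_classifiction"}
    dataset = datasets.load_dataset("blbooksgenre", split=split, **config_kwargs)
    # Keep only the book title (as the input text) and its genre label
    return pd.DataFrame({"text": dataset["title"], "label": dataset["label"]})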

pd_data = load_dataset("train")

pd_data, pd_valid_data = train_test_split(
    pd_data, test_size=0.2, random_state=0)

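# Skip the first NUM_TEST_SAMPLES rows of the training portion
train_data = pd_data[NUM_TEST_SAMPLES:].reset_index(drop=True)
# Keep a small test set from the held-out portion so insights compute quickly
test_data = pd_valid_data[:NUM_TEST_SAMPLES].reset_index(drop=True)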
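
As a rough illustration of the resulting sizes, the sketch below mirrors the split-and-slice logic on a hypothetical set of 100 row indices. It uses `random.Random.shuffle` as a stand-in for `train_test_split`, so the selected rows differ from what scikit-learn would pick; only the sizes carry over. The helper name `split_and_slice` is made up for this example.

```python
import random

NUM_TEST_SAMPLES = 20


def split_and_slice(n_rows, test_size=0.2, seed=0):
    """Mimic an 80/20 shuffle split followed by the notebook's slicing."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_valid = int(round(n_rows * test_size))
    valid_part = indices[:n_valid]
    train_part = indices[n_valid:]
    # Skip the first NUM_TEST_SAMPLES rows of the training portion
    train = train_part[NUM_TEST_SAMPLES:]
    # Keep only the first NUM_TEST_SAMPLES rows of the held-out portion
    test = valid_part[:NUM_TEST_SAMPLES]
    return train, test


train_idx, test_idx = split_and_slice(100)
print(len(train_idx), len(test_idx))
```

Because the test rows come from the held-out portion, they never overlap with the training rows, regardless of how many training rows are skipped.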

Fetch a Hugging Face model that has already been trained on the blbooksgenre dataset.

In [ ]:

BLBOOKSGENRE_MODEL_NAME = "blbooksgenre_model"
NUM_LABELS = 2

class FetchModel(object):
    def __init__(self):
        pass

    def fetch(self):
        zipfilename = BLBOOKSGENRE_MODEL_NAME + '.zip'
        url = ('https://publictestdatasets.blob.core.windows.net/models/' +
               BLBOOKSGENRE_MODEL_NAME + '.zip')
        urlretrieve(url, zipfilename)
        with zipfile.ZipFile(zipfilename, 'r') as unzip:
            unzip.extractall(BLBOOKSGENRE_MODEL_NAME)

def retrieve_blbooksgenre_model():
    fetcher = FetchModel()
    action_name = "Model download"
    err_msg = "Failed to download model"
    max_retries = 4
    retry_delay = 60
    retry_function(fetcher.fetch, action_name, err_msg,
                   max_retries=max_retries,
                   retry_delay=retry_delay)
    model = AutoModelForSequenceClassification.from_pretrained(
        BLBOOKSGENRE_MODEL_NAME, num_labels=NUM_LABELS)
    return model

model = retrieve_blbooksgenre_model()
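
The download is wrapped in `retry_function` so that a transient network failure does not abort the notebook. The sketch below shows the general retry-with-delay pattern in plain Python. It is not the `raiutils` implementation, and `retry_call` and `flaky_download` are hypothetical names used only for illustration.

```python
import time


def retry_call(action, action_name, err_msg, max_retries=4, retry_delay=60):
    """Call `action` until it succeeds or `max_retries` attempts have failed."""
    for attempt in range(1, max_retries + 1):
        try:
            return action()
        except Exception as exc:
            print(f"{action_name} attempt {attempt} failed: {exc}")
            if attempt == max_retries:
                # Out of attempts: surface the caller-supplied error message
                raise RuntimeError(err_msg) from exc
            time.sleep(retry_delay)


# Hypothetical action that fails twice before succeeding
calls = {"count": 0}


def flaky_download():
    calls["count"] += 1
    if calls["count"] < 3:
        raise IOError("simulated network error")
    return "ok"


result = retry_call(flaky_download, "Model download",
                    "Failed to download model", retry_delay=0)
print(result, calls["count"])
```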

Load the tokenizer and build a prediction pipeline around the model

In [ ]:

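# load the tokenizer (the model was loaded in the previous cell)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# -1 runs on CPU; set to a GPU index (for example 0) to run on CUDA
device = -1
if device >= 0:
    model = model.cuda()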

# build a pipeline object to do predictions
pred = pipeline(
    "text-classification",
    model=model,
    tokenizer=tokenizer,
    device=device,
    return_all_scores=True
)
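
With `return_all_scores=True`, the pipeline is expected to return, for each input text, one dictionary with a `label` and a `score` per class. The sketch below uses a hand-written, hypothetical output in that shape (the label names and scores are made up, not produced by the model) and converts it into per-class score rows and predicted class indices. It assumes the entries for each text are ordered by class index.

```python
# Hypothetical pipeline output for two texts; values are illustrative only
sample_output = [
    [{"label": "LABEL_0", "score": 0.91}, {"label": "LABEL_1", "score": 0.09}],
    [{"label": "LABEL_0", "score": 0.23}, {"label": "LABEL_1", "score": 0.77}],
]


def to_score_rows(outputs):
    """Turn a list of per-text score dicts into a list of score lists."""
    return [[entry["score"] for entry in per_text] for per_text in outputs]


def to_predictions(outputs):
    """Pick the index of the highest-scoring class for each text."""
    rows = to_score_rows(outputs)
    return [max(range(len(row)), key=row.__getitem__) for row in rows]


scores = to_score_rows(sample_output)
predictions = to_predictions(sample_output)
print(predictions)
```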

In [ ]:

from ml_wrappers import wrap_model

# Wrap the pipeline so it can be called through a predict-style interface
wrapped_model = wrap_model(pred, test_data, 'text_classification')

In [ ]:

predictions = wrapped_model.predict(test_data['text'].tolist())
num_errors = sum(predictions != test_data['label'].tolist())
print("number of errors on test dataset: " + str(num_errors))
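
The error count above compares predicted and true labels element by element. A minimal sketch of the same comparison on hypothetical label lists (not model output), which also reports accuracy:

```python
def count_errors(predicted, actual):
    """Count positions where the predicted label differs from the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum(p != a for p, a in zip(predicted, actual))


# Hypothetical labels used only to illustrate the calculation
predicted_labels = [0, 1, 1, 0, 1]
true_labels = [0, 1, 0, 0, 0]

num_errors = count_errors(predicted_labels, true_labels)
accuracy = 1 - num_errors / len(true_labels)
print(f"errors: {num_errors}, accuracy: {accuracy:.2f}")
```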

In [ ]:

classes = train_data["label"].unique()
classes.sort()

Create Model and Data Insights

In [ ]:

from responsibleai_text import RAITextInsights, ModelTask
from raiwidgets import ResponsibleAIDashboard

To use the Responsible AI Dashboard, initialize a RAITextInsights object upon which different components can be loaded.

RAITextInsights accepts the model, the test dataset, the name of the target column, the task type and the classes as its arguments.

In [ ]:

rai_insights = RAITextInsights(pred, test_data,
                               "label",
                               task_type=ModelTask.TEXT_CLASSIFICATION,
                               classes=classes)

Add the components of the toolbox for model assessment.

In [ ]:

rai_insights.explainer.add()
rai_insights.error_analysis.add()

Once all the desired components have been loaded, compute insights on the test set.

In [ ]:

rai_insights.compute()

Finally, visualize and explore the model insights. Use the resulting widget or follow the link to view this in a new tab.

In [ ]:

ResponsibleAIDashboard(rai_insights)