This notebook demonstrates how you can find adversarial examples for a pre-trained example network on the MNIST dataset.
We suggest having the Gurobi
solver installed, since its performance is significantly faster. If this is not possible, the Cbc
solver is another option.
The Images
package is only necessary for visualizing the sample images.
using MIPVerify
using Gurobi
using Images
mnist = MIPVerify.read_datasets("MNIST")
mnist.train
size(mnist.train.images)
mnist.train.labels
We import a sample pre-trained neural network.
n1 = MIPVerify.get_example_network_params("MNIST.n1")
MIPVerify.frac_correct
allows us to verify that the network has a reasonable accuracy on the test set of 96.95%. (This step is crucial when working with your own neural net parameters; since the training is done outside of Julia, a common mistake is to transfer the parameters incorrectly.)
MIPVerify.frac_correct(n1, mnist.test, 10000)
We feed the first image into the neural net, obtaining the activations of the final softmax layer.
Note that the image must be specified as a 4-dimensional array with size (1, height, width, num_channels)
. We provide a helper function MIPVerify.get_image
that extracts the image from the dataset while preserving all four dimensions.
sample_image = MIPVerify.get_image(mnist.test.images, 1)
output_activations = sample_image |> n1
The category that has the largest activation is category 8, corresponding to a label of 7.
(output_activations |> MIPVerify.get_max_index) - 1
This matches the true label.
MIPVerify.get_label(mnist.test.labels, 1)
We now try to find the closest $L_infty$ norm adversarial example to the first image, setting the target category as index 10
(corresponding to a true label of 9). Note that we restrict the search space to a distance of 0.05
around the original image via the specified pp
.
target_label_index = 10
d = MIPVerify.find_adversarial_example(
n1,
sample_image,
target_label_index,
Gurobi.Optimizer,
Dict(),
norm_order = Inf,
pp=MIPVerify.LInfNormBoundedPerturbationFamily(0.05)
)
using JuMP
perturbed_sample_image = JuMP.value.(d[:PerturbedInput])
As a sanity check, we feed the perturbed image into the neural net and inspect the activation in the final layer. We verify that the perturbed image does maximize the activation of the target label index, which is 10.
perturbed_sample_image |> n1
We visualize the perturbed image and compare it to the original image. Since we are minimizing the $L_infty$-norm, changes are made to many pixels but the change to each pixels is not very noticeable.
colorview(Gray, perturbed_sample_image[1, :, :, 1])
colorview(Gray, sample_image[1, :, :, 1])
That concludes this quickstart! The next tutorial will introduce you to each of the layers, and show how you can import your own neural network parameters.