Exercise 4.3 - Solution

Classification

In the following tasks, we will repeatedly use some basic functions (e.g., the softmax function or the cross-entropy) of the Keras library. To familiarize ourselves with them, we implement the most important ones ourselves in this task.

Suppose we want to classify some data (4 samples) into 3 distinct classes: 0, 1, and 2. We have set up a network with a pre-activation output z in the last layer. Applying softmax will give the final model output.

input X --> some network --> z --> y_model = softmax(z)
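
Here, the softmax converts the pre-activations z into a probability distribution over the three classes:

$$\mathrm{softmax}(z)_k = \frac{e^{z_k}}{\sum_j e^{z_j}}$$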

We quantify the agreement between the truth y and the model output using the categorical cross-entropy, averaged over the $N$ samples:

$$J = -\frac{1}{N} \sum_{i=1}^{N} \sum_{k} y_{i,k} \, \log\left(y_{\mathrm{model}}(x_i)_k\right)$$

In the following, you are to implement the softmax and the categorical cross-entropy yourself and evaluate them for the given values of z.

In [1]:
import numpy as np
Data: 4 samples with the following class labels (input features X irrelevant here)
In [2]:
y_cl = np.array([0, 0, 2, 1])
output of the last network layer before applying softmax
In [3]:
z = np.array([
    [4,    5,    1],
    [-1,  -2,   -3],
    [0.1, 0.2, 0.3],
    [-1,  17,    1]
    ]).astype(np.float32)

Task 1)

Write a function that turns any class labels y_cl into one-hot encodings y.

0 --> (1, 0, 0)

1 --> (0, 1, 0)

2 --> (0, 0, 1)

Make sure that np.shape(y) = (4, 3) for np.shape(y_cl) = (4,).

In [4]:
def to_onehot(y_cl, num_classes):
    # one row per sample, one column per class
    y = np.zeros((len(y_cl), num_classes))
    y[np.arange(len(y_cl)), y_cl] = 1  # set the entry of the true class to 1
    return y

y = to_onehot(y_cl, num_classes=3)
print('one-hot encoding of data labels')
print(y)
one-hot encoding of data labels
[[1. 0. 0.]
 [1. 0. 0.]
 [0. 0. 1.]
 [0. 1. 0.]]
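
As a sanity check (a sketch, assuming TensorFlow/Keras is installed), the same encoding is provided by tf.keras.utils.to_categorical:

In [ ]:
import tensorflow as tf

# should print True: our one-hot encoding matches the Keras utility
print(np.allclose(y, tf.keras.utils.to_categorical(y_cl, num_classes=3)))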

Task 2)

Write a function that returns the softmax of the input z along the last axis.

In [5]:
def softmax(z):
    # shift by the row-wise maximum for numerical stability
    # (softmax is invariant under adding a constant to all entries)
    expz = np.exp(z - np.max(z, axis=-1, keepdims=True))
    return expz / np.sum(expz, axis=-1, keepdims=True)

y_model = softmax(z)
print('softmax(z)')
print(y_model)
softmax(z)
[[2.6538792e-01 7.2139925e-01 1.3212887e-02]
 [6.6524100e-01 2.4472848e-01 9.0030573e-02]
 [3.0060962e-01 3.3222499e-01 3.6716542e-01]
 [1.5229979e-08 9.9999994e-01 1.1253517e-07]]
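
As a cross-check (assuming SciPy >= 1.2 is available), scipy.special.softmax computes the same quantity along a given axis:

In [ ]:
from scipy.special import softmax as scipy_softmax

# should print True: our softmax matches the SciPy reference
print(np.allclose(y_model, scipy_softmax(z, axis=-1)))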

Task 3)

Compute the categorical cross-entropy between data and model.

In [6]:
crossentropy = -np.mean(np.sum(y * np.log(y_model), axis=1))  # sum over classes, mean over samples
crossentropy = -np.mean(np.log(y_model[np.arange(len(y_cl)), y_cl]))  # equivalent: pick the true-class probabilities directly
print('cross entropy = %f' % crossentropy)
cross entropy = 0.684028
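
For comparison (a sketch, assuming TensorFlow/Keras is installed), tf.keras.losses.categorical_crossentropy returns one loss value per sample; averaging over the batch should reproduce the value above:

In [ ]:
import tensorflow as tf

losses = tf.keras.losses.categorical_crossentropy(y, y_model)
print(float(tf.reduce_mean(losses)))  # expected: approx. 0.684028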

Task 4)

Determine which classes are predicted by the model (maximum prediction).

In [7]:
y_model_cl = np.argmax(y_model, axis=1)
print('true class labels = ', y_cl)
print('predicted class labels =', y_model_cl)
true class labels =  [0 0 2 1]
predicted class labels = [1 0 2 1]

Task 5)

Compute the fraction of correctly classified samples (accuracy).

In [8]:
accuracy = np.mean(y_model_cl == y_cl)
print('accuracy = %.2f' % accuracy)
accuracy = 0.75
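
The same number can be cross-checked (assuming scikit-learn is installed) with sklearn.metrics.accuracy_score:

In [ ]:
from sklearn.metrics import accuracy_score

print(accuracy_score(y_cl, y_model_cl))  # expected: 0.75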