Additional readings (To go further):

- Ian Goodfellow, Yoshua Bengio and Aaron Courville, Deep Learning
- Saharon Rosset, Ji Zhu and Trevor Hastie, Margin Maximizing Loss Functions

The assignment is divided into three parts. In the first part, we will go back to neural networks. You will be asked to build and train a convolutional neural network for image classification. In the second part, we will focus on the max margin classifier and study how such a classifier can be learned by means of gradient descent. Finally, in the last part, we will implement a principal component decomposition of a video sequence to extract moving targets from their background.

In this first question, we will use the Keras API to build and train a convolutional neural network to discriminate between four types of road signs. To keep things simple, we will consider the following four signs:

- A '30 km/h' sign (folder 1)
- A 'Stop' sign
- A 'Go straight' sign
- A 'Keep left' sign

An example of each sign is given below.

In [2]:

```
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
img1 = mpimg.imread('1/00001_00000_00012.png')
plt.subplot(141)
plt.imshow(img1)
plt.axis('off')
plt.subplot(142)
img2 = mpimg.imread('2/00014_00001_00019.png')
plt.imshow(img2)
plt.axis('off')
plt.subplot(143)
img3 = mpimg.imread('3/00035_00008_00023.png')
plt.imshow(img3)
plt.axis('off')
plt.subplot(144)
img4 = mpimg.imread('4/00039_00000_00029.png')
plt.imshow(img4)
plt.axis('off')
plt.show()
```

In this exercise, you need to build and train a convolutional neural network to discriminate between the four types of signs.

Before building the network, you should start by cropping the images so that they all have a common predefined size (take the smallest size across all images).
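One possible way to do the cropping is sketched below. It assumes a central crop and square output images, neither of which is imposed by the assignment. The helper `center_crop` and the stand-in arrays are hypothetical; in practice the arrays would come from `mpimg.imread`.

```python
import numpy as np

def center_crop(img, size):
    """Return the central size x size window of an (H, W, ...) image array."""
    h, w = img.shape[:2]
    top = (h - size) // 2
    left = (w - size) // 2
    return img[top:top + size, left:left + size]

# Stand-in images of different sizes (assumption: real ones come from mpimg.imread)
images = [np.zeros((40, 36, 3)), np.zeros((32, 50, 3)), np.zeros((45, 45, 3))]

# Smallest side length found across all images
IMG_SIZE = min(min(img.shape[:2]) for img in images)

# Crop every image to IMG_SIZE x IMG_SIZE and stack them into a single array
cropped = np.stack([center_crop(img, IMG_SIZE) for img in images])
print(cropped.shape)
```

The resulting array has the layout (number of images, height, width, channels), which is the layout Keras expects by default.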

We will use a **Sequential model** from Keras but it will be up to you to define the structure of the convolution net. Initialization of the sequential model can be done with the following line

In [ ]:

```
from tensorflow.keras.models import Sequential
model = Sequential()
```

We will use a **convolutional** architecture. You can add convolutional layers to the model by using the following lines

In [ ]:

```
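from tensorflow.keras.layers import Conv2D

# input_shape uses the channels-last layout (height, width, channels), the Keras default
model.add(Conv2D(num_units, (filter_size1, filter_size2), padding='same',
                 input_shape=(IMG_SIZE, IMG_SIZE, 3),
                 activation='relu'))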
```

for the first layer, and the following line for the subsequent layers:

In [ ]:

```
# input_shape is only needed for the first layer; later layers infer it
model.add(Conv2D(filters, filter_size, activation=activation))
```

On top of the convolutional layers, convolutional neural networks (CNN) also often rely on **Pooling layers**. The addition of such a layer can be done through the following line

In [ ]:

```
model.add(MaxPooling2D(pool_size=(filter_sz1, filter_sz2),strides=None))
```

The *pooling layers* usually come with two parameters: the 'pool size' and the 'stride'. The basic choice for the pool size is (2,2), and the stride is usually set to None, in which case Keras uses a stride equal to the pool size, so that the image is split into non-overlapping regions (such as in the Figure below). You should however feel free to play a little with those parameters. The **MaxPool operator** considers a mask of size 'pool_size' which is slid over the image by a number of pixels equal to the stride (in x and y, there are hence two translation parameters). For each position of the mask, the output only retains the maximum of the pixels appearing in the mask (this idea is illustrated below). One way to understand the effect of the pooling operator is the following: if a filter detects an edge in a subregion of the image (thus returning at least one large value), then, although MaxPooling reduces the size of the feature map (and hence the number of parameters in the subsequent dense layers), it keeps track of this information.

Adding 'Maxpooling' layers is known to work well in practice.
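To make the MaxPool operator concrete, here is a minimal NumPy sketch of non-overlapping max pooling on a single 2D feature map. The function `max_pool_2d` is a hypothetical helper written for illustration. It is not the Keras implementation, and it assumes a square pool with a stride equal to the pool size.

```python
import numpy as np

def max_pool_2d(x, pool=2):
    """Max pooling with a stride equal to the pool size (non-overlapping windows)."""
    h, w = x.shape
    # Drop trailing rows/columns that do not fill a complete window
    h, w = h - h % pool, w - w % pool
    # Group the pixels into (pool x pool) windows, then take the max of each window
    windows = x[:h, :w].reshape(h // pool, pool, w // pool, pool)
    return windows.max(axis=(1, 3))

x = np.array([[1, 3, 2, 0],
              [4, 2, 1, 1],
              [0, 0, 5, 6],
              [1, 2, 7, 8]])
pooled = max_pool_2d(x)
print(pooled)  # the four window maxima are 4, 2, 2 and 8
```

The 4 x 4 input is reduced to a 2 x 2 output, yet the largest response of each region survives.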

Once you have stacked the convolutional and pooling layers, you should flatten the output and then add one or more fully connected (dense) layers, through lines of the form

In [ ]:

```
model.add(Flatten())
```

In [ ]:

```
model.add(Dense(num_units, activation='relu'))
```

Since there are four possible signs, you need to **finish your network with a dense layer with 4 units**. Together, those units should output four numbers between 0 and 1, each representing the likelihood that the corresponding sign is detected, and such that $p_1 + p_2 + p_3 + p_4 = 1$ (hopefully with one probability much larger than the others). For this reason, a good choice for the **final activation function** of those four units is the **softmax** (Why?).
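As a small numerical illustration of the softmax, the sketch below maps four arbitrary scores (made-up values, not the output of any trained network) to four positive numbers that sum to one. The function is a plain NumPy version written for this example.

```python
import numpy as np

def softmax(z):
    """Map a vector of real scores to positive numbers that sum to 1."""
    z = z - np.max(z)          # shift for numerical stability; the result is unchanged
    e = np.exp(z)
    return e / e.sum()

# Hypothetical pre-activation values of the four output units
scores = np.array([2.0, 0.5, -1.0, 0.1])
p = softmax(scores)
print(p, p.sum())
```

The largest score receives the largest probability, and the exponential accentuates the gap between it and the others.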

Build your model below.

In [ ]:

```
model = Sequential()
# construct the model using convolutional layers, dense fully connected layers and
```

Once you have found a good architecture for your network, split the dataset by retaining about 90% of the images of each folder for training and the remaining 10% for testing. To train your network in Keras, we need two more steps. The first step is to set up the optimizer. Here again, it is up to you to decide how you want to set up the optimization. Two popular approaches are **SGD and Adam**. You will get to choose the learning rate. This rate should however be between 1e-3 and 1e-2. Once you have set up the optimizer, we need to set up the optimization parameters. This includes the loss (we will take it to be the **categorical cross entropy**, which is the extension of the log loss to the multiclass problem).

In [ ]:

```
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.optimizers import Adam

# set up the optimizer here, with a learning rate between 1e-3 and 1e-2, e.g.
# Myoptimizer = SGD(learning_rate=1e-2)
Myoptimizer = Adam(learning_rate=1e-3)

model.compile(loss='categorical_crossentropy',
              optimizer=Myoptimizer,
              metrics=['accuracy'])
```
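The 90%/10% split per folder, together with the one-hot targets that the categorical cross entropy expects, could be sketched as follows. The folder sizes and the helper names `split_indices` and `one_hot` are assumptions made for illustration; only the split proportions and the four classes come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def split_indices(n_images, test_fraction=0.1):
    """Shuffle the indices 0..n_images-1 and return (train_idx, test_idx)."""
    perm = rng.permutation(n_images)
    n_test = max(1, int(round(test_fraction * n_images)))
    return perm[n_test:], perm[:n_test]

def one_hot(labels, num_classes=4):
    """Encode integer labels 0..num_classes-1 as rows of the identity matrix."""
    return np.eye(num_classes)[labels]

# Hypothetical number of images in each of the four folders (labels 0 to 3)
folder_sizes = {0: 50, 1: 40, 2: 30, 3: 20}
for label, n in folder_sizes.items():
    train_idx, test_idx = split_indices(n)
    print(label, len(train_idx), len(test_idx))

print(one_hot(np.array([0, 3])))
```

Splitting each folder separately keeps the class proportions the same in the training and test sets.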

The last step is to fit the network to your data. Just as for any model in scikit-learn, we use a call to the function 'fit'. Neural networks are usually trained by splitting the dataset into minibatches and using a different batch at each SGD step. This process is repeated until the whole dataset has been used. A complete pass over the dataset is called an epoch. We can then repeat this idea several times. In Keras, the number of epochs is set through the 'epochs' parameter and the batch size through the 'batch_size' parameter.

In [ ]:

```
batch_size = 32
epochs = 30
model.fit(X, t, batch_size=batch_size, epochs=epochs, validation_split=0.2)
```
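The relation between minibatches and epochs can be illustrated without Keras. The sketch below only enumerates the index batches that would be visited; it is a simplified picture with made-up sizes, not a description of what `fit` does internally.

```python
import math
import numpy as np

def minibatches(n_samples, batch_size, rng):
    """Yield index arrays that together cover a shuffled dataset exactly once."""
    perm = rng.permutation(n_samples)
    for start in range(0, n_samples, batch_size):
        yield perm[start:start + batch_size]

# Hypothetical sizes chosen for illustration
n_samples, batch_size, epochs = 100, 32, 3
rng = np.random.default_rng(0)

# One epoch = one complete pass over the data; the last batch may be smaller
steps_per_epoch = math.ceil(n_samples / batch_size)

for epoch in range(epochs):
    sizes = [len(batch) for batch in minibatches(n_samples, batch_size, rng)]
    print(epoch, sizes)
```

With 100 samples and batches of 32, each epoch consists of 4 gradient steps, the last one on only 4 samples.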

Consider the dataset below. We would like to learn a classifier for this dataset that maximizes the margin (i.e. such that the distance between the decision boundary and the closest training points is maximized). We have seen that one can solve this problem by means of the constrained formulation

\begin{align*} \min_{\mathbf{\beta}} \quad & \|\mathbf{\beta}\|^2 \\ \text{subject to} \quad & y(\mathbf{x}^{(i)})t^{(i)} \geq 1, \quad i = 1, \ldots, N \end{align*}

where $y(\mathbf{x}^{(i)}) = \mathbf{\beta}^T\mathbf{x}^{(i)} + \beta_0$ and $t^{(i)} \in \{-1, +1\}$ is the label of $\mathbf{x}^{(i)}$. We might sometimes want to use a (softer) unconstrained formulation. In particular, when selecting this option, we can use the following function, known as the *Hinge loss*:

\begin{align*} \ell(\mathbf{x}^{(i)}) = \max(0, 1 - t^{(i)}y(\mathbf{x}^{(i)})) \end{align*}

For such a loss, we can derive a softer, unconstrained version of the problem as

\begin{align*} \min_{\mathbf{\beta}} \quad & \|\mathbf{\beta}\|^2 + \frac{C}{N}\sum_{i=1}^N \max(0, 1-t^{(i)}(\mathbf{\beta}^T\mathbf{x}^{(i)}+\beta_0)) \end{align*}

In short, we penalize a point only if it lies on the wrong side of the margin, i.e. if $t^{(i)}(\mathbf{\beta}^T\mathbf{x}^{(i)}+\beta_0) < 1$.
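To see which points are penalized, the sketch below evaluates the hinge term for a few values of $t^{(i)}y(\mathbf{x}^{(i)})$, as well as the full unconstrained objective on a toy example. The function names and the toy data are assumptions made for illustration; the gradient is deliberately left out, since it is part of the exercise.

```python
import numpy as np

def hinge(margins):
    """Hinge term max(0, 1 - m) applied to each value m = t * y(x)."""
    return np.maximum(0.0, 1.0 - margins)

def soft_margin_objective(beta, beta0, X, t, C):
    """||beta||^2 + (C / N) * sum_i max(0, 1 - t_i * (beta^T x_i + beta0))."""
    margins = t * (X @ beta + beta0)
    return beta @ beta + C / len(t) * hinge(margins).sum()

# Well classified, exactly on the margin, inside the margin, misclassified
margins = np.array([2.0, 1.0, 0.5, -1.0])
print(hinge(margins))  # penalties 0, 0, 0.5 and 2

# Toy data: two points in the plane with labels +1 and -1
X = np.array([[1.0, 0.0], [0.0, 1.0]])
t = np.array([1.0, -1.0])
value = soft_margin_objective(np.array([1.0, 0.0]), 0.0, X, t, C=2.0)
print(value)
```

Only the last two points contribute to the sum: the first two satisfy the constraint of the original formulation and cost nothing.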

In [2]:

```
import numpy as np
import matplotlib.pyplot as plt
from scipy.io import loadmat
pointsClass1 = loadmat('KernelPointsEx4class1.mat')['PointsEx4class1']
pointsClass2 = loadmat('KernelPointsEx4class2.mat')['PointsEx4class2']
plt.scatter(pointsClass1[:,0], pointsClass1[:,1], c='r')
plt.scatter(pointsClass2[:,0], pointsClass2[:,1], c='b')
plt.show()
```

Start by completing the function below which should return the value and gradient of the hinge loss at a point $\mathbf{x}^{(i)}$. What is the gradient of the hinge loss?

In [ ]:

```
def HingeLoss(x):
    '''Returns the value and gradient of the hinge
    loss at the point x'''
    return value, gradient
```

Once you have the function, implement a function HingeLossSVC that takes as input a starting weight vector $\mathbf{\beta}$ and intercept $\beta_0$, as well as the set of training points and a value for the parameter $C$, and returns the maximum margin classifier.

In [ ]:

```
def HingeLossSVC(beta_init, beta0_init, training, C):
    '''Returns the maximal margin classifier for the
    training dataset'''
    return beta, beta0
```

Upload a picture of yourself (possibly downsampled) and apply a Kmeans segmentation in the RGB space for a few distinct numbers of centroids (e.g. 5, 10, 20).
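A minimal sketch of such a segmentation is given below. It uses a plain NumPy version of Lloyd's iterations rather than a library call (scikit-learn's `KMeans` would be a natural alternative), and a random array as a stand-in for the photo. The function names, the number of iterations and the random initialization are all assumptions.

```python
import numpy as np

def nearest_centroid(pixels, centroids):
    """Index of the closest centroid (squared Euclidean distance) for each pixel."""
    dists = ((pixels[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def kmeans_segment(img, k, n_iter=20, seed=0):
    """Replace every pixel of an (H, W, 3) image by the colour of its centroid."""
    rng = np.random.default_rng(seed)
    pixels = img.reshape(-1, 3).astype(float)
    # Initialise the centroids with k distinct pixels drawn at random
    centroids = pixels[rng.choice(len(pixels), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = nearest_centroid(pixels, centroids)
        for j in range(k):
            members = pixels[labels == j]
            if len(members) > 0:      # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    labels = nearest_centroid(pixels, centroids)
    return centroids[labels].reshape(img.shape), labels.reshape(img.shape[:2])

# Stand-in for the photo: a small random RGB image with values in [0, 1]
img = np.random.default_rng(1).random((32, 32, 3))
for k in (5, 10, 20):
    segmented, labels = kmeans_segment(img, k)
    print(k, segmented.shape, len(np.unique(labels)))
```

The distance computation builds an array with one entry per pixel, centroid and channel, which is one reason to downsample a real photo before running it. The segmented image can be displayed with `plt.imshow(segmented)`.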

In [ ]:

```
```