Lab 1: Neural networks on the MNIST database

1) The MNIST database

The MNIST data poses a supervised learning problem, more precisely a classification problem: the objective is to build a model that recognizes handwritten digits from images. The training set contains 60000 examples, each a 28×28 array, which gives 784 features per image. The test set contains 10000 examples. The evaluation criterion is the test error rate (i.e. the fraction of misclassified images).

So far, the best performance reported for a fully connected neural network is a test error rate of 0.35%. Convolutional neural networks have reached better results: 0.23%, the best performance reported so far.
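
The test error rate defined above can be sketched in a few lines of NumPy. The helper name `error_rate` and the label arrays below are made up for illustration; they are not MNIST results. On the 10000 test images, an error rate of 0.35% corresponds to 35 misclassified images.

```python
import numpy as np

def error_rate(y_true, y_pred):
    """Fraction of examples whose predicted label differs from the true label."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(y_true != y_pred))

# Made-up labels for illustration only (not MNIST results)
y_true = [5, 0, 4, 1, 9, 2, 1, 3, 1, 4]
y_pred = [5, 0, 4, 1, 4, 2, 1, 3, 7, 4]   # two mistakes out of ten

rate = error_rate(y_true, y_pred)
print(rate)  # 0.2
```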

In [1]:

from keras.datasets import mnist
import numpy as np

# Load data through keras library
(X_train, Y_train), (X_test, Y_test) = mnist.load_data()

Using TensorFlow backend.

In [2]:

print(X_train.shape)
print(X_test.shape)

(60000, 28, 28)
(10000, 28, 28)

Here is an overview of the dataset (the first 20 training images):

In [3]:

import matplotlib.pyplot as plt

# Show the first 20 training images in a 5x4 grid
columns = 4
rows = 5
fig = plt.figure(figsize=(8, 8))
for i in range(columns * rows):
    ax = fig.add_subplot(rows, columns, i + 1)
    ax.imshow(X_train[i])
plt.show()

[Figure: the first 20 training images, shown in a 5×4 grid]

The corresponding labels can be checked as follows:

In [4]:

Y_train[:20]

Out[4]:

array([5, 0, 4, 1, 9, 2, 1, 3, 1, 4, 3, 5, 3, 6, 1, 7, 2, 8, 6, 9],
      dtype=uint8)

In [5]:

# Flatten each 28x28 image into a 784-dimensional vector
X_train = X_train.reshape(X_train.shape[0], 28*28)
X_test = X_test.reshape(X_test.shape[0], 28*28)

# Convert to float32 and scale pixel values from [0, 255] to [0, 1]
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255

In [6]:

# One-hot encode the labels (10 classes)
from keras.utils import np_utils
Y_train = np_utils.to_categorical(Y_train, 10)
Y_test = np_utils.to_categorical(Y_test, 10)
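
To make the label encoding concrete, here is a minimal NumPy sketch of one-hot encoding. It illustrates the idea only and is not the Keras implementation; the helper name `one_hot` is an assumption introduced for this example.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (n_samples, num_classes) float32 array with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

first_labels = [5, 0, 4]           # first three training labels shown earlier
encoded = one_hot(first_labels, 10)
print(encoded[0])                  # 1 at index 5, zeros elsewhere
decoded = encoded.argmax(axis=1)   # argmax recovers the integer labels
```

Each row has exactly one non-zero entry, which matches the 10-unit softmax output layer used with the categorical cross-entropy loss.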

2) Neural Networks

The dataset is now ready for processing. We'll use the Keras library to train and test neural networks with different sets of parameters.

In [7]:

# Import modules
from keras.models import Sequential, Model
from keras.layers import Dense, Activation, Input

Here are the parameters chosen:

In [8]:

# Parameters
activation = 'relu'  # activation function
optimizer = 'adam'  # optimization
loss = 'categorical_crossentropy' # Cost function
batch_size = 30 # Batch size 
epochs = 5 # Epochs

In [9]:

# Define the network structure with the functional API
inputs = Input(shape=(784,))
x = Dense(200, activation=activation)(inputs)    # First hidden layer with 200 nodes
x = Dense(200, activation=activation)(x)         # Second hidden layer with 200 nodes
predictions = Dense(10, activation='softmax')(x) # Output layer: one probability per digit

# Build the model from its input and output tensors
model = Model(inputs=inputs, outputs=predictions)
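
As a rough check on the size of this network, the number of trainable parameters can be worked out by hand: a dense layer with bias terms holds n_in × n_out weights plus n_out biases. The names `dense_params` and `layer_sizes` below are illustrative and not part of the lab code.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out

# Input, two hidden layers, output
layer_sizes = [784, 200, 200, 10]

per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total_params = sum(per_layer)

print(per_layer)     # [157000, 40200, 2010]
print(total_params)  # 199210
```

Most of the parameters sit in the first hidden layer, because it is connected to all 784 input pixels.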

In [10]:

model.compile(optimizer=optimizer, loss=loss, metrics=['accuracy'])   # optimization parameters + cost function

In [11]:

import time
t = time.time()
# Train model
model.fit(X_train, Y_train, epochs=epochs, batch_size=batch_size)
t = time.time() - t
print("Training time : " + str(t) + " sec")

Epoch 1/5
60000/60000 [==============================] - 19s 324us/step - loss: 0.2083 - acc: 0.9375
Epoch 2/5
60000/60000 [==============================] - 19s 311us/step - loss: 0.0897 - acc: 0.9720
Epoch 3/5
60000/60000 [==============================] - 19s 320us/step - loss: 0.0610 - acc: 0.9801
Epoch 4/5
60000/60000 [==============================] - 19s 309us/step - loss: 0.0459 - acc: 0.9852
Epoch 5/5
60000/60000 [==============================] - 18s 298us/step - loss: 0.0368 - acc: 0.9880
Training time : 94.06952619552612 sec
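
The batch size and epoch count determine how many weight updates this run performs. The short calculation below uses the values from this notebook; the per-update time is only a rough average derived from the reported training time, and it includes any overhead.

```python
import math

n_train = 60000      # training examples
batch_size = 30
epochs = 5

steps_per_epoch = math.ceil(n_train / batch_size)   # mini-batches per epoch
total_updates = steps_per_epoch * epochs            # weight updates overall
print(steps_per_epoch, total_updates)  # 2000 10000

# Rough average wall-clock time per update, from the ~94 s reported above
seconds_per_update = 94.07 / total_updates
```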

In [12]:

# Testing the model
score = model.evaluate(X_test, Y_test, batch_size=10)
print(score)
print("Accuracy : " + str(score[1]))
print("Test error rate : " + str(1-score[1]))

10000/10000 [==============================] - 3s 321us/step
[0.07905915418734595, 0.978799996137619]
Accuracy : 0.978799996137619
Test error rate : 0.021200003862380967

With this set of parameters, we obtain a test error rate of about 2.1%, which is better than the 3% required by the assignment. Objective completed!
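
To see which images account for the remaining errors, one possible approach is to compare the argmax of the predicted probabilities with the one-hot labels. The sketch below uses small made-up arrays with three classes as stand-ins for the output of `model.predict(X_test)` and for `Y_test`; the numbers are hypothetical.

```python
import numpy as np

# Hypothetical stand-ins for model.predict(X_test) and the one-hot Y_test
probs = np.array([
    [0.1, 0.8, 0.1],   # predicted class 1
    [0.7, 0.2, 0.1],   # predicted class 0
    [0.2, 0.3, 0.5],   # predicted class 2
    [0.6, 0.3, 0.1],   # predicted class 0
])
y_onehot = np.array([
    [0, 1, 0],
    [1, 0, 0],
    [0, 1, 0],         # true class 1, predicted 2: an error
    [1, 0, 0],
])

pred = probs.argmax(axis=1)              # most probable class per example
true = y_onehot.argmax(axis=1)           # integer label per example
wrong = np.flatnonzero(pred != true)     # indices of misclassified examples
toy_error_rate = wrong.size / len(true)
print(wrong, toy_error_rate)
```

With the real test set, the indices in `wrong` could be used to display the misclassified digits.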
