Our goal in this notebook is to build a neural network classifier in Keras that recognizes handwritten digits. We will use the MNIST dataset, which consists of 70,000 images of handwritten digits spanning 0-9.
Here we import the libraries that we will use throughout the notebook:
from keras.datasets import mnist                              # MNIST loader
from keras.preprocessing.image import load_img, array_to_img  # image conversion utilities
from keras.utils.np_utils import to_categorical               # one-hot encoding
from keras.models import Sequential                           # linear stack of layers
from keras.layers import Dense                                # fully connected layer
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
Using TensorFlow backend.
(X_train, y_train), (X_test, y_test) = mnist.load_data()
Downloading data from https://s3.amazonaws.com/img-datasets/mnist.npz
11493376/11490434 [==============================] - 1s 0us/step
print(type(X_train))
print(X_train.shape)  # 60k training images
print(y_train.shape)  # 60k labels, one per training image
print(X_test.shape)   # 10k test images
print(y_test.shape)   # 10k test labels
<class 'numpy.ndarray'>
(60000, 28, 28)
(60000,)
(10000, 28, 28)
(10000,)
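Before preprocessing, it can help to eyeball more than one sample. Below is a minimal sketch (assuming the arrays loaded above, before any reshaping) that plots the first five digits in a row:

# Sketch: show the first five training digits side by side
fig, axes = plt.subplots(1, 5, figsize=(10, 2))
for i, ax in enumerate(axes):
    ax.imshow(X_train[i], cmap='gray')  # each image is still a 28x28 array here
    ax.set_title(y_train[i])            # labels are plain integers at this point
    ax.axis('off')
plt.show()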
# Let's look at the data to see what it looks like
print(X_train[0].shape)  # the size of the first entry
# Plot it to see what it looks like
plt.imshow(X_train[0])
# Print the answer
print("The answer is {}".format(y_train[0]))
(28, 28)
The answer is 5
image_height, image_width = 28, 28
# Let's reshape each image to be a single vector rather than a matrix
# We have to flatten the images to feed them into the neural net
X_train = X_train.reshape(60000, image_height * image_width)
X_test = X_test.reshape(10000, image_height * image_width)
print(X_train.shape)  # 28 x 28 = 784
print(X_test.shape)
(60000, 784)
(10000, 784)
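As an aside, the same flattening can be written without hard-coding the sizes; a sketch using NumPy's -1 placeholder, which infers the flattened length from the data:

# Shape-agnostic equivalent: -1 tells NumPy to infer 784 from the data,
# so the hard-coded 60000/10000 sample counts are not needed
X_train = X_train.reshape(X_train.shape[0], -1)
X_test = X_test.reshape(X_test.shape[0], -1)
print(X_train.shape, X_test.shape)  # (60000, 784) (10000, 784)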
# Check the pixel range; values should span 0-255
print(min(X_train[0]), max(X_train[0]))  # it does, so we need to normalize
# We will convert the data to float (instead of int) so we can scale it to 0-1 (instead of 0-255)
X_train = X_train.astype('float32')  # convert to float
X_test = X_test.astype('float32')    # convert to float
0 255
#Normalize the data
X_train /= 255.0
X_test /= 255.0
print(min(X_train[0]), max(X_train[0])) #Normalized
0.0 1.0
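The check above only looks at the first image; here is a quick sketch that verifies the full arrays now sit in the range [0, 1]:

# Sanity check over the entire arrays, not just the first image
print(X_train.min(), X_train.max())  # expect 0.0 1.0
print(X_test.min(), X_test.max())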
# We want the output to fall into one of 10 bins, one for each of the digits 0-9
# To do this we convert the answers to categorical (one-hot) values
# using the 'to_categorical' method
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)
print(y_train.shape)
print(y_test.shape)
(60000, 10)
(10000, 10)
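Under the hood, to_categorical is just one-hot encoding. A minimal NumPy sketch of the same idea, using a small hypothetical labels array for illustration:

# One-hot encode by hand: np.eye(10) is the 10x10 identity matrix, and
# indexing its rows with the labels picks out the matching unit vectors
labels = np.array([5, 0, 4])  # hypothetical raw labels
one_hot = np.eye(10)[labels]
print(one_hot)                # row i has a single 1 at column labels[i]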
print(y_test[0])
plt.imshow(X_test[0].reshape(image_height, image_width))
[0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]
<matplotlib.image.AxesImage at 0x7fb01975ee80>
# Instantiate a Sequential model
model = Sequential()
#Add layers to the model
model.add(Dense(512, activation='relu', input_shape=(784,)))
model.add(Dense(512, activation='relu'))
model.add(Dense(10, activation='softmax'))
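The softmax activation on the output layer turns the 10 raw scores into probabilities that sum to 1, which is what lets us read the output as a confidence per digit. A minimal NumPy sketch of the function itself:

# Sketch of softmax: exponentiate, then normalize to sum to 1
def softmax(z):
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))  # probabilities summing to 1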
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_1 (Dense)              (None, 512)               401920
_________________________________________________________________
dense_2 (Dense)              (None, 512)               262656
_________________________________________________________________
dense_3 (Dense)              (None, 10)                5130
=================================================================
Total params: 669,706
Trainable params: 669,706
Non-trainable params: 0
_________________________________________________________________
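The parameter counts in the summary can be verified by hand: a Dense layer with n_in inputs and n_out units has n_in * n_out weights plus n_out biases. A quick check:

# Recompute the summary's parameter counts from first principles
def dense_params(n_in, n_out):
    return n_in * n_out + n_out  # weight matrix plus bias vector

print(dense_params(784, 512))  # 401920, matches dense_1
print(dense_params(512, 512))  # 262656, matches dense_2
print(dense_params(512, 10))   # 5130, matches dense_3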
Now we can train our model. To do this we have to pass the training data and labels, the number of epochs, and the validation data.
history = model.fit(X_train, y_train, epochs=20, validation_data=(X_test, y_test))
Train on 60000 samples, validate on 10000 samples
Epoch 1/20
60000/60000 [==============================] - 24s 394us/step - loss: 0.1828 - acc: 0.9440 - val_loss: 0.0929 - val_acc: 0.9705
Epoch 2/20
60000/60000 [==============================] - 23s 386us/step - loss: 0.0808 - acc: 0.9757 - val_loss: 0.0827 - val_acc: 0.9743
Epoch 3/20
60000/60000 [==============================] - 23s 389us/step - loss: 0.0565 - acc: 0.9828 - val_loss: 0.0695 - val_acc: 0.9786
Epoch 4/20
60000/60000 [==============================] - 23s 381us/step - loss: 0.0429 - acc: 0.9864 - val_loss: 0.0832 - val_acc: 0.9774
Epoch 5/20
60000/60000 [==============================] - 23s 386us/step - loss: 0.0353 - acc: 0.9887 - val_loss: 0.0921 - val_acc: 0.9745
Epoch 6/20
60000/60000 [==============================] - 23s 387us/step - loss: 0.0287 - acc: 0.9910 - val_loss: 0.0819 - val_acc: 0.9782
Epoch 7/20
60000/60000 [==============================] - 23s 387us/step - loss: 0.0272 - acc: 0.9914 - val_loss: 0.0807 - val_acc: 0.9802
Epoch 8/20
60000/60000 [==============================] - 24s 395us/step - loss: 0.0237 - acc: 0.9924 - val_loss: 0.1136 - val_acc: 0.9771
Epoch 9/20
60000/60000 [==============================] - 23s 390us/step - loss: 0.0201 - acc: 0.9938 - val_loss: 0.1083 - val_acc: 0.9800
Epoch 10/20
60000/60000 [==============================] - 23s 381us/step - loss: 0.0202 - acc: 0.9939 - val_loss: 0.1016 - val_acc: 0.9798
Epoch 11/20
60000/60000 [==============================] - 23s 390us/step - loss: 0.0170 - acc: 0.9951 - val_loss: 0.1167 - val_acc: 0.9783
Epoch 12/20
60000/60000 [==============================] - 23s 387us/step - loss: 0.0175 - acc: 0.9948 - val_loss: 0.1026 - val_acc: 0.9805
Epoch 13/20
60000/60000 [==============================] - 23s 381us/step - loss: 0.0179 - acc: 0.9950 - val_loss: 0.1039 - val_acc: 0.9811
Epoch 14/20
60000/60000 [==============================] - 23s 377us/step - loss: 0.0155 - acc: 0.9956 - val_loss: 0.1173 - val_acc: 0.9809
Epoch 15/20
60000/60000 [==============================] - 22s 374us/step - loss: 0.0179 - acc: 0.9947 - val_loss: 0.1135 - val_acc: 0.9801
Epoch 16/20
60000/60000 [==============================] - 25s 415us/step - loss: 0.0126 - acc: 0.9965 - val_loss: 0.1391 - val_acc: 0.9792
Epoch 17/20
60000/60000 [==============================] - 24s 397us/step - loss: 0.0151 - acc: 0.9964 - val_loss: 0.1211 - val_acc: 0.9819
Epoch 18/20
60000/60000 [==============================] - 24s 400us/step - loss: 0.0159 - acc: 0.9962 - val_loss: 0.1208 - val_acc: 0.9800
Epoch 19/20
60000/60000 [==============================] - 24s 403us/step - loss: 0.0166 - acc: 0.9960 - val_loss: 0.1309 - val_acc: 0.9808
Epoch 20/20
60000/60000 [==============================] - 24s 397us/step - loss: 0.0133 - acc: 0.9965 - val_loss: 0.1310 - val_acc: 0.9813
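Notice that val_loss bottoms out around epoch 3 and then climbs while the training loss keeps falling, a classic sign of overfitting. One common remedy, not used in this notebook, is an EarlyStopping callback; a hedged sketch (restore_best_weights requires a reasonably recent Keras):

# Sketch only: stop training when val_loss stops improving
from keras.callbacks import EarlyStopping
early_stop = EarlyStopping(monitor='val_loss', patience=3,
                           restore_best_weights=True)
# history = model.fit(X_train, y_train, epochs=20,
#                     validation_data=(X_test, y_test),
#                     callbacks=[early_stop])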
#Look at the attributes in the history object to find the accuracy
history.__dict__
{'epoch': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19], 'history': {'acc': [0.9439666666666666, 0.9757333333333333, 0.9827833333333333, 0.9864166666666667, 0.9886666666666667, 0.9910333333333333, 0.9914, 0.99245, 0.9938166666666667, 0.9939, 0.99515, 0.9948, 0.9950166666666667, 0.9956333333333334, 0.9947333333333334, 0.9965166666666667, 0.9963833333333333, 0.9962166666666666, 0.99595, 0.99645], 'loss': [0.1827596818920225, 0.08079896697839722, 0.05645396511411139, 0.04291815567353721, 0.03526910000597515, 0.02873079521368248, 0.02715607473684601, 0.023650182965393438, 0.020055528101623546, 0.02019607128013062, 0.016955279541049723, 0.017472221146037314, 0.017864977817751575, 0.015457480643335983, 0.017869793417473495, 0.012631182595215281, 0.015135916414613901, 0.015882995463786898, 0.016569432756344288, 0.013335457366452594], 'val_acc': [0.9705, 0.9743, 0.9786, 0.9774, 0.9745, 0.9782, 0.9802, 0.9771, 0.98, 0.9798, 0.9783, 0.9805, 0.9811, 0.9809, 0.9801, 0.9792, 0.9819, 0.98, 0.9808, 0.9813], 'val_loss': [0.09286645495379343, 0.08266143489209934, 0.069480553943431, 0.08320518101718044, 0.09206652115154429, 0.08190068501315655, 0.08067529291427782, 0.11358439496830543, 0.10833151409866154, 0.10160923933375093, 0.11671308373045626, 0.10255490619101375, 0.10387474813488247, 0.11728941089477675, 0.11347036018394005, 0.13906407877868832, 0.12108565404413693, 0.120797497302599, 0.1309188434239974, 0.13095201672244552]}, 'model': <keras.engine.sequential.Sequential at 0x7fb0320f3400>, 'params': {'batch_size': 32, 'do_validation': True, 'epochs': 20, 'metrics': ['loss', 'acc', 'val_loss', 'val_acc'], 'samples': 60000, 'steps': None, 'verbose': 1}, 'validation_data': [array([[0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], ..., [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.]], dtype=float32), array([[0., 0., 0., ..., 1., 0., 0.], [0., 0., 1., ..., 0., 0., 0.], [0., 1., 0., ..., 0., 0., 0.], ..., [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.]], dtype=float32), array([1., 1., 1., ..., 1., 1., 1.], dtype=float32)]}
#Plot the accuracy
plt.plot(history.history['acc'],label='train')
plt.xlabel('Epoch Number')
plt.ylabel('Accuracy')
plt.title('Model Accuracy Over Epochs')
plt.legend()
<matplotlib.legend.Legend at 0x7fb013ca8dd8>
#Plot the accuracy of training data and validation data
plt.plot(history.history['acc'],label='train')
plt.plot(history.history['val_acc'],label='val')
plt.xlabel('Epoch Number')
plt.ylabel('Accuracy')
plt.title('Model Accuracy Over Epochs')
plt.legend()
<matplotlib.legend.Legend at 0x7fb013bfbfd0>
# Plot the accuracy of training data and validation data AND the loss
plt.plot(history.history['acc'], label='train')
plt.plot(history.history['val_acc'], label='val')
plt.plot(history.history['loss'], label='loss')
plt.xlabel('Epoch Number')
plt.ylabel('Accuracy / Loss')  # the loss shares an axis with accuracy here
plt.title('Model Accuracy and Loss Over Epochs')
plt.legend()
# plt.yscale('log')
<matplotlib.legend.Legend at 0x7fb013be43c8>
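Loss and accuracy live on different scales, so putting them on one axis squashes the accuracy curves toward the top of the plot. A sketch that gives the loss its own y-axis instead:

# Plot accuracy and loss on separate y-axes so neither is squashed
fig, ax1 = plt.subplots()
ax1.plot(history.history['acc'], label='train acc')
ax1.plot(history.history['val_acc'], label='val acc')
ax1.set_xlabel('Epoch Number')
ax1.set_ylabel('Accuracy')
ax2 = ax1.twinx()  # second y-axis sharing the same x-axis
ax2.plot(history.history['loss'], 'r--', label='loss')
ax2.set_ylabel('Loss')
ax1.legend(loc='lower right')
ax2.legend(loc='upper right')
plt.show()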
score = model.evaluate(X_test, y_test)
10000/10000 [==============================] - 1s 84us/step
# evaluate returns a list: [loss, accuracy]
# The second item gives us the accuracy of our model
score
[0.13095201672244552, 0.9813]
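Finally, to classify an individual image we can call predict and take the argmax over the 10 class probabilities. A minimal sketch using the first test image:

# Predict the class of the first test image
probs = model.predict(X_test[:1])  # shape (1, 10): one probability per digit
print(np.argmax(probs))            # predicted digit
print(np.argmax(y_test[0]))        # true digit (y_test is one-hot encoded now)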