from keras.datasets import mnist
from keras.preprocessing.image import load_img, array_to_img
from keras.utils.np_utils import to_categorical
from keras.models import Sequential
from keras.layers import Dense
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
Using TensorFlow backend.
(X_train, y_train), (X_test, y_test) = mnist.load_data()
Downloading data from https://s3.amazonaws.com/img-datasets/mnist.npz
11493376/11490434 [==============================] - 1s 0us/step
print(type(X_train))
print(X_train.shape)
print(y_train.shape) #60k labels (the answers)
print(X_test.shape) #10k entries
print(y_test.shape)
<class 'numpy.ndarray'>
(60000, 28, 28)
(60000,)
(10000, 28, 28)
(10000,)
#Let's look at the data to see what it looks like
print(X_train[0].shape) #Look at the size of the 1st entry
#Plot it to see what it looks like
plt.imshow(X_train[0])
#Print the answer
print("The answer is {}".format(y_train[0]))
(28, 28)
The answer is 5
image_height, image_width = 28, 28
#Let's reshape each image to be a single vector rather than a matrix
#We have to flatten each image to plug it into a dense neural net
X_train = X_train.reshape(60000, image_height*image_width)
X_test = X_test.reshape(10000, image_height*image_width)
print(X_train.shape) #28x28 = 784
print(X_test.shape)
(60000, 784)
(10000, 784)
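#Quick sanity check (a minimal sketch): reshaping only changes the view of the
#data, so we can un-flatten the first image and get the 28x28 matrix back
print(X_train[0].reshape(image_height, image_width).shape) #(28, 28)
#Note: reshape(-1, image_height*image_width) would infer the sample count for us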
#Check to see if the pixel values are between 0-255
print(min(X_train[0]), max(X_train[0])) #they are! so we need to normalize
#We will convert the data to float (instead of int) to scale it between 0-1 (not 0-255)
X_train = X_train.astype('float32') #Convert to float
X_test = X_test.astype('float32') #Convert to float
0 255
#Normalize the data
X_train /= 255.0
X_test /= 255.0
print(min(X_train[0]), max(X_train[0])) #Normalized
0.0 1.0
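#Note that min/max on X_train[0] only checks the first image
#A quick sketch to verify the whole dataset is normalized:
print(X_train.min(), X_train.max()) #0.0 1.0 across all 60k images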
# We want the output to fall into one of 10 bins, one for each of the digits 0-9
#In order to do this we can convert the answers to a categorical (one-hot) value
#We do this using the 'to_categorical' method
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)
print(y_train.shape)
print(y_test.shape)
(60000, 10)
(10000, 10)
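#For intuition, a small sketch of what to_categorical does
#(the labels 3 and 5 here are made-up examples, not from the dataset)
print(to_categorical(np.array([3, 5]), 10))
#[[0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
# [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]]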
print(y_test[0])
plt.imshow(X_test[0].reshape(image_height, image_width))
[0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]
<matplotlib.image.AxesImage at 0x7f83c951fe80>
#Assign the model type
model = Sequential()
#Add layers to the model
model.add(Dense(512, activation='relu', input_shape=(784,)))
model.add(Dense(512, activation='relu'))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_1 (Dense)              (None, 512)               401920
_________________________________________________________________
dense_2 (Dense)              (None, 512)               262656
_________________________________________________________________
dense_3 (Dense)              (None, 10)                5130
=================================================================
Total params: 669,706
Trainable params: 669,706
Non-trainable params: 0
_________________________________________________________________
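#A quick sketch to sanity-check the parameter counts above: each Dense layer
#has (inputs * units) weights plus one bias per unit
print(784*512 + 512) #dense_1: 401920
print(512*512 + 512) #dense_2: 262656
print(512*10 + 10)   #dense_3: 5130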
Now we can train our model. To do this we have to pass the training data and labels, the number of epochs, and the validation data.
history = model.fit(X_train, y_train, epochs=20, validation_data=(X_test, y_test))
Train on 60000 samples, validate on 10000 samples
Epoch 1/20
60000/60000 [==============================] - 27s 452us/step - loss: 0.1842 - acc: 0.9437 - val_loss: 0.1321 - val_acc: 0.9579
Epoch 2/20
60000/60000 [==============================] - 26s 437us/step - loss: 0.0789 - acc: 0.9759 - val_loss: 0.0777 - val_acc: 0.9754
Epoch 3/20
60000/60000 [==============================] - 26s 432us/step - loss: 0.0563 - acc: 0.9823 - val_loss: 0.0869 - val_acc: 0.9733
Epoch 4/20
60000/60000 [==============================] - 26s 426us/step - loss: 0.0431 - acc: 0.9864 - val_loss: 0.0694 - val_acc: 0.9800
Epoch 5/20
60000/60000 [==============================] - 26s 433us/step - loss: 0.0346 - acc: 0.9891 - val_loss: 0.0767 - val_acc: 0.9817
Epoch 6/20
60000/60000 [==============================] - 26s 435us/step - loss: 0.0301 - acc: 0.9905 - val_loss: 0.0891 - val_acc: 0.9782
Epoch 7/20
60000/60000 [==============================] - 26s 427us/step - loss: 0.0251 - acc: 0.9923 - val_loss: 0.1008 - val_acc: 0.9777
Epoch 8/20
60000/60000 [==============================] - 26s 435us/step - loss: 0.0205 - acc: 0.9935 - val_loss: 0.1029 - val_acc: 0.9775
Epoch 9/20
60000/60000 [==============================] - 26s 432us/step - loss: 0.0183 - acc: 0.9943 - val_loss: 0.0857 - val_acc: 0.9805
Epoch 10/20
60000/60000 [==============================] - 27s 443us/step - loss: 0.0205 - acc: 0.9940 - val_loss: 0.1044 - val_acc: 0.9815
Epoch 11/20
60000/60000 [==============================] - 26s 436us/step - loss: 0.0190 - acc: 0.9942 - val_loss: 0.1140 - val_acc: 0.9777
Epoch 12/20
60000/60000 [==============================] - 26s 429us/step - loss: 0.0146 - acc: 0.9958 - val_loss: 0.1239 - val_acc: 0.9775
Epoch 13/20
60000/60000 [==============================] - 26s 427us/step - loss: 0.0152 - acc: 0.9956 - val_loss: 0.1150 - val_acc: 0.9823
Epoch 14/20
60000/60000 [==============================] - 26s 427us/step - loss: 0.0149 - acc: 0.9957 - val_loss: 0.1235 - val_acc: 0.9800
Epoch 15/20
60000/60000 [==============================] - 25s 423us/step - loss: 0.0154 - acc: 0.9960 - val_loss: 0.1280 - val_acc: 0.9797
Epoch 16/20
60000/60000 [==============================] - 25s 422us/step - loss: 0.0166 - acc: 0.9956 - val_loss: 0.1625 - val_acc: 0.9742
Epoch 17/20
60000/60000 [==============================] - 25s 424us/step - loss: 0.0140 - acc: 0.9965 - val_loss: 0.1035 - val_acc: 0.9820
Epoch 18/20
60000/60000 [==============================] - 25s 422us/step - loss: 0.0153 - acc: 0.9960 - val_loss: 0.1205 - val_acc: 0.9829
Epoch 19/20
60000/60000 [==============================] - 25s 422us/step - loss: 0.0111 - acc: 0.9969 - val_loss: 0.1324 - val_acc: 0.9804
Epoch 20/20
60000/60000 [==============================] - 25s 420us/step - loss: 0.0166 - acc: 0.9962 - val_loss: 0.1513 - val_acc: 0.9795
#Look at the attributes in the history object to find the accuracy
history.__dict__
{'epoch': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
 'history': {'acc': [0.9437166666666666, 0.9758666666666667, 0.9822833333333333, 0.9864166666666667, 0.9891166666666666, 0.99045, 0.9922833333333333, 0.9935166666666667, 0.9943, 0.99405, 0.9942, 0.9957833333333334, 0.9956, 0.99565, 0.9959666666666667, 0.9955666666666667, 0.99645, 0.9960166666666667, 0.99685, 0.99625],
  'loss': [0.18416011471686264, 0.07894161813653384, 0.05629563206637589, 0.043141744812547886, 0.03463797800432561, 0.03007295670452998, 0.025087888938408045, 0.020477740043793727, 0.018332584290340098, 0.020489056526854378, 0.01904268060001924, 0.014636038859433074, 0.015241379422380274, 0.01488577041779337, 0.015401399811703709, 0.01655236050960512, 0.014017093596740634, 0.015295848346811408, 0.011142703662544658, 0.01661629691273954],
  'val_acc': [0.9579, 0.9754, 0.9733, 0.98, 0.9817, 0.9782, 0.9777, 0.9775, 0.9805, 0.9815, 0.9777, 0.9775, 0.9823, 0.98, 0.9797, 0.9742, 0.982, 0.9829, 0.9804, 0.9795],
  'val_loss': [0.13213747869767248, 0.07768956533807796, 0.08690066614362876, 0.06944087889036164, 0.07673393609686209, 0.08913333164230862, 0.10080510411080691, 0.1028824891035541, 0.08566066905374536, 0.10436492597102856, 0.1140227648591278, 0.12387610763142443, 0.1149812871933566, 0.12350996273625792, 0.12803592754282145, 0.1625020817862077, 0.10353397753429631, 0.12051795932296623, 0.13241675687355067, 0.15128104574999485]},
 'model': <keras.engine.sequential.Sequential at 0x7f83cbd42d30>,
 'params': {'batch_size': 32, 'do_validation': True, 'epochs': 20, 'metrics': ['loss', 'acc', 'val_loss', 'val_acc'], 'samples': 60000, 'steps': None, 'verbose': 1},
 'validation_data': [array([[0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.],
         ...,
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.]], dtype=float32),
  array([[0., 0., 0., ..., 1., 0., 0.],
         [0., 0., 1., ..., 0., 0., 0.],
         [0., 1., 0., ..., 0., 0., 0.],
         ...,
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.]], dtype=float32),
  array([1., 1., 1., ..., 1., 1., 1.], dtype=float32)]}
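#A small sketch to pull the best validation accuracy out of the history dict
#rather than eyeballing the log above
best_epoch = int(np.argmax(history.history['val_acc']))
print("Best val_acc {:.4f} at epoch {}".format(history.history['val_acc'][best_epoch], best_epoch + 1))
#Best val_acc 0.9829 at epoch 18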
#Plot the accuracy
plt.plot(history.history['acc'],label='train')
plt.xlabel('Epoch Number')
plt.ylabel('Accuracy') #accuracy is reported as a fraction between 0 and 1
plt.title('Model Accuracy Over Epoch')
plt.legend()
<matplotlib.legend.Legend at 0x7f83c95288d0>
#Plot the accuracy of training data and validation data
plt.plot(history.history['acc'],label='train')
plt.plot(history.history['val_acc'],label='val')
plt.xlabel('Epoch Number')
plt.ylabel('Accuracy')
plt.title('Model Accuracy Over Epoch')
plt.legend()
<matplotlib.legend.Legend at 0x7f83e1eaf320>
#Plot the accuracy of training data and validation data AND loss
plt.plot(history.history['acc'],label='train')
plt.plot(history.history['val_acc'],label='val')
plt.plot(history.history['loss'],label='loss')
plt.xlabel('Epoch Number')
plt.ylabel('Accuracy / Loss') #accuracy and loss share this axis
plt.title('Model Accuracy Over Epoch')
plt.legend()
# plt.yscale('log')
<matplotlib.legend.Legend at 0x7f83c94ccac8>
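#Since loss and accuracy live on different scales, a cleaner sketch is to put
#the loss on a second y-axis (one option among several, not the only way)
fig, ax1 = plt.subplots()
ax1.plot(history.history['acc'], label='train acc')
ax1.plot(history.history['val_acc'], label='val acc')
ax1.set_xlabel('Epoch Number')
ax1.set_ylabel('Accuracy')
ax2 = ax1.twinx() #second y-axis for the loss
ax2.plot(history.history['loss'], linestyle='--', label='train loss')
ax2.set_ylabel('Loss')
ax1.legend(loc='center right')
ax2.legend(loc='upper right')
plt.title('Accuracy and Loss Over Epochs')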
score=model.evaluate(X_test, y_test)
10000/10000 [==============================] - 1s 73us/step
#We get score as a list
#The first item is the loss; the second gives us the accuracy of our model
score
[0.15128104574999485, 0.9795]
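#As a final sketch, predict a single test image and compare with its label
#(y_test[0] was shown above to be the one-hot encoding of the digit 7)
pred = model.predict(X_test[:1])
print("Predicted digit: {}".format(np.argmax(pred)))
print("True digit: {}".format(np.argmax(y_test[0])))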