In these examples, we show how to create a small neural network. We have an input X and an output y, and we want to train a small network to map input to output. In this example, a row's output is 1 when its first and third columns are both 1 - that is the pattern the network will effectively learn.
More details on these examples can be found on the iamtrask blog: https://iamtrask.github.io/2015/07/12/basic-python-network/
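Before training anything, it can help to spell the target rule out directly. The short sketch below is an addition to the original write-up: it simply walks the four input rows and prints 1 when the first and third columns are both 1, reproducing the labels used as y in the code that follows.
# Plain-Python check of the target rule described above (illustration only,
# not part of the original example).
rows = [[0,0,1],
        [0,1,1],
        [1,0,1],
        [1,1,1]]
for row in rows:
    # output is 1 exactly when the first and third entries are both 1
    print(row, "->", int(row[0] == 1 and row[2] == 1))
# prints 0, 0, 1, 1 - the same labels as y below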
import numpy as np
# input dataset
X = np.array([ [0,0,1],
               [0,1,1],
               [1,0,1],
               [1,1,1] ])
# output dataset
y = np.array([[0,0,1,1]]).T
import numpy as np
# sigmoid function
def nonlin(x,deriv=False):
    # when deriv=True, x is assumed to already be a sigmoid output,
    # so x*(1-x) is the slope of the sigmoid at that point
    if(deriv==True):
        return x*(1-x)
    return 1/(1+np.exp(-x))
# seed random numbers to make calculation
# deterministic (just a good practice)
np.random.seed(1)
# initialize weights randomly with mean 0
syn0 = 2*np.random.random((3,1)) - 1
for iter in range(10000):

    # forward propagation
    l0 = X
    l1 = nonlin(np.dot(l0,syn0))

    # how much did we miss?
    l1_error = y - l1

    # multiply how much we missed by the
    # slope of the sigmoid at the values in l1
    l1_delta = l1_error * nonlin(l1,True)

    # update weights
    syn0 += np.dot(l0.T,l1_delta)
print ("Output After Training:")
print (l1)
Output After Training:
[[0.00966449]
 [0.00786506]
 [0.99358898]
 [0.99211957]]
We want to learn the set of weights in syn0 such that the error between l1 (the prediction) and y (the known values) is minimised. The output matrix after training shows that we achieve very good results: the values around 0.99 are effectively 1 and the others are effectively 0, which matches our original output labels.
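As a quick sanity check (this snippet is an addition, not part of the original post), we can reuse the trained syn0 and the nonlin function to score a single input row: for an input whose first and third columns are 1, the network should output a value close to 1.
# Minimal sketch, assuming syn0 and nonlin from the training loop above are in scope.
new_example = np.array([1,0,1])               # first and third columns are 1
prediction = nonlin(np.dot(new_example,syn0))
print(prediction)                             # a value close to 1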
import numpy as np
def nonlin(x,deriv=False):
    if(deriv==True):
        return x*(1-x)
    return 1/(1+np.exp(-x))
np.random.seed(1)
# randomly initialize our weights with mean 0
syn0 = 2*np.random.random((3,4)) - 1
syn1 = 2*np.random.random((4,1)) - 1
for j in range(60000):

    # Feed forward through layers 0, 1, and 2
    l0 = X
    l1 = nonlin(np.dot(l0,syn0))
    l2 = nonlin(np.dot(l1,syn1))

    # how much did we miss the target value?
    l2_error = y - l2

    if (j % 10000) == 0:
        print ("Error:" + str(np.mean(np.abs(l2_error))))

    # in what direction is the target value?
    # were we really sure? if so, don't change too much.
    l2_delta = l2_error*nonlin(l2,deriv=True)

    # how much did each l1 value contribute to the l2 error (according to the weights)?
    l1_error = l2_delta.dot(syn1.T)

    # in what direction is the target l1?
    # were we really sure? if so, don't change too much.
    l1_delta = l1_error * nonlin(l1,deriv=True)

    syn1 += l1.T.dot(l2_delta)
    syn0 += l0.T.dot(l1_delta)
Error:0.4685343254580603
Error:0.005002426725395313
Error:0.00345440546153305
Error:0.002786557019672355
Error:0.0023941155055209216
Error:0.0021288852682254146
In this next example, we have extended the network to include a hidden layer, so we now have two random weight matrices, syn0 and syn1 (this example reuses the X and y arrays defined earlier). Just as before, we update the weights based on the error at the output, but we then have to step backwards and update the first weight matrix using the error propagated back through the second matrix - working backwards with the error in this way is known as backpropagation.
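As a rough sketch (again an addition, not part of the original code), we can run one more forward pass with the trained syn0 and syn1 to confirm that the two-layer network reproduces the labels in y.
# Minimal sketch, assuming X, y, nonlin, syn0 and syn1 from the code above are in scope.
hidden = nonlin(np.dot(X,syn0))       # layer 1 activations
output = nonlin(np.dot(hidden,syn1))  # layer 2 predictions
print(output)                         # values close to [[0], [0], [1], [1]], i.e. close to y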