This notebook explores the two-spirals classification task. It is a good example of a problem that is difficult for both humans and neural networks.
import conx as cx
import math
Using TensorFlow backend. ConX, version 3.7.4
This task involves separating two categories, A and B, where the two sets spiral around each other.
First, let's make the dataset:
def spiral_xy(i, spiral_num):
    """
    Create the data for a spiral.

    Arguments:
        i runs from 0 to 96
        spiral_num is 1 or -1
    """
    φ = i/16 * math.pi
    r = 6.5 * ((104 - i)/104)
    x = (r * math.cos(φ) * spiral_num)/13 + 0.5
    y = (r * math.sin(φ) * spiral_num)/13 + 0.5
    return (x, y)

def spiral(spiral_num):
    return [spiral_xy(i, spiral_num) for i in range(97)]
a = ["A", spiral(1)]
b = ["B", spiral(-1)]
cx.scatter([a,b])
So, there it is: given the (x, y) coordinates of a point, can you determine whether it belongs to category A or B? This is fairly easy given the picture, but very difficult given only the coordinates.
Nonetheless, this was an early challenge problem for neural networks, and much research was done in order to learn the task.
Many things were tried, with various levels of success!
For an overview of the task and its solutions, see, for example:
https://www.researchgate.net/publication/220233514_Variations_of_the_two-spiral_task
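Before training anything, a quick sanity check on the dataset defined earlier can be useful. The following is a minimal sketch that re-declares `spiral_xy` and `spiral` so it runs on its own; the helper name `check_spirals` is hypothetical and not part of the notebook. It checks that each spiral has 97 points, that every coordinate lies in [0, 1], and that spiral B is spiral A reflected through the center (0.5, 0.5).

```python
import math

def spiral_xy(i, spiral_num):
    """Same formula as in the notebook: i in 0..96, spiral_num is 1 or -1."""
    phi = i / 16 * math.pi
    r = 6.5 * ((104 - i) / 104)
    x = (r * math.cos(phi) * spiral_num) / 13 + 0.5
    y = (r * math.sin(phi) * spiral_num) / 13 + 0.5
    return (x, y)

def spiral(spiral_num):
    return [spiral_xy(i, spiral_num) for i in range(97)]

def check_spirals():
    """Return (len A, len B, all coords in [0, 1], B mirrors A through the center)."""
    a, b = spiral(1), spiral(-1)
    in_unit_square = all(0.0 <= v <= 1.0 for p in a + b for v in p)
    mirrored = all(
        math.isclose(ax + bx, 1.0) and math.isclose(ay + by, 1.0)
        for (ax, ay), (bx, by) in zip(a, b)
    )
    return len(a), len(b), in_unit_square, mirrored

print(check_spirals())
```

The mirror symmetry is part of what makes the task hard from coordinates alone: every point in A has a counterpart in B at the same distance from the center.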
Here is an attempt using so-called "shortcut" connections:
net = cx.Network("Two-Spirals")
net.add(
    cx.Layer("input", 2),
    cx.Layer("hidden1", 5, activation="sigmoid"),
    cx.Layer("hidden2", 5, activation="sigmoid"),
    cx.Layer("hidden3", 5, activation="sigmoid"),
    cx.Layer("output", 2, activation="softmax")
)
net.connect("input", "hidden1")
net.connect("input", "hidden2")
net.connect("input", "hidden3")
net.connect("input", "output")
net.connect("hidden1", "hidden2")
net.connect("hidden1", "hidden3")
net.connect("hidden1", "output")
net.connect("hidden2", "hidden3")
net.connect("hidden2", "output")
net.connect("hidden3", "output")
net.build_model()
net.summary()
________________________________________________________________________
Layer (type)                 Output Shape   Param #   Connected to
========================================================================
input (InputLayer)           (None, 2)      0
________________________________________________________________________
hidden1 (Dense)              (None, 5)      15        input[0][0]
________________________________________________________________________
concatenate_1 (Concatenate)  (None, 7)      0         input[0][0]
                                                      hidden1[0][0]
________________________________________________________________________
hidden2 (Dense)              (None, 5)      40        concatenate_1[0][0]
________________________________________________________________________
concatenate_2 (Concatenate)  (None, 12)     0         input[0][0]
                                                      hidden1[0][0]
                                                      hidden2[0][0]
________________________________________________________________________
hidden3 (Dense)              (None, 5)      65        concatenate_2[0][0]
________________________________________________________________________
concatenate_3 (Concatenate)  (None, 17)     0         input[0][0]
                                                      hidden1[0][0]
                                                      hidden2[0][0]
                                                      hidden3[0][0]
________________________________________________________________________
output (Dense)               (None, 2)      36        concatenate_3[0][0]
========================================================================
Total params: 156
Trainable params: 156
Non-trainable params: 0
________________________________________________________________________
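The parameter counts in the summary follow directly from the shortcut wiring: each layer receives the concatenation of the input and all earlier hidden layers, so its weight matrix has (sum of upstream sizes) x (layer size) entries, plus one bias per unit. A minimal sketch of that arithmetic, using an illustrative function name `shortcut_param_counts` that is not part of ConX:

```python
def shortcut_param_counts(input_size, hidden_sizes, output_size):
    """Count Dense-layer parameters when every layer sees all earlier layers."""
    counts = {}
    fan_in = input_size  # width of the concatenated upstream activations
    names = [f"hidden{k + 1}" for k in range(len(hidden_sizes))] + ["output"]
    for name, size in zip(names, list(hidden_sizes) + [output_size]):
        counts[name] = fan_in * size + size  # weights + biases
        fan_in += size  # this layer's output joins the next concatenation
    return counts

counts = shortcut_param_counts(2, [5, 5, 5], 2)
print(counts)
print(sum(counts.values()))
```

For the 2-5-5-5-2 network this reproduces the 15, 40, 65, and 36 parameters reported for the Dense layers, and the total of 156.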
net.dashboard()
net.dataset.load([(xy, [1, 0]) for xy in spiral(1)] +
                 [(xy, [0, 1]) for xy in spiral(-1)])
def schedule(start, end, num_steps):
    step = (end - start) / (num_steps - 1)
    current = start
    values = []
    for i in range(num_steps):
        values.append(current)
        current += step
    return values
schedule(0.001, 0.002, 10)
[0.001, 0.0011111111111111111, 0.0012222222222222222, 0.0013333333333333333, 0.0014444444444444444, 0.0015555555555555555, 0.0016666666666666666, 0.0017777777777777776, 0.0018888888888888887, 0.002]
schedule(0.5, 0.95, 10)
[0.5, 0.55, 0.6000000000000001, 0.6500000000000001, 0.7000000000000002, 0.7500000000000002, 0.8000000000000003, 0.8500000000000003, 0.9000000000000004, 0.9500000000000004]
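The second result above ends at 0.9500000000000004 rather than 0.95 because `schedule` adds `step` repeatedly, and the rounding error compounds. One possible alternative, sketched here under the hypothetical name `schedule_interp`, interpolates each value directly from the endpoints so that the first and last values are exactly `start` and `end`:

```python
def schedule_interp(start, end, num_steps):
    """Linearly spaced values from start to end, without accumulating error."""
    if num_steps < 2:
        return [start]  # avoid dividing by zero for a single step
    values = []
    for i in range(num_steps):
        t = i / (num_steps - 1)  # t runs from 0.0 to 1.0
        values.append(start * (1 - t) + end * t)
    return values

momenta = schedule_interp(0.5, 0.95, 10)
print(momenta[0], momenta[-1])
```

For training purposes the drift in the original is harmless; the sketch only matters if the endpoint has to be hit exactly.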
net.dataset.split(0)
net.reset()
for lr, m in zip(schedule(0.001, 0.002, 10),
                 schedule(0.5, 0.95, 10)):
    net.compile(error="categorical_crossentropy", optimizer='sgd', lr=lr, momentum=m)
    net.train(100, report_rate=10, batch_size=16, accuracy=1.0, tolerance=0.4, verbose=0)
net.train(20000, report_rate=10, batch_size=16, accuracy=1.0, tolerance=0.4)
Interrupted! Cleaning up...
========================================================
       |  Training |  Training |  Validate |  Validate
Epochs |     Error |  Accuracy |     Error |  Accuracy
------ | --------- | --------- | --------- | ---------
# 1510 |   0.69089 |   0.50515 |   0.68765 |   0.56186
---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
<ipython-input-81-c04797f683f1> in <module>()
----> 1 net.train(20000, report_rate=10, batch_size=16, accuracy=1.0, tolerance=0.4)

~/.local/lib/python3.6/site-packages/conx/network.py in train(self, epochs, accuracy, error, batch_size, report_rate, verbose, kverbose, shuffle, tolerance, class_weight, sample_weight, use_validation_to_stop, plot, record, callbacks, save)
   1468             print("Saved!")
   1469         if interrupted:
-> 1470             raise KeyboardInterrupt
   1471         if verbose == 0:
   1472             return (self.epoch_count, self.history[-1])

KeyboardInterrupt:
net.plot_activation_map()
However, I could never get the network to learn the task. Perhaps you can find some parameters that will work.
Or, perhaps we can just make this much easier for the neural network.
In this formulation, we create an "image" for each input and use a convolutional layer.
import conx as cx
import copy
We need to pick a resolution for the images. We chop up the input space into a 50 x 50 image.
RESOLUTION = 50
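def make_picture(res):
    # Start with an all-zero background.
    matrix = [[0.0 for i in range(res)]
              for j in range(res)]
    # Mark every point of both spirals with 0.5.
    # Negative row indices count from the bottom of the list, so
    # 1 - y draws larger y values nearer the top of the image.
    for x,y in spiral(1):
        x = min(int(round(x * res)), res - 1)
        y = min(int(round(y * res)), res - 1)
        matrix[1 - y][x] = 0.5
    for x,y in spiral(-1):
        x = min(int(round(x * res)), res - 1)
        y = min(int(round(y * res)), res - 1)
        matrix[1 - y][x] = 0.5
    return matrix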
matrix = make_picture(RESOLUTION)
cx.array_to_image(matrix, shape=(RESOLUTION,RESOLUTION,1)).resize((400,400))
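The pixel mapping in `make_picture` can be looked at in isolation. The sketch below uses a hypothetical helper `to_pixel` (not in the notebook) that applies the same rounding and clamping, then shows how the row index `1 - y` behaves: for `y >= 2` it is negative, so Python counts from the end of the list and larger `y` values end up nearer the top of the picture.

```python
def to_pixel(v, res):
    """Map a coordinate in [0, 1] to a pixel index in [0, res - 1]."""
    return min(int(round(v * res)), res - 1)

res = 50
rows = list(range(res))  # stand-in for the rows of the matrix

# Coordinates 0.0, 0.5 and 1.0 map to the first, middle and last pixel.
print(to_pixel(0.0, res), to_pixel(0.5, res), to_pixel(1.0, res))

# Which row does matrix[1 - y] actually write to?
for y in (2, 25, 48):
    print(y, "->", rows[1 - y])
```

Note that the clamp with `res - 1` is what keeps the coordinate 1.0 from indexing one past the last pixel.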
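In this example, each pixel takes one of three values: 0.0 for the background, 0.5 for a point on either spiral, and 1.0 for the point to be classified.
We might be able to leave out the other data (the 0.5 points), but that seemed to make the task more difficult. We want to let the network "see" the pattern.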
def make_data(res):
    # Each example is a copy of the background picture (the global matrix)
    # with the point to classify set to 1.0.
    data = []
    for x,y in spiral(1):
        x = min(int(round(x * res)), res - 1)
        y = min(int(round(y * res)), res - 1)
        inputs = copy.deepcopy(matrix)
        inputs[1 - y][x] = 1.0
        inputs = cx.reshape(inputs,(res,res,1))
        data.append([inputs, [0, 1]])
    for x,y in spiral(-1):
        x = min(int(round(x * res)), res - 1)
        y = min(int(round(y * res)), res - 1)
        inputs = copy.deepcopy(matrix)
        inputs[1 - y][x] = 1.0
        inputs = cx.reshape(inputs,(res,res,1))
        data.append([inputs, [1, 0]])
    return data
data = make_data(RESOLUTION)
We create the simplest form of a Conv2DLayer network:
net = cx.Network("Two-Spirals using Pictures")
net.add(
    cx.ImageLayer("input", (RESOLUTION, RESOLUTION), 1),
    cx.Conv2DLayer("conv2d", 2, 4),
    cx.FlattenLayer("flatten"),
    cx.Layer("output", 2, activation="softmax")
)
net.connect()
net.compile(error="categorical_crossentropy", optimizer="rmsprop")
net.dataset.load(data)
net.dashboard()
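To get a feel for the size of this network, the layer shapes can be worked out by hand. The sketch below assumes that `cx.Conv2DLayer("conv2d", 2, 4)` means 2 filters with a 4 x 4 kernel, and that the usual Keras defaults apply (stride 1, "valid" padding, biases enabled). These are assumptions about the defaults, so compare against `net.summary()` before relying on the numbers.

```python
def conv_valid_output(size, kernel, stride=1):
    """Output width of a convolution with 'valid' padding (assumed default)."""
    return (size - kernel) // stride + 1

resolution, channels = 50, 1
filters, kernel = 2, 4  # assumed reading of Conv2DLayer("conv2d", 2, 4)

side = conv_valid_output(resolution, kernel)                  # feature map width
conv_params = kernel * kernel * channels * filters + filters  # weights + biases
flat = side * side * filters                                  # size after flattening
dense_params = flat * 2 + 2                                   # two softmax output units

print(side, conv_params, flat, dense_params)
```

Under these assumptions almost all of the parameters sit in the final dense layer, which is one reason the network may memorize individual pixel positions rather than learn a smooth spiral.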
And try training it:
net.reset()
net.train(1000, accuracy=1.0, report_rate=10)
No training required: accuracy already to desired value
Training dataset status:
       |  Training |  Training
Epochs |     Error |  Accuracy
------ | --------- | ---------
#  386 |   0.24732 |   1.00000
It worked! This makes the task easy.
Let's look at how well the network generalizes by testing it on images it wasn't trained on, covering the whole 50 x 50 space:
def test0(x, y, res=RESOLUTION):
    # Activation of output unit 0 for the picture with point (x, y) highlighted.
    x = min(int(round(x * res)), res - 1)
    y = min(int(round(y * res)), res - 1)
    inputs = copy.deepcopy(matrix)
    inputs[1 - y][x] = 1.0
    inputs = cx.reshape(inputs,(res,res,1))
    return net.propagate(inputs)[0]

def test1(x, y, res=RESOLUTION):
    # Activation of output unit 1 for the same kind of picture.
    x = min(int(round(x * res)), res - 1)
    y = min(int(round(y * res)), res - 1)
    inputs = copy.deepcopy(matrix)
    inputs[1 - y][x] = 1.0
    inputs = cx.reshape(inputs,(res,res,1))
    return net.propagate(inputs)[1]
cx.view([cx.heatmap(test0, format="image"), cx.heatmap(test1, format="image")],
        labels=["output[0]","output[1]"], scale=7.0)
Sometimes it creates fairly smooth spirals. However, other times, it may just "memorize" the problem, with no particular pattern. Which do you get?
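Because the output layer is a two-unit softmax, the two heatmaps carry the same information: the activations always sum to 1, so `output[1]` is `1 - output[0]`. A small self-contained illustration, using a hand-written `softmax` rather than anything from ConX:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)  # subtract the max so exp() cannot overflow
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Whatever the raw scores are, the two probabilities are complementary.
for logits in ([0.0, 0.0], [2.0, -1.0], [-3.0, 4.0]):
    p0, p1 = softmax(logits)
    print(round(p0, 4), round(p1, 4), round(p0 + p1, 4))
```

In practice this means one heatmap is the inverse of the other, so looking at either one is enough to judge how smooth the learned spirals are.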