Can a robot do something it wasn't trained to do? Conventional wisdom holds that neural networks interpolate well between their training instances but cannot extrapolate; that is, they cannot perform appropriately when placed in a region outside their trained experience. This notebook explores that notion.
In this experiment, we create a robot with a single forward-facing sonar sensor. The robot starts at one end of a long hallway and approaches a target location halfway down it.
from conx import *
from jyro.simulator import *
import math
import time
Using Theano backend. conx, version 3.5.6
We make an 8m x 2m room, with the robot at the far right end facing to the left. The sonar sensor faces forward and has a max range of 6m. We give the robot a camera just so that we can see the simulated world from the robot's perspective.
def make_world(physics):
    physics.addBox(0, 0, 8, 2, fill="backgroundgreen", wallcolor="gray")
MAX_SENSOR_DISTANCE = 6 # meters
def make_robot():
    robot = Pioneer("Pioneer", 7.5, 1, math.pi/2)  # parameters are x, y, heading (in radians)
    robot.addDevice(PioneerFrontSonar(MAX_SENSOR_DISTANCE))
    robot.addDevice(Camera())
    return robot
robot = make_robot()
robot
We write a function that provides a teacher to drive the robot. The closer it gets to the target location, the slower the robot moves. The function returns a list of (sensor, power) pairs where the sensor is a scaled distance reading of the sensor, and power is how fast the robot should move given the sensor reading.
def collect_data(simulator):
    data = []
    simulator.reset()  # put the robot back where it is defined
    while True:
        scaled_dist = simulator.robot["sonar"].getData()[0] / MAX_SENSOR_DISTANCE
        # The power is a linear function of the scaled distance:
        power = 1.0 - ((1 - scaled_dist) / 0.33 * 0.9)
        simulator.robot.move(power, 0)
        data.append([scaled_dist, [power]])
        simulator.step()
        time.sleep(.1)  # don't overwhelm the visualization
        if power < 0.05:
            break
    return data
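To see the teacher's speed profile on its own, here is a stand-alone sketch of the power formula from `collect_data` (the sample distances below are illustrative, chosen to span the sonar readings the robot actually sees):

```python
def teacher_power(scaled_dist):
    """Map a scaled sonar reading in [0, 1] to a drive power, as in collect_data."""
    return 1.0 - ((1 - scaled_dist) / 0.33 * 0.9)

for d in [1.0, 0.9, 0.8, 0.7, 0.65]:
    print(round(d, 2), round(teacher_power(d), 3))
```

At a full-range reading (1.0) the robot drives at full power; by roughly 0.65 the power falls below the 0.05 stopping threshold, which matches the target ranges reported by the dataset summary below.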
We create a visual simulation in order to watch what the robot does.
sim = VSimulator(robot, make_world, size=(700, 180))
Now, we collect the data:
data = collect_data(sim)
The simulator ran for 8.10 seconds; since it collects one (sensor, power) pair every 1/10 of a second, there should be 81 pairs:
len(data)
81
We visualize the collected data:
train = ["Training Data", [pair[1][0] for pair in data]]
plot(train,
title="Speed as a Function of Distance",
xlabel="distance from target",
ylabel="speed",
xs=[pair[0] for pair in data], default_symbol="o")
Now, we will use the data to train a neural network. The network has a 1-unit input layer (for the scaled distance reading), a hidden layer, and a 1-unit output layer that produces the power value.
net = Network("Go To Target")
net.add(Layer("input", 1))
net.add(Layer("hidden", 10, activation="sigmoid"))
net.add(Layer("output", 1, activation="linear"))
net.connect()
net.compile(loss="mse", optimizer=SGD(lr=.1, momentum=.5))
net.model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input (InputLayer)           (None, 1)                 0
_________________________________________________________________
hidden (Dense)               (None, 10)                20
_________________________________________________________________
output (Dense)               (None, 1)                 11
=================================================================
Total params: 31
Trainable params: 31
Non-trainable params: 0
_________________________________________________________________
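The parameter counts in the summary follow directly from the layer sizes: a dense layer with n inputs and m units has n*m weights plus m biases. A quick check for the 1-10-1 architecture above:

```python
def dense_params(n_in, n_units):
    """Weights plus biases for a fully connected (Dense) layer."""
    return n_in * n_units + n_units

hidden = dense_params(1, 10)   # 1*10 weights + 10 biases = 20
output = dense_params(10, 1)   # 10*1 weights + 1 bias  = 11
print(hidden, output, hidden + output)  # 20 11 31
```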
We load the collected data into the network:
net.dataset.load(data)
net.dataset.summary()
Input Summary:
   count : 81 (81 for training, 0 for testing)
   shape : [()]
   range : (0.6511592, 1.0)
Target Summary:
   count : 81 (81 for training, 0 for testing)
   shape : [(1,)]
   range : (0.048616093, 1.0)
net.dashboard()
And now, we train the network on the given data:
if net.saved():
    net.load()
    net.plot_results()
else:
    net.train(400, accuracy=1.0, tolerance=0.05, batch_size=1, save=True, plot=True)
This is a fairly straightforward problem for the network to solve. If it doesn't get 100% accuracy, you may wish to:
net.delete()
net.reset()
and try training again by running the previous cell.
test = ["Network", [net.propagate(pair[0])[0] for pair in data]]
plot([train, test],
title="Speed as a Function of Distance",
xlabel="distance from target",
ylabel="speed",
default_symbol="o",
xs=[pair[0] for pair in data])
def net_brain(robot):
    scaled_distance = robot["sonar"].getData()[0] / MAX_SENSOR_DISTANCE
    output = net.propagate([scaled_distance])[0]
    robot.move(output, 0)
    outputs.append([scaled_distance, output])
    history.append(robot.getPose())
robot.brain = net_brain
outputs = []
history = []
sim.reset()
sim.display()
Before continuing, run the experiment above:
trained_range = ["Network interpolation", outputs]
scatter(trained_range,
title="Network Generalization",
xlabel="input", ylabel="output", default_symbol="o")
len(history)
89
Let's make a little movie of the trained experience:
def replay_history(index):
    pose = history[index]
    robot.setPose(*pose)
    sim.update()
    return sim.canvas.render(format="pil")
movie(replay_history, "generalize-in-range.gif", (0, len(history)))
Now, the big question: what does the robot do in positions where it wasn't trained? How does it extrapolate?
To test this, we put the robot in a novel location and re-run the experiment. (You can skip the next cell and instead manually put the robot in any position you wish.)
robot.setPose(.5, 1)
sim.update_gui()
First, we reset the variables:
outputs = []
history = []
sim.display()
Take a look at the simulation above. What will the robot do? What is a reasonable action to take?
Let's find out what the robot does.
Before continuing, run the experiment above:
movie(replay_history, "generalize-out-range.gif", (0, len(history)))
untrained_range = ["Network extrapolation", outputs]
scatter([trained_range, untrained_range],
title="Network Generalization",
xlabel="input", ylabel="output", default_symbol="o")
The robot was tested in a region for which there was no training data, so it had to extrapolate from what it knew to a novel scenario. And it did: although it had never been trained to move backwards, it determined that the correct action was to move backwards. In fact, the farther the robot is from the target location, the faster it moves toward it.
How was the robot able to extrapolate beyond its training? Claiming that the network "learned to move backwards" when it was never trained to do so somewhat overstates what the network is doing. Because the output is a single "power" unit, and the robot can be driven by a single value (negative meaning back up, positive meaning move forward), the network makes no real distinction between forward and backward: it merely converts a distance into a value.
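This point can be illustrated with a toy model. The sketch below uses hypothetical, hand-tuned weights (not the trained conx network) and a single sigmoid hidden unit feeding a linear output, mimicking the 1-10-1 architecture above. Because the output unit is linear, and therefore unbounded, an input below the training range naturally maps to a negative power:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hand-tuned weights chosen so that, over the training range of scaled
# distances (roughly 0.65 to 1.0), the toy net approximates the teacher.
def toy_net(scaled_dist, w1=0.5, b1=0.0, w2=21.8, b2=-12.63):
    hidden = sigmoid(w1 * scaled_dist + b1)  # bounded in (0, 1)
    return w2 * hidden + b2                  # linear output: unbounded

print(toy_net(1.0))   # in training range: roughly full speed forward
print(toy_net(0.65))  # near the target: roughly zero
print(toy_net(0.3))   # outside training range: negative, i.e. backward
```

Nothing about the output unit privileges positive values, so extending the learned distance-to-power mapping past its smallest training input slides smoothly into negative powers.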
However, the network has done what has been claimed to be impossible, albeit in a simple fashion. This suggests that extrapolation to unknown regions may be possible, if the space of actions is arranged in a proper way.