import numpy as np

class Sigmoid(Node):
    def __init__(self, x):
        Node.__init__(self, [x])

    def _sigmoid(self, x):
        return 1.0 / (1.0 + np.exp(-x))

    def forward(self):
        # Cache the output; backward reuses it to compute the derivative.
        self.value = self._sigmoid(self.inbound_nodes[0].value)

    # derivative of sigmoid(x) is (1 - sigmoid(x)) * sigmoid(x)
    def backward(self):
        self.gradients = {n: np.zeros_like(n.value) for n in self.inbound_nodes}
        # Sum the gradient contributions from every node that consumes this output.
        for n in self.outbound_nodes:
            grad = n.gradients[self]
            self.gradients[self.inbound_nodes[0]] += (1 - self.value) * self.value * grad
The sigmoid function is

$$ f(x) = \frac {1} {1 + e^{-x}} $$

We implement that in the _sigmoid function and use it to perform the forward pass.
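To see the forward pass in isolation, here is a minimal sketch. The `Node` and `Input` classes below are simplified stand-ins written for this example (the real base class is defined elsewhere and may differ); only the `Sigmoid` methods mirror the code above.

```python
import numpy as np


class Node:
    """Hypothetical minimal base class: it only tracks graph connections."""

    def __init__(self, inbound_nodes=None):
        self.inbound_nodes = inbound_nodes or []
        self.outbound_nodes = []
        self.value = None
        self.gradients = {}
        # Register this node as a consumer of each of its inputs.
        for n in self.inbound_nodes:
            n.outbound_nodes.append(self)


class Input(Node):
    """Hypothetical leaf node whose value is supplied directly."""

    def __init__(self, value):
        Node.__init__(self)
        self.value = np.asarray(value, dtype=float)

    def forward(self):
        pass  # Nothing to compute; the value was set at construction.


class Sigmoid(Node):
    def __init__(self, x):
        Node.__init__(self, [x])

    def _sigmoid(self, x):
        return 1.0 / (1.0 + np.exp(-x))

    def forward(self):
        self.value = self._sigmoid(self.inbound_nodes[0].value)


x = Input([-2.0, 0.0, 2.0])
s = Sigmoid(x)
s.forward()
print(s.value)  # three values in (0, 1); the middle one is exactly 0.5
```

The output is applied elementwise, so the node works unchanged on scalars, vectors, or batches stored as NumPy arrays.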
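You'll notice that the code uses (1 - self.value) * self.value as the derivative. This is not obvious at first glance. Differentiating f with the chain rule gives

$$ f'(x) = \frac {e^{-x}} {(1 + e^{-x})^2} $$

We then add and subtract 1 in the numerator, which is the same as adding 0. With this small trick the expression splits into two terms and simplifies further:

$$ f'(x) = \frac {1 + e^{-x} - 1} {(1 + e^{-x})^2} = \frac {1} {1 + e^{-x}} - \frac {1} {(1 + e^{-x})^2} $$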
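So the final result is

$$ f'(x) = \frac {1} {1 + e^{-x}} \left(1 - \frac {1} {1 + e^{-x}}\right) = f(x) (1 - f(x)) $$

Because forward already stores f(x) in self.value, backward can compute the derivative without evaluating the exponential again.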
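A quick way to convince yourself of this identity is to compare it against a numerical derivative. The sketch below is a standalone check, not part of the node classes; the helper names `sigmoid_grad` and `numeric_grad` and the step size `eps` are choices made for this example.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def sigmoid_grad(x):
    # Analytic derivative: f(x) * (1 - f(x)).
    s = sigmoid(x)
    return s * (1.0 - s)


def numeric_grad(f, x, eps=1e-6):
    # Central difference approximation of f'(x).
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


xs = np.linspace(-5.0, 5.0, 11)
max_err = np.max(np.abs(sigmoid_grad(xs) - numeric_grad(sigmoid, xs)))
print(max_err)  # should be tiny, limited by floating-point round-off
```

The derivative peaks at x = 0, where f(0) = 0.5 and so f'(0) = 0.25, and it shrinks toward zero as |x| grows.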