
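import numpy as np

# Node is assumed to be defined elsewhere; it is not shown in this cell.
class Sigmoid(Node):
    def __init__(self, x):
        Node.__init__(self, [x])

    def _sigmoid(self, x):
        # Logistic function, applied element-wise.
        return 1.0 / (1.0 + np.exp(-x))

    def forward(self):
        self.value = self._sigmoid(self.inbound_nodes[0].value)

    def backward(self):
        # The derivative of sigmoid(x) is (1 - sigmoid(x)) * sigmoid(x),
        # and self.value already holds sigmoid(x) from the forward pass.
        self.gradients = {n: np.zeros_like(n.value) for n in self.inbound_nodes}
        # Accumulate the gradient flowing back from every outbound node.
        for n in self.outbound_nodes:
            grad = n.gradients[self]
            self.gradients[self.inbound_nodes[0]] += (1 - self.value) * self.value * grad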

Forward Pass

The sigmoid function is

$$ f(x) = \frac {1} {1 + e^{-x}} $$

The _sigmoid method implements this function, and forward applies it to the value of the inbound node.
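As a quick sanity check of the formula, here is a minimal standalone sketch. The free function sigmoid and the sample inputs are illustrative assumptions that mirror the _sigmoid method rather than reuse the notebook's classes.

```python
import numpy as np

def sigmoid(x):
    """Element-wise logistic function, mirroring Sigmoid._sigmoid (standalone stand-in)."""
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative inputs, chosen only for this example.
x = np.array([-2.0, 0.0, 2.0])
y = sigmoid(x)

# The output is exactly 0.5 at x = 0 and lies strictly between 0 and 1 elsewhere.
print(y)

# Symmetry of the logistic function: sigmoid(-x) equals 1 - sigmoid(x).
print(np.allclose(sigmoid(-x), 1.0 - y))
```

Because np.exp works element-wise, the same function handles scalars and arrays, which is why forward can pass the inbound node's value straight through.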

Backward Pass

Differentiating f with the chain rule gives

$$ \frac {\partial f} {\partial x} = (-1) \frac {1} {(1+e^{-x})^2} (-1) e^{-x} = \frac {e^{-x}} {(1+e^{-x})^2} $$

You'll notice that the code uses (1 - self.value) * self.value as the final derivative, which looks nothing like the expression above. The two are equal. To see it, split the fraction into a product of two factors, then add and subtract 1 in the numerator of the second factor, which is the same as adding 0. With this small trick the expression simplifies further:

$$ \frac {e^{-x}} {(1+e^{-x})^2} = \frac {1} {1+e^{-x}} \frac {e^{-x}} {1+e^{-x}} = \frac {1} {1+e^{-x}} \frac {e^{-x} + 1 - 1} {1+e^{-x}} $$

$$ \frac {e^{-x} + 1 - 1} {1+e^{-x}} = \frac {1 + e^{-x}} {1 + e^{-x}} - \frac {1} {1 + e^{-x}} = 1 - \frac {1} {1 + e^{-x}} $$

So the final result is

$$ \frac {1} {1 + e^{-x}} (1 - \frac {1} {1 + e^{-x}}) = f(x) (1 - f(x)) $$
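One way to convince yourself of this result is to compare it against a finite-difference estimate. The sketch below is only a check, not part of the node implementation. The helper names sigmoid_grad and numerical_grad, the step size eps, and the sample points are all assumptions made for illustration.

```python
import math

def sigmoid(x):
    """Scalar logistic function f(x) = 1 / (1 + e^{-x})."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Closed-form derivative f(x) * (1 - f(x)) from the derivation above."""
    s = sigmoid(x)
    return s * (1.0 - s)

def numerical_grad(f, x, eps=1e-6):
    """Central-difference approximation of f'(x); eps is an illustrative choice."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)

# Illustrative sample points on both sides of zero.
points = [-3.0, -0.5, 0.0, 1.0, 4.0]

# Largest gap between the closed form and the numerical estimate.
max_error = max(abs(sigmoid_grad(x) - numerical_grad(sigmoid, x)) for x in points)
print(max_error)
```

The gap should be tiny compared with the gradient values themselves, which is what you would expect if (1 - self.value) * self.value is the right local derivative to use in backward.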

Sigmoid Solution
