But what is a Hessian, really?
#collapse
# imports
# using Plots; plotlyjs()
using PlotlyJS
# from IPython.display import HTML
# HTML(fig.to_html()) # where fig = plotly.plot(...)
The Hessian matrix appears in the optimization literature, but the intuition for how the Hessian and its inverse transform vectors is opaque to me. Let's review second-order partial derivatives, and then try to build intuition for the Hessian matrix.
For the purpose of this intuition-building exercise, we'll work with functions $\Reals^2 \to \Reals^1$. I'll also use the partial derivative notations $\frac{\partial}{\partial y} f(x, y) = \frac{\partial f}{\partial y} = f_y$ interchangeably.
Take the $\Reals^2 \to \Reals^1$ function $f(x, y) = x^2 + 2y^2$.
A partial derivative is the rate of change of an "output" variable (in our case, $f$) with respect to infinitesimal changes in one "input" variable (in our case, $x$ or $y$), holding the other inputs fixed. For example, $\frac{\partial}{\partial y} f(x, y) = 4y$, which is to say, at any point in the domain, moving infinitesimally in the $y$ direction changes $f$ at a rate equal to 4 times the $y$ coordinate of the starting point.
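To see where the $4y$ comes from, here is the limit definition applied to this particular $f$, with $x$ held fixed (the step size $h$ is just a symbol I'm introducing for the derivation):

$$
\frac{\partial f}{\partial y}
= \lim_{h \to 0} \frac{f(x, y + h) - f(x, y)}{h}
= \lim_{h \to 0} \frac{\left(x^2 + 2(y + h)^2\right) - \left(x^2 + 2y^2\right)}{h}
= \lim_{h \to 0} \frac{4yh + 2h^2}{h}
= \lim_{h \to 0} \left(4y + 2h\right)
= 4y
$$

The $x^2$ term cancels, which is why $f_y$ does not depend on $x$ at all.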
f(x, y) = x^2 + 2y^2
x = 6
xlim=[-10, x]
ylim=[-10, 10]
xs = LinRange(xlim..., 101)
ys = LinRange(ylim..., 101)
zs = [f(x, y) for x in xs, y in ys]
y = 4
dy = 4
f_y(y) = 4y
f_y (generic function with 1 method)
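As a quick sanity check on the closed form `f_y(y) = 4y`, the sketch below compares it against a central finite difference of `f`. The helper name `fd_partial_y` and the step size `h` are my own choices for illustration and are not part of the cells above.

```julia
# Same function and analytic partial derivative as above.
f(x, y) = x^2 + 2y^2
f_y(y) = 4y

# Hypothetical helper: central finite-difference estimate of ∂f/∂y at (x, y).
fd_partial_y(f, x, y; h=1e-4) = (f(x, y + h) - f(x, y - h)) / (2h)

# Compare the estimate with 4y at a few points along the slice x = 6.
for y0 in (-3.0, 0.0, 4.0)
    approx = fd_partial_y(f, 6.0, y0)
    println("y = ", y0, ": finite difference ≈ ", approx, ", 4y = ", f_y(y0))
end
```

The two columns should agree up to floating-point noise, and changing the `6.0` to any other `x` should not change the result.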
#collapse
# build interactive plot
traces = GenericTrace[]
push!(traces, PlotlyJS.surface(x=xs, y=ys, z=zs,
showscale=false, opacity=0.8))
push!(traces, PlotlyJS.surface(x=[x, x+0.001], y=ylim, z=[[maximum(zs), minimum(zs)] [maximum(zs), minimum(zs)]],
showscale=false, colorscale="Greys", opacity=0.2))
push!(traces, PlotlyJS.scatter3d(x=fill(x, size(ys)), y=ys, z=[x^2 + 2y^2 for y in ys],
showlegend=false, mode="lines", line=attr(color="red", width=2)))
for y in ys[1:5:end]
push!(traces, PlotlyJS.scatter3d(x=fill(x, 2),y=[y-dy, y+dy], z=[f(x,y)-f_y(y)*dy, f(x,y)+f_y(y)*dy],
visible=false, showlegend=false, mode="lines", line=attr(color="orange", width=5)))
end
scene = attr(
xaxis = attr(range=[-10,10]),
yaxis = attr(range=[-10,10]),
zaxis = attr(range=[-50,300]),
aspectratio = attr(x=1, y=1, z=1)
)
layout = Layout(
sliders=[attr(
steps=[
attr(
label=round(y, digits=2),
method="update",
args=[attr(visible=[fill(true, 3); fill(false, i-1); true; fill(false, length(ys[1:5:end])-i)])]
)
for (i, y) in enumerate(ys[1:5:end])
],
active = y,
currentvalue_prefix="x = 6, y = ",
# pad_t=40
)],
scene = scene,
)
p = PlotlyJS.plot(traces, layout)
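Each orange segment in the plot is the tangent-line (first-order) approximation of the red slice at one value of $y$. Here is a minimal sketch of how far that line drifts from the curve, assuming the same `f`, `f_y`, and slice `x = 6` as above. The helper `tangent` and the names `x0`, `y0` are ones I'm introducing here.

```julia
f(x, y) = x^2 + 2y^2
f_y(y) = 4y

# Hypothetical helper: height of the tangent line through (x, y0),
# evaluated a distance h away in the y direction.
tangent(x, y0, h) = f(x, y0) + f_y(y0) * h

x0, y0 = 6.0, 4.0
for h in (4.0, 1.0, 0.1)
    err = f(x0, y0 + h) - tangent(x0, y0, h)
    println("h = ", h, ": f - tangent = ", err)
end
```

For this particular $f$ the gap works out to $2h^2$ no matter where you start, so each segment touches the curve at its midpoint and falls below it toward the ends (the plot draws the segments with `dy = 4`).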
We can plot the function $f_y$ for every starting point:
#collapse
# plot partial derivative of f with respect to y, f_y
traces = GenericTrace[]
push!(traces, PlotlyJS.surface(x=xs, y=ys, z=zs,
showscale=false, opacity=0.8))
push!(traces, PlotlyJS.surface(x=ylim, y=ylim, z=[[0, 0] [0, 0]],
showscale=false, colorscale="Greys", opacity=0.3))
push!(traces, PlotlyJS.surface(x=xs, y=ys, z=[f_y(y) for x in xs, y in ys],
showscale=false))
plot(traces, Layout(scene=scene))