Many of the model classes in GPflow have overlapping behaviour in special cases. In this notebook, we fit some approximations to a model with a Gaussian likelihood, and make sure they're all the same.
The models are:
GPR: Full Gaussian process regression.
VGP: A Gaussian approximation fitted with variational Bayes. Approximating a Gaussian posterior with a Gaussian should be exact.
SVGP: A sparse GP with a Gaussian approximation. The inducing points are set to be at the data points, so again the approximation should be exact.
SVGP (with whitened representation): As above, but with a rotation applied to whiten the representation of the process.
SGPR: A sparse GP with a collapsed posterior (Titsias 2009). Again, the inducing points are fixed to the data points.
GPRFITC: The FITC approximation. Again, the inducing points are fixed to the data points.
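The whitened representation mentioned above can be sketched in plain NumPy. This is an illustration of the idea only, not GPflow's exact implementation: with a Cholesky factorization K = LLᵀ, a standard-normal variable v maps to process values f = Lv with the correct prior covariance, and un-whitening inverts the map exactly.

```python
import numpy as np

# Sketch of the whitened parameterization (illustrative; not GPflow's
# exact implementation). With K = L L^T, parameterizing f = L @ v for
# v ~ N(0, I) gives f the prior N(0, K); optimizing over v is often
# better conditioned than optimizing over f directly.
rng = np.random.RandomState(0)
X = np.sort(rng.rand(5, 1) * 10, axis=0)
K = np.exp(-0.5 * (X - X.T) ** 2) + 1e-8 * np.eye(5)  # RBF Gram matrix + jitter
L = np.linalg.cholesky(K)

v = rng.randn(5, 1)             # whitened variable, v ~ N(0, I)
f = L @ v                       # process values, f ~ N(0, K)
v_back = np.linalg.solve(L, f)  # un-whitening recovers v exactly
```

The rotation changes the parameterization but not the model, which is why the whitened and unwhitened SVGP runs below should agree.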
In all cases the parameters are estimated by the method of maximum likelihood (or approximate maximum likelihood, as appropriate). The parameter estimates should all be the same.
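For the exact GPR model, the maximum-likelihood objective is the log marginal likelihood log N(Y | 0, K + σ²I). A minimal NumPy sketch of that quantity (for illustration; GPflow computes this internally, and the helper names here are our own):

```python
import numpy as np

def rbf(A, B, variance=1.0, lengthscale=1.0):
    """Squared-exponential (RBF) kernel matrix between two sets of points."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

def gp_log_marginal(X, Y, noise_variance):
    """log N(Y | 0, K + noise*I), computed via a Cholesky factorization."""
    n = X.shape[0]
    K = rbf(X, X) + noise_variance * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, Y))  # (K + noise*I)^{-1} Y
    quad = float((Y * alpha).sum())                      # Y^T (K + noise*I)^{-1} Y
    # -0.5 * log det(K + noise*I) = -sum(log diag(L))
    return -0.5 * quad - np.log(np.diag(L)).sum() - 0.5 * n * np.log(2 * np.pi)
```

Maximizing this over the kernel and noise parameters is exactly what fitting GPR does; the approximate models maximize bounds on (or approximations to) the same quantity.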
import gpflow
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
plt.rcParams['figure.figsize'] = (12, 6)
np.random.seed(0)
X = np.random.rand(20,1)*10
Y = np.sin(X) + 0.9 * np.cos(X*1.6) + np.random.randn(*X.shape)* 0.4
Xtest = np.random.rand(10,1)*10
plt.plot(X, Y, 'kx', mew=2);
m1 = gpflow.models.GPR(X, Y, kern=gpflow.kernels.RBF(1))
m2 = gpflow.models.VGP(X, Y, gpflow.kernels.RBF(1), likelihood=gpflow.likelihoods.Gaussian())
m3 = gpflow.models.SVGP(X, Y, gpflow.kernels.RBF(1),
                        likelihood=gpflow.likelihoods.Gaussian(),
                        Z=X.copy(), q_diag=False, whiten=False)  # unwhitened; m4 is the whitened version
m3.feature.set_trainable(False)
m4 = gpflow.models.SVGP(X, Y, gpflow.kernels.RBF(1),
                        likelihood=gpflow.likelihoods.Gaussian(),
                        Z=X.copy(), q_diag=False, whiten=True)
m4.feature.set_trainable(False)
m5 = gpflow.models.SGPR(X, Y, gpflow.kernels.RBF(1), Z=X.copy())
m5.feature.set_trainable(False)
m6 = gpflow.models.GPRFITC(X, Y, gpflow.kernels.RBF(1), Z=X.copy())
m6.feature.set_trainable(False)
models = [m1, m2, m3, m4, m5, m6]
Now, we optimize the models. For GPR, SGPR and GPRFITC, this simply optimizes the hyperparameters (GPR has no inducing points, and for the sparse models they are fixed). For the variational models (VGP and both SVGP variants), this jointly maximises the lower bound on the marginal likelihood (the Evidence Lower Bound, ELBO) with respect to the variational parameters and the kernel and likelihood hyperparameters.
_ = [gpflow.train.ScipyOptimizer(method='BFGS').minimize(m, maxiter=100) for m in models]
INFO:tensorflow:Optimization terminated with: Message: Optimization terminated successfully. Objective function value: 18.834361 Number of iterations: 7 Number of functions evaluations: 12
INFO:tensorflow:Optimization terminated with: Message: Optimization terminated successfully. Objective function value: 18.834361 Number of iterations: 52 Number of functions evaluations: 71
INFO:tensorflow:Optimization terminated with: Message: Optimization terminated successfully. Objective function value: 18.834400 Number of iterations: 52 Number of functions evaluations: 70
INFO:tensorflow:Optimization terminated with: Message: Optimization terminated successfully. Objective function value: 18.834400 Number of iterations: 52 Number of functions evaluations: 70
INFO:tensorflow:Optimization terminated with: Message: Optimization terminated successfully. Objective function value: 18.834400 Number of iterations: 7 Number of functions evaluations: 12
INFO:tensorflow:Optimization terminated with: Message: Optimization terminated successfully. Objective function value: 18.834355 Number of iterations: 7 Number of functions evaluations: 12
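As a sanity check on what the optimizer is doing for the exact model, the same maximum-likelihood fit can be sketched directly with `scipy.optimize`, independently of GPflow. This is an illustration under our own parameterization (optimizing log-transformed parameters so the search is unconstrained); the data-generating code mirrors the cell above.

```python
import numpy as np
from scipy.optimize import minimize

np.random.seed(0)
X = np.random.rand(20, 1) * 10
Y = np.sin(X) + 0.9 * np.cos(X * 1.6) + np.random.randn(*X.shape) * 0.4

def neg_log_marginal(log_params):
    """Negative log marginal likelihood of GPR with an RBF kernel.

    log_params = [log kernel variance, log lengthscale, log noise variance],
    so the optimizer works in an unconstrained space."""
    variance, lengthscale, noise = np.exp(log_params)
    d2 = (X - X.T) ** 2
    K = variance * np.exp(-0.5 * d2 / lengthscale ** 2)
    K += (noise + 1e-8) * np.eye(len(X))           # jitter for numerical safety
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, Y))
    quad = float((Y * alpha).sum())
    return 0.5 * quad + np.log(np.diag(L)).sum() + 0.5 * len(X) * np.log(2 * np.pi)

res = minimize(neg_log_marginal, np.zeros(3), method='BFGS')
```

With this seed the data match the notebook's, so the minimum should land near the objective value of about 18.83 reported above, though the exact optimum reached depends on the optimizer and starting point.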
If everything worked as planned, the models should all have the same fitted parameters, predictions, and objective values.
For the variational models, where an ELBO is used in place of the marginal likelihood, the bound should be tight to the marginal likelihood in the cases studied here.
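For completeness, the tightness claim follows from the standard variational-inference decomposition of the log marginal likelihood (a general identity, not GPflow-specific):

$$\log p(Y) = \mathcal{L}(q) + \mathrm{KL}\big[\,q(f)\,\big\|\,p(f \mid Y)\,\big]$$

Here $\mathcal{L}(q)$ is the ELBO. With a Gaussian likelihood the exact posterior $p(f \mid Y)$ is Gaussian, so an unconstrained Gaussian $q$ can drive the KL term to zero and the bound becomes an equality.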
def plot(m, color, ax):
    xx = np.linspace(-1, 11, 100)[:, None]
    mu, var = m.predict_y(xx)
    ax.plot(xx, mu, color, lw=2)
    ax.fill_between(xx[:, 0],
                    mu[:, 0] - 2 * np.sqrt(var[:, 0]),
                    mu[:, 0] + 2 * np.sqrt(var[:, 0]),
                    color=color, alpha=0.2)
    ax.plot(X, Y, 'kx', mew=2)
    ax.set_xlim(-1, 11)
f, ax = plt.subplots(3, 2, sharex=True, sharey=True, figsize=(12, 9))
plot(m1, 'C0', ax[0,0])
plot(m2, 'C1', ax[1,0])
plot(m3, 'C2', ax[0,1])
plot(m4, 'C3', ax[1,1])
plot(m5, 'C4', ax[2,0])
plot(m6, 'C5', ax[2,1])
Here are the fitted kernel parameters and likelihood noise variances, which should agree across the models:
for m in models:
    print(m.__class__.__name__)
    print("  kernel lengthscale = {:.5g}".format(m.kern.lengthscales.value))
    print("  kernel variance = {:.5g}".format(m.kern.variance.value))
    print("  likelihood variance = {:.5g}".format(m.likelihood.variance.value))
GPR
  kernel lengthscale = 1.0774
  kernel variance = 0.82561
  likelihood variance = 0.16002
VGP
  kernel lengthscale = 1.0774
  kernel variance = 0.82561
  likelihood variance = 0.16002
SVGP
  kernel lengthscale = 1.0774
  kernel variance = 0.82561
  likelihood variance = 0.16002
SVGP
  kernel lengthscale = 1.0774
  kernel variance = 0.82561
  likelihood variance = 0.16002
SGPR
  kernel lengthscale = 1.0774
  kernel variance = 0.82561
  likelihood variance = 0.16002
GPRFITC
  kernel lengthscale = 1.0774
  kernel variance = 0.82561
  likelihood variance = 0.16002
Here are the likelihoods (or ELBOs):
for m in models:
    print("{:30} {}".format(m.__class__.__name__, m.compute_log_likelihood()))
GPR                            -18.834361360056654
VGP                            -18.834361360058967
SVGP                           -18.834399989069947
SVGP                           -18.834399989069947
SGPR                           -18.834399989061836
GPRFITC                        -18.83435478071552
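The agreement of these objectives can also be checked programmatically. The sketch below simply hard-codes the numbers printed above and verifies that their spread is tiny:

```python
import numpy as np

# Objective values printed above (log likelihood, or ELBO for the
# variational models), one entry per model.
log_liks = {
    'GPR':             -18.834361360056654,
    'VGP':             -18.834361360058967,
    'SVGP':            -18.834399989069947,
    'SVGP (whitened)': -18.834399989069947,
    'SGPR':            -18.834399989061836,
    'GPRFITC':         -18.83435478071552,
}
values = np.array(list(log_liks.values()))
spread = values.max() - values.min()
```

The spread is below 1e-4 nats, i.e. the exact model, the variational approximations, and FITC all agree to roughly the optimizer's convergence tolerance.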