I saw some of the recent posts over the last few weeks about implementing Gaussian processes in Turing. I’ve been wanting to play around with GP-LVMs for a while (in particular, as part of a larger model), so I took a stab at implementing the basics of one using those recently posted examples.
The issue is that the model is extremely slow. I’ve worked with GPs enough on my own (including GPs with latent inputs) to have realistic expectations, but sampling even a very small model was much slower than I would have expected.
I don’t necessarily expect a fix, but I thought I would make this post in case anyone is curious about this use case, perhaps for future testing/benchmarking. The code below isn’t a fully proper GP-LVM (we’d probably want to use the ARD transform from KernelFunctions for that; see the sketch after the code), but it does the basic operations involved, afaik.
With only 50 data points (in three dimensions), it took about 100 minutes to run using NUTS. Is NUTS just a bad fit for this kind of model?
using Turing, KernelFunctions, ReverseDiff, LinearAlgebra, BenchmarkTools

Turing.setadbackend(:reversediff)

# Squared-exponential kernel with amplitude alpha; the sqrt(0.5)/rho input scaling
# gives k(x, x') = alpha^2 * exp(-(x - x')^2 / (4 * rho^2)).
sekernel(alpha, rho) = alpha^2 * (SEKernel() ∘ ScaleTransform(sqrt(0.5) / rho))
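# Quick sanity check of the convention above (the values here are just for
# illustration, not part of the model):
@assert sekernel(1.0, 2.0)(0.0, 1.0) ≈ exp(-(0.0 - 1.0)^2 / (4 * 2.0^2))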
@model function GPLVM(y1, y2, y3, jitter=1e-6)
    N = length(y1)
    P = 3  # number of output dimensions

    # Priors
    X ~ filldist(Normal(0, 1), N)          # 1-D latent input per observation
    alpha ~ LogNormal(0, 0.1)              # shared kernel amplitude
    rho ~ filldist(LogNormal(0, 0.1), P)   # per-output lengthscales
    sig2 ~ filldist(LogNormal(0, 1), P)    # per-output noise variances

    # GP covariance matrices, one per output, with noise + jitter on the diagonal
    K1 = kernelmatrix(sekernel(alpha, rho[1]), X) + I * (sig2[1] + jitter)
    K2 = kernelmatrix(sekernel(alpha, rho[2]), X) + I * (sig2[2] + jitter)
    K3 = kernelmatrix(sekernel(alpha, rho[3]), X) + I * (sig2[3] + jitter)

    # Sampling distributions
    y1 ~ MvNormal(zeros(N), K1)
    y2 ~ MvNormal(zeros(N), K2)
    y3 ~ MvNormal(zeros(N), K3)
end
y = randn(50, 3)  # 50 observations in 3 dimensions
@time gp = GPLVM(y[:, 1], y[:, 2], y[:, 3])
@time chain = sample(gp, NUTS(0.65), 2000)
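For reference, here is a rough, untested sketch of what I imagine the ARD version would look like: a Q-dimensional latent space, a single shared covariance across outputs, and ARDTransform giving each latent dimension its own lengthscale. The Q-dimensional setup and the shared kernel are just my guesses at a sensible formulation, not something from the examples I was following.

# Sketch of an ARD-style GP-LVM (untested); Q is the latent dimension.
ardkernel(alpha, rho) = alpha^2 * (SEKernel() ∘ ARDTransform(sqrt(0.5) ./ rho))

@model function GPLVM_ARD(y1, y2, y3, Q=2, jitter=1e-6)
    N = length(y1)
    P = 3  # number of output dimensions

    # Q-dimensional latent input for each observation (Q x N matrix)
    X ~ filldist(Normal(0, 1), Q, N)
    alpha ~ LogNormal(0, 0.1)
    rho ~ filldist(LogNormal(0, 0.1), Q)   # one lengthscale per latent dimension
    sig2 ~ filldist(LogNormal(0, 1), P)    # per-output noise variances

    # Shared covariance over the latent inputs (columns of X are the points)
    K = kernelmatrix(ardkernel(alpha, rho), ColVecs(X))

    y1 ~ MvNormal(zeros(N), K + I * (sig2[1] + jitter))
    y2 ~ MvNormal(zeros(N), K + I * (sig2[2] + jitter))
    y3 ~ MvNormal(zeros(N), K + I * (sig2[3] + jitter))
end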
While I would like to be able to use GP-LVMs as part of larger models in Turing, I’d certainly be interested if anyone has suggestions for fitting them in general (e.g., alternative packages). I can do them by hand, of course, but I’m looking for tools that make this easier and let me write less of my own code.