I am able to draw some random functions from the prior distribution, but when I try to condition on the data, it seems to make no difference: the curves in the right plot should all intersect the blue scatter points, but they don't. Any suggestions as to what might be wrong?
I defined ensure_posdef and used it when passing v to MvNormal (somehow you have removed it in your code).
In any case, there are two reasons why a matrix may fail to be positive definite:

1. An eigenvalue is negative (or numerically zero). This is fixed by adding a small positive perturbation to the diagonal; most GP code I have seen, e.g., in Python or Stan, uses something like Σ + 1e-8*I.

2. The matrix is not Hermitian, as in your error. For real matrices $A$ this simply means that the matrix is not symmetric, which can be fixed by using $\frac{1}{2}(A + A^T)$ instead, i.e., as in my version of ensure_posdef above. Note that the code by @tchebycheff also uses that trick in the function mvnorm.
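For reference, a minimal sketch of such a helper, combining both fixes (with the conventional 1e-8 jitter; the right size depends on the scale of your matrix), could be:

using LinearAlgebra
"Symmetrize `a` and add a small diagonal jitter to make it positive definite."
ensure_posdef(a) = (a + a') / 2 + 1e-8I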
While the first problem always needs to be fixed numerically, the deeper reason for the second one is that explicit matrix inversion is numerically rather unstable and should be avoided. When you continue reading in the book, Algorithm 2.1 fixes this by using the Cholesky decomposition instead. Here is how your function would read using that algorithm:
make_ftest(Xtest, Xtrain, ftrain) = let
    # requires LinearAlgebra (cholesky, I) and Distributions (MvNormal)
    # KXteXtr is not needed: it is the same as KXtrXte'.
    KXtrXtr = [kernel(xtr, xtr′) for xtr in eachcol(Xtrain), xtr′ in eachcol(Xtrain)]
    KXteXte = [kernel(xte, xte′) for xte in eachcol(Xtest), xte′ in eachcol(Xtest)]
    KXtrXte = [kernel(xtr, xte) for xtr in eachcol(Xtrain), xte in eachcol(Xtest)]
    # Algorithm 2.1: two triangular solves instead of an explicit inversion.
    L = cholesky(KXtrXtr).L
    alpha = L' \ (L \ ftrain)
    m = KXtrXte' * alpha
    v = L \ KXtrXte
    MvNormal(m, KXteXte - v' * v + 1e-8 * I)  # the jitter is still needed for issue 1
end
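Note that L' \ (L \ ftrain) computes the same vector as inv(KXtrXtr) * ftrain, but via two triangular solves, which is both cheaper and numerically much better behaved than forming the inverse explicitly.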
I tried implementing it with a more complex kernel and more data. When I ask the model to extrapolate slightly outside the range of the observed X values, it gives Y values that are far outside the observed range. I plot both the data alone and the GP fit below to show this. I'm guessing this is some kind of numerical issue, but I don't know how to diagnose it.
using CairoMakie, StatsBase, Statistics, SplitApplyCombine, LinearAlgebra, KernelFunctions, Distributions, Tau
"Ensure the matrix is positive definite."
ensure_posdef(a) = let
    a = (a + a') / 2            # symmetrize (issue 2)
    d = 3 * eps(opnorm(a)) * I  # small diagonal jitter (issue 1)
    res = a + d
    isposdef(res) || error("failed to make positive definite")
    res
end
(kernel(x::T, y::T)::T) where {T} = let
    # Constant + yearly periodic + Matérn-5/2 with a 30-day lengthscale.
    k = sum([
        ConstantKernel()
        with_lengthscale(PeriodicKernel(), 365.25)
        with_lengthscale(Matern52Kernel(), 30)
    ])
    k(x, y)
end
# Invent some training data points.
Xtrain = 736329.0:737790.0
ftrain = 20.0 .* (sintau.(Xtrain ./ 365.25) .+ randn(length(Xtrain))) .+ 176.0
Xtest = 735964.0:738155.0
make_ftest(Xtest, Xtrain, ftrain) = let
    KXteXtr = product(kernel, Xtest, Xtrain)
    KXtrXtr = product(kernel, Xtrain, Xtrain)
    KXteXte = product(kernel, Xtest, Xtest)
    KXtrXte = product(kernel, Xtrain, Xtest)
    # Algorithm 2.1
    L = cholesky(KXtrXtr).L
    α = transpose(L) \ (L \ ftrain)
    m = KXteXtr * α
    v = L \ KXtrXte
    V = KXteXte - v'v
    MvNormal(m, ensure_posdef(V))
end
ftest = make_ftest(Xtest, Xtrain, ftrain)
fig = Figure()
ax = Axis(fig[1,1]; xminorgridvisible=true,yminorgridvisible=true)
band!(ax,
    vec(Xtest),
    mean(ftest) .- 2 .* sqrt.(var(ftest)),
    mean(ftest) .+ 2 .* sqrt.(var(ftest));
    color=(:gray, 0.25),
)
foreach(1:3) do _
    r = rand(ftest)
    lines!(ax, vec(Xtest), r)
end
scatter!(ax, collect(Xtrain), ftrain; color=(:steelblue, 0.8), markersize=3)
display(fig)
That looks correct. Note that you fit without observation noise, i.e., the GP posterior has to pass through all of your (rather random-looking) data points exactly. If you allow for observation noise, i.e., change the line KXtrXtr = product(kernel, Xtrain, Xtrain) in make_ftest to KXtrXtr = product(kernel, Xtrain, Xtrain) .+ 1.0 * I(length(Xtrain)), the result looks much better.
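For concreteness, here is a sketch with the noise variance exposed as a keyword argument (the value 1.0 is just a starting point, not a fitted estimate):

make_ftest(Xtest, Xtrain, ftrain; σ² = 1.0) = let
    KXteXtr = product(kernel, Xtest, Xtrain)
    # Observation noise enters on the diagonal of the training covariance.
    KXtrXtr = product(kernel, Xtrain, Xtrain) .+ σ² * I(length(Xtrain))
    KXteXte = product(kernel, Xtest, Xtest)
    KXtrXte = product(kernel, Xtrain, Xtest)
    # Algorithm 2.1, as before
    L = cholesky(KXtrXtr).L
    α = transpose(L) \ (L \ ftrain)
    m = KXteXtr * α
    v = L \ KXtrXte
    MvNormal(m, ensure_posdef(KXteXte - v'v))
end

A nonzero σ² also makes KXtrXtr strictly positive definite, so the Cholesky factorization becomes more robust as a side effect.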