# How to understand GP extrapolation shape

```julia
using CairoMakie
using KernelFunctions
using AbstractGPs

n = 30
x_train = rand(n) .+ 1
y_train = log.(x_train)

f = GP(Matern52Kernel())

fx = f(x_train, 0.1)

p_fx = posterior(fx, y_train)

fig = Figure()
Axis(fig[1,1])

xplot = collect(minimum(x_train):0.001:maximum(x_train))
ms = marginals(p_fx(xplot))
means = mean.(ms)

lines!(xplot, means, linewidth=10)
scatter!(x_train, y_train;color=:red)

xplot2 = collect(1:.001:5)
ms2 = marginals(p_fx(xplot2))
means2 = mean.(ms2)

lines!(xplot2, means2)
fig
```

I am extrapolating this Matern kernel GP outside the range of the data. I want to understand why the extrapolation looks the way it does.

• Why does it go back down after x=2?
• Why does it asymptote at y=0?

By not passing an explicit mean to `GP` like you did in:

```julia
f = GP(Matern52Kernel())
```

you're declaring a zero-mean Gaussian process prior. Since your GP prior assumes zero mean, the posterior mean will revert towards 0 wherever there are not enough data points to pull it away.
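As a sketch of the fix (reusing the setup from the question, and assuming `AbstractGPs.ConstMean` for the prior mean), you can pass an explicit mean function to `GP` so that the extrapolation reverts to the data average rather than to zero:

```julia
using AbstractGPs, KernelFunctions, Statistics

n = 30
x_train = rand(n) .+ 1
y_train = log.(x_train)

# Constant prior mean set to the empirical mean of the targets.
f_const = GP(AbstractGPs.ConstMean(mean(y_train)), Matern52Kernel())
p_const = posterior(f_const(x_train, 0.1), y_train)

# Far from the data the kernel has decayed to ~0, so the posterior
# mean returns to the prior mean, mean(y_train), instead of 0.
mean(p_const([20.0]))
```

The same asymptotic shape appears either way; an explicit mean only changes *what* the GP reverts to, not *that* it reverts.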


Any mean-zero process will eventually asymptote at zero, but if you wanted the extrapolation to "last" a bit longer before smoothly heading to zero, you might consider using a covariance function corresponding to a long-memory process, which I think is most cleanly defined as a covariance function whose corresponding spectral density has an (integrable) singularity at the origin. Fractional ARIMA processes are an example, and I bet there is at least one time-series package that fits ARFIMA models.
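KernelFunctions.jl does not ship an ARFIMA/long-memory kernel, but the general point can be illustrated with a heavier-tailed stationary kernel: a rational quadratic with small `α` decays much more slowly than Matérn-5/2, so the posterior mean takes longer to revert to the zero prior mean. A rough sketch (assuming the data setup from the question):

```julia
using AbstractGPs, KernelFunctions

x_train = rand(30) .+ 1
y_train = log.(x_train)

# Zero-mean posterior under a given kernel, with noise variance 0.1.
post(k) = posterior(GP(k)(x_train, 0.1), y_train)

m_matern = only(mean(post(Matern52Kernel())([5.0])))
m_rq     = only(mean(post(RationalQuadraticKernel(; α=0.5))([5.0])))

# At x = 5, well outside the data, the heavy-tailed kernel's posterior
# mean is still noticeably above zero while the Matérn one has nearly
# reverted.
(m_matern, m_rq)
```

This is only an analogy for the long-memory idea, not an actual ARFIMA covariance, but it shows how the tail of the kernel controls how quickly the extrapolation collapses to the prior mean.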


Extrapolation in GPs is a subtle matter that needs special treatment. One usually needs to use covariance kernels that are able to extrapolate, as pointed out by @cgeoga. See for example this paper and this paper by Andrew Wilson, which discuss a combination of kernels that works well for extrapolation in many cases. Additional resources can be found on this webpage.
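One cheap way to get some extrapolation structure in KernelFunctions.jl is kernel composition (sums and products of kernels). The sketch below is not Wilson's spectral mixture kernel, just an illustration of the composition idea: adding a `LinearKernel` encodes a global trend, so the posterior mean follows that trend outside the data instead of dropping back to zero.

```julia
using AbstractGPs, KernelFunctions

x_train = rand(30) .+ 1
y_train = log.(x_train)

k_trend = LinearKernel()                           # global (linear) trend
k_local = with_lengthscale(Matern52Kernel(), 0.5)  # local deviations
k = k_trend + k_local                              # additive structure

p_struct = posterior(GP(k)(x_train, 0.1), y_train)
p_plain  = posterior(GP(Matern52Kernel())(x_train, 0.1), y_train)

# At x = 5 the structured kernel keeps extrapolating along the trend,
# while the plain Matérn posterior has mostly reverted to 0.
(only(mean(p_struct([5.0]))), only(mean(p_plain([5.0]))))
```

Hyperparameters (lengthscales, variances) are left at rough guesses here; in practice you would fit them by maximizing the log marginal likelihood.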
