Understanding Stheno example

This example fits a GP to noisy data and plots the posterior mean, a band of three standard deviations around it, and ten posterior samples. I am trying to understand all the objects here.

using Stheno
using Stheno.AbstractGPs
using CairoMakie


# Short length-scale and small variance.
const l1 = 0.4
const s1 = 0.2

# Long length-scale and larger variance.
const l2 = 5.0
const s2 = 1.0

# Specify a GaussianProcessProbabilisticProgramme object, which is itself a GP
# built from other GPs.
f = let
    gpc = Stheno.GPC()
    gp = kernel -> Stheno.wrap(GP(kernel), gpc)
    f1 = s1 * stretch(gp(Matern52Kernel()), 1 / l1)
    f2 = s2 * stretch(gp(SEKernel()), 1 / l2)
    f3 = f1 + f2
    Stheno.GPPP((; f1, f2, f3), gpc)
end

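# Build the inputs for f3, one of the processes in f, on an evenly spaced grid.
# fx is f3 at those inputs, observed under iid zero-mean noise with variance 0.05.
# Note: y is synthetic data (a sine wave plus uniform noise on [0, 1)), not a sample from f3.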
const n = 1_000
const xx = collect(range(-5.0, 5.0; length=n))
const x = GPPPInput(:f3, xx);
const σ²_n = 0.05;
const fx = f(x, σ²_n);
const y = rand(n) + sin.(xx .* 2);




f_posterior = @time posterior(fx, y);

x_plot = range(-7.0, 7.0; length=1000);
xp = GPPPInput(:f3, x_plot);
ms = marginals(f_posterior(xp));

mea = mean.(ms)       # posterior mean at each plotting location
st3 = 3 .* std.(ms)   # three posterior standard deviations



fig = Figure()
Axis(fig[1, 1])

scatter!(x.x, y; color=:red, markersize=2, strokewidth=0);

lines!(x_plot, mean.(ms), color = :blue, linewidth = 2)

band!(x_plot, mea .- st3, mea .+ st3, color = (:blue, 0.2))


for col in eachcol(rand(f_posterior(xp), 10))
    lines!(x_plot, col, color = (:blue, 0.3))
end

fig

What does this part do?

const x = GPPPInput(:f3, xx);
const σ²_n = 0.05;
const fx = f(x, σ²_n);

Especially fx = f(x, σ²_n): what do the arguments of f mean?

The object f is a composite GP object, which captures the statistical relationship between f1, f2, and f3 defined in the let block. Basically, it’s three GPs which each have a certain correlation structure, with the constraint that one of them is the sum of the other two.

f is an abstract, infinite-dimensional object, so to use it in practice you need to specify some locations x at which to calculate its mean and covariance. The last line constructs a finite GP (i.e., a multivariate normal distribution) defined at the locations x. Here x = GPPPInput(:f3, xx) selects the process f3 inside f and the inputs xx at which to evaluate it. The second argument, σ²_n, is the variance of the observation noise associated with fx: observations simulated from fx have additional i.i.d. zero-mean Gaussian noise with variance σ²_n added to them. If you use those observations to infer the true value of f, Stheno will take that measurement error into account (intuitively, it will tend to ignore random jumps in the data unless they’re significantly larger than the noise standard deviation, √σ²_n).
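To make the role of the noise argument concrete, here is a minimal sketch using a plain AbstractGPs GP rather than the GPPP from the question. The names f_demo, x_demo, noise_var and the toy inputs are made up for illustration, and I'm assuming AbstractGPs is loaded directly rather than through Stheno. It checks that the finite GP's covariance is the kernel matrix plus the noise variance on the diagonal.

```julia
using AbstractGPs, LinearAlgebra, Random, Statistics

# Hypothetical toy setup (not from the original example): one GP with an SE kernel.
f_demo = GP(SEKernel())
x_demo = [0.0, 0.5, 1.0]
noise_var = 0.1

# f_demo(x, σ²) is a finite GP: a multivariate normal over noisy observations at x.
fx_demo = f_demo(x_demo, noise_var)

# Its covariance is the kernel matrix at x plus noise_var on the diagonal.
K = cov(f_demo, x_demo)
@assert cov(fx_demo) ≈ K + noise_var * I

# Simulated observations include the noise; conditioning on them accounts for it.
y_demo = rand(MersenneTwister(1), fx_demo)
post_demo = posterior(fx_demo, y_demo)
post_mean = mean(post_demo(x_demo))
```

With a GPPP the call has the same shape. The only difference should be that the input is wrapped in GPPPInput so that f knows which of its processes you mean.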
