Truncated Gaussian Process with AbstractGPs?

Hi all,

I have started working with Gaussian processes for a time-series model. My variable of interest, “score”, is always between 0 and 4 and varies over time.
I am using AbstractGPs and KernelFunctions and have something like this:

sekernel(alpha, rho) = alpha^2 * KernelFunctions.transform(SEKernel(), 1/(rho*sqrt(2)))

@model function score_model(score, tp, dog_ids, jitter=1e-6)
    sigma ~ LogNormal(0.0, 1.0)
    # (priors/definitions for mean, alpha, and rho omitted here)
    for (index, dog_id) in enumerate(dog_ids)
        f = GP(mean, sekernel(alpha, rho))
        score[index] ~ f(tp[index], sigma^2 + jitter)
    end
end

Now, when I plot the credible interval around the mean prediction in my posterior prediction plots, I sometimes see the interval go below 0 or above 4. This is of course not desirable, because I know such values are impossible.
Normally I would use something like a truncated normal to deal with this, but I cannot find a way to do so here: even truncated(MvNormal) does not work in Julia, since truncation is only defined for univariate distributions.
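
For reference, a minimal snippet illustrating what I mean (the univariate case works, the multivariate case errors):

using Distributions

truncated(Normal(0.0, 1.0), 0.0, 4.0)  # fine: univariate truncation is supported

# truncated(MvNormal(zeros(2), [1.0 0.0; 0.0 1.0]), 0.0, 4.0)
# -> MethodError: truncated is only defined for univariate distributions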

So my 2 questions:

  • Is it possible to somehow specify in the model that score should be between 0 and 4, using either AbstractGPs or something else?
  • If that is not possible, does it make sense to simply use min and max to truncate the predicted values after sampling? Truncating the sampled values does work, but I can’t figure out whether it is a sensible approach and whether I ‘lose’ information that way.

Thanks in advance!

Hi @michielver,

GPs are by definition unbounded, so values below 0 / above 4 are to be expected!
There are several ways you could go:

  • A simple one would be a transformation on f: for example, you could use g(f) = 4 * logistic(f) (see the sketch after this list). The caveat is that this does not spread the probability mass evenly between 0 and 4.
  • Look at alternative solutions like the ones proposed in this presentation: https://www.gdr-mascotnum.fr/media/mascot12daveiga.pdf
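
Here is a minimal sketch of the first option (the kernel, inputs, and jitter are just placeholder choices): draw from a latent GP, then squash every value into (0, 4).

using AbstractGPs, StatsFuns

f = GP(SEKernel())                 # zero-mean latent GP; any kernel works
x = 0.0:1.0:10.0
fx = f(x, 1e-6)                    # finite projection with a small jitter
latent = rand(fx)                  # unbounded draw from the GP prior
bounded = 4 .* logistic.(latent)   # every value now lies in (0, 4)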

Truncating the values after sampling does not sound like a good idea: you would completely bias your results! (A quick illustration of this below.)
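
A toy demonstration of that bias, using plain normal draws rather than a GP (the numbers are made up, just to show the effect):

using Statistics

x = randn(10^6) .* 2 .+ 3.5      # draws with a lot of mass above the bound 4
mean(x), mean(clamp.(x, 0, 4))   # clamping piles mass on the bounds and shifts the mean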

Hi @theogf

Thanks for pointing me in the right direction. I was indeed considering something like a logistic function as well, but did not know exactly how to apply it. Is the way I am doing it here correct? I could not find any other way to transform the GP directly before sampling it; I even tried using Bijectors to see whether that would work.

using Turing
using KernelFunctions
using AbstractGPs
using StatsFuns

score = [3.0, 2.5, 1.9]
tp = [1, 15, 42]

sekernel(alpha, rho) = alpha^2 * KernelFunctions.transform(SEKernel(), 1/(rho*sqrt(2)))

@model function score_model(score, tp, jitter=1e-6)
    sigma ~ LogNormal(0.0, 1.0)

    f = GP(2.0, sekernel(2.0, 20.0))

    g(f) = 4*logistic.(f)

    score ~ f(tp, sigma^2 + jitter)
    score = g(score)
end

model = score_model(score, tp)
chain = sample(model, NUTS(0.65), 100)

Additionally, am I interpreting the presentation correctly that they are essentially talking about truncated multivariate normals, and specifically introducing numerical approximations for them? Or am I way off here?

Hi!
I think it’s almost correct. I am not the best with Turing, but I think the following should be more correct:

g(f) = 4*logistic.(f)
@model function score_model(score, tp, jitter=1e-6)
    sigma ~ LogNormal(0.0, 1.0)
    f ~ GP(2.0, sekernel(2.0, 20.0))(tp, sigma^2 + jitter)
    score = g(f)
end

For the presentation, I did not check the details, but I think they consider the very general case h(x) > f(x) > g(x), where f is a GP and h and g are linear functions. A rough sketch of what such a constraint means is below.
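
To get a feel for the constraint, here is a naive rejection-sampling sketch (toy mean, kernel, and inputs; the slides use far smarter approximations than this): keep only the GP draws whose whole path stays inside the bounds.

using AbstractGPs

fx = GP(2.0, SEKernel())(0.0:1.0:10.0, 1e-6)    # toy finite GP prior
draws = [rand(fx) for _ in 1:10_000]
kept = filter(d -> all(0 .<= d .<= 4), draws)   # keep only in-bound sample paths
length(kept) / length(draws)                    # acceptance rate under the constraint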

The score=g(f) line gives me trouble in your last version: it raises a DimensionMismatch, whereas score=g(score) does work. Is there something inherently wrong with doing score=g(score)?

So then it would look like the following. (Also, this made me realize that the GP definition could be moved outside of the model, just like the g function, because it only depends on constants anyway. I’m not sure whether that matters much for performance, though.)

g(f) = 4*logistic.(f)
gp = GP(2.0, sekernel(2.0, 20.0))
@model function score_model(score, tp, jitter=1e-6)
    sigma ~ LogNormal(0.0, 1.0)
    score ~ gp(tp, sigma^2 + jitter)
    score = g(score)
end
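
And then it is sampled the same way as before:

model = score_model(score, tp)
chain = sample(model, NUTS(0.65), 100)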