Truncated Gaussian Process with AbstractGPs?

Hi all,

I have started working with Gaussian processes for a time-series model. My variable of interest, “score”, is always between 0 and 4 and varies over time.
I am using AbstractGPs and KernelFunctions and have something like this:

sekernel(alpha, rho) = alpha^2 * KernelFunctions.transform(SEKernel(), 1/(rho*sqrt(2)))

@model function score_model(score, tp, dog_ids, jitter=1e-6)
    sigma ~ LogNormal(0.0, 1.0)
    # (priors/definitions for mean, alpha, and rho omitted here)
    for (index, dog_id) in enumerate(dog_ids)
        f = GP(mean, sekernel(alpha, rho))
        score[index] ~ f(tp[index], sigma^2 + jitter)
    end
end

Now, when I plot the credible interval around the mean prediction in my posterior prediction plots, I sometimes see the interval go below 0 or above 4. This is of course not desirable, because I know such values are impossible.
Normally I would use something like a truncated normal to deal with this, but I cannot find a way to do so here: even truncated(MvNormal) does not work in Julia, since truncation is only defined for univariate distributions.
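
For reference, a minimal snippet illustrating what I mean (the univariate case works, the multivariate case errors):

using Distributions

truncated(Normal(0.0, 1.0), 0.0, 4.0)  # fine: univariate truncation is supported

# truncated(MvNormal(zeros(2), [1.0 0.0; 0.0 1.0]), 0.0, 4.0)
# -> MethodError: truncated is only defined for univariate distributions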

So my 2 questions:

  • Is it possible to somehow specify in the model that score should be between 0 and 4, using either AbstractGPs or something else?
  • If that is not possible, does it make sense to simply use min and max to truncate the predicted values after sampling? Truncating the sampled values does work, but I can’t figure out whether it is a sensible approach and whether I ‘lose’ information that way.

Thanks in advance!

Hi @michielver,

GPs are by definition unbounded, so values below 0 / above 4 are to be expected!
There are several ways you could go:

  • A simple one would be a transformation on f: for example, you could use g(f) = 4 * logistic(f) (see the sketch after this list). The caveat is that this does not spread the probability mass evenly between 0 and 4.
  • Look at alternative solutions like the ones proposed in this presentation: https://www.gdr-mascotnum.fr/media/mascot12daveiga.pdf
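
Here is a minimal sketch of the first option (the kernel, inputs, and jitter are just placeholder choices): draw from a latent GP, then squash every value into (0, 4).

using AbstractGPs, StatsFuns

f = GP(SEKernel())                 # zero-mean latent GP; any kernel works
x = 0.0:1.0:10.0
fx = f(x, 1e-6)                    # finite projection with a small jitter
latent = rand(fx)                  # unbounded draw from the GP prior
bounded = 4 .* logistic.(latent)   # every value now lies in (0, 4)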

Truncating the values after sampling does not sound like a good idea: you would completely bias your results! (A quick illustration of this below.)
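
A toy demonstration of that bias, using plain normal draws rather than a GP (the numbers are made up, just to show the effect):

using Statistics

x = randn(10^6) .* 2 .+ 3.5      # draws with a lot of mass above the bound 4
mean(x), mean(clamp.(x, 0, 4))   # clamping piles mass on the bounds and shifts the mean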

Hi @theogf

Thanks for pointing me in the right direction. I was indeed considering something like a logistic function as well, but did not know exactly how to apply it. Is the way I am doing it here correct? I could not find any other way to transform the GP directly before sampling it; I even tried using Bijectors to see whether that would work.

using Turing
using KernelFunctions
using AbstractGPs
using StatsFuns

score = [3.0, 2.5, 1.9]
tp = [1, 15, 42]

sekernel(alpha, rho) = alpha^2 * KernelFunctions.transform(SEKernel(), 1/(rho*sqrt(2)))

@model function score_model(score, tp, jitter=1e-6)
    sigma ~ LogNormal(0.0, 1.0)

    f = GP(2.0, sekernel(2.0, 20.0))

    g(f) = 4*logistic.(f)

    score ~ f(tp, sigma^2 + jitter)
    score = g(score)
end

model = score_model(score, tp)
chain = sample(model, NUTS(0.65), 100)

Additionally, am I interpreting the presentation correctly that they are essentially talking about truncated multivariate normals, and specifically introducing numerical approximations for them? Or am I way off here?

Hi!
I think it’s almost correct. I am not the best with Turing, but I think the following should be more correct:

g(f) = 4*logistic.(f)
@model function score_model(score, tp, jitter=1e-6)
    sigma ~ LogNormal(0.0, 1.0)
    f ~ GP(2.0, sekernel(2.0, 20.0))(tp, sigma^2 + jitter)
    score = g(f)
end

For the presentation, I did not check the details, but I think they consider the very general case h(x) > f(x) > g(x), where f is a GP and h and g are linear functions. A rough sketch of what such a constraint means is below.
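
To get a feel for the constraint, here is a naive rejection-sampling sketch (toy mean, kernel, and inputs; the slides use far smarter approximations than this): keep only the GP draws whose whole path stays inside the bounds.

using AbstractGPs

fx = GP(2.0, SEKernel())(0.0:1.0:10.0, 1e-6)    # toy finite GP prior
draws = [rand(fx) for _ in 1:10_000]
kept = filter(d -> all(0 .<= d .<= 4), draws)   # keep only in-bound sample paths
length(kept) / length(draws)                    # acceptance rate under the constraint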

The score=g(f) line gives me trouble in your last version: it raises a DimensionMismatch, whereas score=g(score) does work. Is there something inherently wrong with doing score=g(score)?

So then it would look like the following. (Also, this made me realize that the GP definition could be moved outside of the model, just like the g function, because it only depends on constants anyway. I’m not sure whether that matters much for performance, though.)

g(f) = 4*logistic.(f)
gp = GP(2.0, sekernel(2.0, 20.0))
@model function score_model(score, tp, jitter=1e-6)
    sigma ~ LogNormal(0.0, 1.0)
    score ~ gp(tp, sigma^2 + jitter)
    score = g(score)
end
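
And then it is sampled the same way as before:

model = score_model(score, tp)
chain = sample(model, NUTS(0.65), 100)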