How to integrate out latent variables/ intermediate values in Turing model

Hi all, my generative model has the following form.
P(y |\theta) = \int P(y|x) P(x|\theta) dx
I want to get the posterior of θ given fixed data y, but unfortunately I don't have an analytical solution for the integral over x. As a result, I'm implementing the model in Turing with the following structure and using sampling to get a posterior on θ:

@model function my_func(y)
    θ ~ prior_dist()
    x ~ dist1(θ)
    y ~ dist2(x)
end

However, this makes the sampling space very high-dimensional, since all of the x values end up in the posterior as well. Is there a way to integrate out a latent variable in Turing to avoid this problem?

Is dist1 a high-dimensional multivariate distribution?

Unless you have an analytic form, there isn't going to be an easy way to do high-dimensional integration.

Indeed, dist1 is a high-dimensional multivariate distribution. In my case it's something along the lines of MvNormal(0, θ). However, there are some transformations in between in my actual model, so I couldn't integrate analytically.
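(For reference, it is the nonlinear transformation that blocks the closed form here. In the purely linear-Gaussian case, the marginal is analytic; a sketch, assuming a hypothetical linear map A and observation noise σ²:

x \sim \mathcal{N}(0, \Sigma_\theta), \quad y = Ax + \varepsilon, \quad \varepsilon \sim \mathcal{N}(0, \sigma^2 I)
\;\Longrightarrow\; y \mid \theta \sim \mathcal{N}(0, A \Sigma_\theta A^\top + \sigma^2 I)

so with a nonlinear transform in place of A, one is pushed toward numerical approaches like the one below.)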

Is it perhaps possible to marginalize out x by sampling x within each Turing sampling step, assuming that the following Monte Carlo approximation is correct?
P(y|\theta) = \int P(y|x) P(x|\theta) dx \approx \frac{1}{N} \sum_{i=1}^{N} P(y|x_i), \quad x_i \sim P(x|\theta)
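A minimal sketch of that estimator in plain Julia, on a toy one-dimensional model where the exact marginal is known (assumed densities, not the real model: x | θ ~ Normal(0, θ) and y | x ~ Normal(x, 1), so y | θ ~ Normal(0, sqrt(θ² + 1)) exactly). The average of P(y|x_i) is taken in log space with log-sum-exp to avoid underflow:

```julia
using Random

# log density of Normal(μ, σ) evaluated at z
lognormal(z, μ, σ) = -0.5 * ((z - μ) / σ)^2 - log(σ) - 0.5 * log(2π)

# Monte Carlo estimate of log P(y | θ) = log ∫ P(y|x) P(x|θ) dx:
# draw x_i ~ P(x | θ), then average P(y | x_i) in log space.
function log_marginal_mc(y, θ; N = 200_000)
    logw = Vector{Float64}(undef, N)
    for i in 1:N
        x = θ * randn()                 # x_i ~ Normal(0, θ)
        logw[i] = lognormal(y, x, 1.0)  # log P(y | x_i)
    end
    m = maximum(logw)                   # log-sum-exp for numerical stability
    return m + log(sum(exp.(logw .- m))) - log(N)
end

Random.seed!(1)
est   = log_marginal_mc(1.0, 2.0)
exact = lognormal(1.0, 0.0, sqrt(2.0^2 + 1))  # closed form for the toy model
```

With enough samples the estimate tracks the closed form closely; inside MCMC this plug-in estimate of the likelihood is noisy, which is worth keeping in mind.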

I think you’re trying to do something like:

theta ~ MyPrior()
x = rand(MvNormal(0, theta))
xprime = transform(x)
inclp = sum(logpdf(Ydist(xp), y) for xp in xprime)
@addlogprob!(inclp)

Which you can definitely try… see what you think.


Thanks so much for this solution! I was only introduced to @addlogprob! the other day and didn't realize you could use it like this.