Bayesian inference in the presence of intractable integral

Hi guys, recently i’ve learned about the powerful packages for bayesian inference in julia
i would like some tips on how to attack this problem, given this probabiliy distribution over the angles \theta:

P(\vec \theta_1,\vec \theta_2|J,\vec h_1,\vec h_2) = \frac{W(\vec \theta_1,\vec \theta_2|J,\vec h_1,\vec h_2) }{Z(J,\vec h_1,\vec h_2)}

where W and Z are defined as

W(\vec \theta_1,\vec \theta_2|J,\vec h_1,\vec h_2) = \exp\big(\sum_{i,j=1}^{2} J_{i,j} \Sigma(\vec\theta_i) \cdot \Sigma(\vec\theta_j)+\sum_{i=1}^2 h_i \cdot \Sigma(\vec\theta_i)\big)
Z(J,\vec h_1,\vec h_2)=\int_{[0,2\pi]^{N_{\theta_1}+N_{\theta_2}}} d\theta_1 \, d\theta_2\,W(\vec \theta_1,\vec \theta_2|J,\vec h_1,\vec h_2)

finally the \Sigma_i are defined as

\Sigma(\theta)=\frac{1}{N_\theta}\sum_{j=1}^{N_\theta} \big(\cos(\vec \theta \cdot \hat e_j),\sin(\vec \theta \cdot \hat e_j)\big)

I have some observations of these sets of angles \vec \theta_1,\vec \theta_2 and i want to infer the posterior distribution of the matrix J and the vectors h_{1/2}. but i have no clue on how to leverage the marvelous tools in the julia ecosystem like Turing.jl etc… since the integral in Z will unavoidably appear in the logpdf.

if this question is considered off-topic i will delete it. Thank you in advance

What is N_\theta?

The length of the \theta vector, Sorry for the bad notation

IIUC, you can define your own distribution for theta = [theta1; theta2] parameterized by J, h1 and h2. Then you can define a Turing model as follows:

@model mymodel(theta) = begin
    J ~ J's prior
    h1 ~ h1's prior
    h2 ~ h2's prior
    theta ~ MyDist(J, h1, h2)
end

Here is how to define your own distribution and make it Turing-compatible https://turing.ml/dev/docs/using-turing/advanced#1-define-the-distribution-type.

The main part in your case is defining the logpdf function because the pdf functions seems involved.

That Is very nice, but…
Logpdf involves Z, which Is the intractable integral…

How large are N_{\theta_1} and N_{\theta_2}?

1 Like

Each one roughly \sim 500

I work a lot with models that have intractable normalizing constants in the likelihood (specifically, ERGMs). For that kind of problem, I don’t know that you have a choice except to write your own algorithm using e.g., ABC, auxiliary variable MCMC, things like the DMH sampler algorithm, etc. I don’t think there’s much anywhere (Julia or otherwise) that does “off-the-shelf” for such models. Well, Knet may have NCE (noise-contrastive estimation; it’s not clear to me if they actually have it, or if they had it a long time ago, and the docs are outdated). NCE works a lot like MCMC-MLE, and is good for problems with intractable normalizing constants.

3 Likes

How big are the eigenvalues of the J matrix? Maybe you can do a steepest descent approximation

Could you please give a reference paper or book for this method?

Wikipedia has some references: https://en.m.wikipedia.org/wiki/Method_of_steepest_descent

2 Likes

That Is the core of the inference problem, they can be 0 or very big, thanks anyway

I’m still a beginner at Bayesian inference, but have had good results using Approximate Bayesian Computation (ABC) for parameter inference for some mixture models with Markovian time dependencies that make the likelihood intractable.

I’ve been using the parallel implementation of ABC-SMC from https://github.com/marcjwilliams1/ApproxBayes.jl

I’ve been meaning to try out https://github.com/tanhevg/GpABC.jl but so far ApproxBayes.jl has been good enough.

2 Likes

@opera_malenky
thank you for the pointers, the DMH algorithm is very interesting, i will try to implement it, although it is very sad that such an algorithm is not available in the main general purpose packages that deal with bayesian computation.

@robsmith11
thank you for the suggestion on ApproxBayes.jl, that could be a winning approach if DHM turns out too slow.