Hello everyone.
I am writing this post with the hope that It helps people with background in physics, and want to learn about probabilistic programming using Turing.jl. My background is geophysics, and I only have a couple of weeks trying to learn how to use Turing.jl. I’ll share my advance so far, with the hope that the people with more experience can leave comments (performance tips, conceptual mistakes, etc), and the geophysics community can use as a guide to start probabilistic programming with Julia, and learn from my experience so far. Unfortunately, there is little information online about using Turing.jl fro geophysical applications so I’ve been more or less guessing the last couple of weeks.
I am solving a 1D-MT problem: I have a profile of resistivities in the subsurface (a vector), and a forward operator MT_forward that predicts field observations from this model, in the classical notation:
data= MT_forward(model).
The probabilistic model I built is the following:
@model function mt_inference(data,n_layers,λ)
#Prior distribution for the possible values of resistivities
rho ~ filldist(Uniform(0,2000),n_layers)
#Calculate the expected value (using the MT forward model)
exp_data = MT_forward(rho,n_layers)
#Likelihood: as a function of the L2 norm of the model, and the L2 norm of the misfit
l2_norm_misfit = sqrt(sum((data - exp_data).^2))
l2_norm_model = sqrt(sum(rho).^2)
J= (l2_norm_misfit) .+ (λ*l2_norm_model)
for i in eachindex(rhoref)
data[i] ~ Normal(J,10)
end
end;
Then this model is sampled using:
chain = sample(model, Turing.NUTS(), 100);
As as mentioned before, I just have a couple of weeks using Turing.jl, and a couple of months researching Bayesian inference, so I might have conceptual errors, but my interpretation of the above model is:
I have an uninformative prior on the possible resistivity values that my layers can get (an uniform distribution from 0 to 2000 ohm.m). This uniform distribution is repeated for the n-layers of my model.
As I understand the forward model can take any form of valid julia code (some comments about this later).
The likelihood will be maximum, for those models that minimize the misfit with the observed data, and that have minimum norm (with a regularization term λ).
What I’ve found so far:
-
NUTS is the best sampler (so far): I tried with MH, SMC, HMC, HMCDA, PG, and the only sampler that gets close to the expected results is NUTS. I actually think that there is some problem with my model, because I have no other explanation for the poor results that I get with all the other samplers (There are several cases in the literature of successful MT probabilistic inversions using MH).
-
The relationship between the number of samples and the computer time is not linear: I’ve been drawing 100 samples from my chains, which I get in around 80s. However increasing the amount of samples to 300, makes the running time extremely large.
-
Changing the AD backend: I changed the AD backend to :reversediff and it started giving problems with my forward operator. The code would crash in basic functions such as sqrt (which I had to change to .^0.5) and the hyperbolic tangent (which I had to redefine using exponentials). After making these adjustments in my forward operator, I could not spot any major difference when compared with :forwarddiff.
-
Tricks with the multivariate distributions in priors and likelihood: I’ve found mixed comments about how to define the multivariate distributions (either with a loop, filldist, lazyarrays, etc) and I still can’t decide what is the best. In terms of results apparently I am getting similar results, however using:
data ~ arraydist(LazyArray(@~Normal.(J,10)))
fails, which I wasn’t expecting because I’ve seen it being used in other models, although with Bernoulli distributions.
So far the code is running, and I am getting the expected results. However, I find the code quite slow (the fastest result is 80 seconds, inferring only 40 values). I know that It is probably my fault, so I’m open to run test suggestions, run diagnostics, and test any change that any of you guys may suggest.
Thanks!