NUTS speed is very slow for high dimension parameter inference in Turing.jl

A few things.

AD Backend. First, I think the default AD backend for Turing is still ForwardDiff, which should be very inefficient for a high-dimensional model like yours, since the cost of a forward-mode gradient grows with the number of parameters. I just tried your model on my laptop with ReverseDiff as the AD backend, and the progress meter shows 2:30:16.
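As a rough sketch of switching backends: the model below is a hypothetical stand-in (not the model from the question), and the `adtype` keyword assumes a recent Turing release. Older releases selected the backend globally with `Turing.setadbackend(:reversediff)` instead.

```julia
using Turing
using ReverseDiff  # the package must be loaded for the ReverseDiff backend to work

# Hypothetical toy model, only here so the sampler call has something to run on.
@model function toy(y)
    σ ~ truncated(Normal(0, 10), 0, Inf)
    θ ~ filldist(Normal(0, 1), length(y))
    for i in eachindex(y)
        y[i] ~ Normal(σ * θ[i], 1)
    end
end

y = randn(50)

# Assumption: recent Turing versions take the AD backend via `adtype`.
sampler = NUTS(0.65; adtype=AutoReverseDiff())
chain = sample(toy(y), sampler, 200; progress=false)
```

Whether reverse mode actually wins depends on the model, so it is worth timing a short run with each backend before committing to a long one.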

Hierarchical Prior. Second, InverseGamma(1,1) is too weakly informative, and its heavy tail is hard to navigate. This will result in NUTS choosing long integration trajectories, and the cost of each iteration is roughly proportional to the length of the trajectory. People use the inverse gamma mainly because it is a conjugate prior to the normal, which is irrelevant to MCMC, so we’re free to use alternative priors. A better choice is a more informative prior with lighter tails, such as truncated(Normal(0, 10), 0, Inf). This quickly reduced the projected sampling time to 1:20:45. See the prior choice wiki for an up-to-date recommendation list by the Stan people.
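To get a feel for the difference in tails, here is a minimal sketch that compares how much probability mass each prior puts far from zero. The threshold of 100 is an arbitrary choice for illustration, and the comparison is on each distribution's own scale, since which parameter the prior sits on depends on the model.

```julia
using Distributions

ig = InverseGamma(1, 1)
hn = truncated(Normal(0, 10), 0, Inf)  # half-normal with scale 10

x = 100.0  # arbitrary "far out" threshold, chosen only for illustration

# ccdf(d, x) is P(X > x), i.e. the mass in the right tail beyond x.
ig_tail = ccdf(ig, x)
hn_tail = ccdf(hn, x)

println("P(X > $x) under InverseGamma(1, 1):      ", ig_tail)
println("P(X > $x) under truncated Normal(0, 10): ", hn_tail)
```

The inverse gamma keeps a non-negligible share of its mass out there, while the truncated normal has essentially none, which is the region the sampler otherwise has to spend long trajectories exploring.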

Forcing Short Trajectories. The last measure would be to reduce the maximum tree depth, as in NUTS(0.65, max_depth=8). Combined with the informative prior above, the progress meter shows 0:40:18. The default maximum depth is 10, which allows up to 2^10 leapfrog steps per iteration, while 8 allows up to 2^8. This measure, however, will negatively affect the statistical efficiency of the sampler. It’s recommended to instead fix the model so that the sampler does not hit the maximum limit. (But problems that are fundamentally hard to infer do exist, like sparse regression, stochastic volatility, etc. These are an open challenge to modern inference algorithms, so as end users there is not much we can do about them.)
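The arithmetic behind the depth cap, as a small sketch (the helper name `max_leapfrog_steps` is made up for this example and is not part of Turing):

```julia
# Upper bound on leapfrog steps per NUTS iteration for a given max tree depth,
# using the 2^depth figure quoted above.
max_leapfrog_steps(depth) = 2^depth

default_steps = max_leapfrog_steps(10)  # 1024
reduced_steps = max_leapfrog_steps(8)   # 256

println("depth 10: up to ", default_steps, " leapfrog steps per iteration")
println("depth 8:  up to ", reduced_steps, " leapfrog steps per iteration")
println("worst-case reduction factor: ", default_steps ÷ reduced_steps)
```

This is only a worst-case bound: the cap saves time only on iterations that would otherwise have built a deeper tree, and those are exactly the iterations where cutting the trajectory short hurts mixing.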

One of the difficulties with the current Bayesian workflow is that model design is not entirely independent from inference. It actually strongly affects the sampler’s performance, both statistically and computationally. So you should tweak the model so that NUTS can do its job quickly and efficiently. Mike Betancourt’s blog has a lot of good guidelines on these aspects. See for example: Identity Crisis
