Hello all,
I am a student researching possible extensions of HMC-type samplers for settings in which posteriors exhibit high correlation between parameters.
I have a working implementation for a toy problem in Julia using the AdvancedHMC.jl package. The results (posterior traceplots, convergence diagnostics, etc.) are textbook; I could not ask for better.
My issue: I have been attempting to replicate the same behavior in R. I am implementing both forward and adjoint sensitivity approaches to obtain gradients w.r.t. the parameters of the system, along with NUTS, dual averaging, and mass matrix adaptation (beyond diagonal). Nothing I do gives results as efficient as Julia's AdvancedHMC.jl or Stan in R. Julia having access to AD obviously means the wall-clock time is better.
This is not 'surprising' per se, but it really hampers exploring research ideas: with a suboptimal NUTS sampler, the convergence diagnostics from testing an idea obscure whether the idea itself has any merit.
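For context, in the Julia version the hamiltonian, metric, and initial_θ referenced in the snippet below are set up roughly like this. This is a minimal sketch with a placeholder target: ℓπ and ∂ℓπ∂θ stand in for my actual log-posterior and its sensitivity-based gradient, and I am assuming the three-argument Hamiltonian(metric, ℓπ, ∂ℓπ∂θ) method, where the gradient function returns a (value, gradient) tuple, is the right way to supply a hand-coded gradient.

using AdvancedHMC

D = 2                                   # placeholder dimension for the toy problem
initial_θ = zeros(D)

ℓπ(θ) = -0.5 * sum(abs2, θ)             # placeholder log-density (standard normal)
∂ℓπ∂θ(θ) = (ℓπ(θ), -θ)                  # hand-coded gradient, e.g. from forward/adjoint sensitivities;
                                        # assumed to return the (log-density, gradient) tuple AdvancedHMC expects

metric = DiagEuclideanMetric(D)
hamiltonian = Hamiltonian(metric, ℓπ, ∂ℓπ∂θ)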
My Question: Is there any way to implement your own HMCKernel instead of
ϵ = find_good_stepsize(hamiltonian, collect(Float64, initial_θ))  # heuristic search for an initial step size
integrator = JitteredLeapfrog(ϵ, 0.1)  # leapfrog with 10% step-size jitter
kernel = HMCKernel(Trajectory{MultinomialTS}(integrator, GeneralisedNoUTurn()))  # multinomial NUTS with generalised no-U-turn criterion
adaptor = StanHMCAdaptor(MassMatrixAdaptor(metric), StepSizeAdaptor(0.8, integrator))  # Stan-style windowed adaptation, target acceptance 0.8
and still be able to take advantage of, say, the StanHMCAdaptor and the like?
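For concreteness, the simplest version of what I have in mind is swapping the NUTS trajectory for something else while keeping the adaptation machinery. This is only a sketch; I am assuming a kernel built from a different Trajectory/termination-criterion combination can still be passed to sample together with the adaptor.

custom_integrator = Leapfrog(ϵ)  # plain leapfrog instead of the jittered one
custom_kernel = HMCKernel(Trajectory{EndPointTS}(custom_integrator, FixedNSteps(20)))  # fixed-length HMC instead of NUTS
adaptor = StanHMCAdaptor(MassMatrixAdaptor(metric), StepSizeAdaptor(0.8, custom_integrator))

n_samples, n_adapts = 2_000, 1_000
samples, stats = sample(hamiltonian, custom_kernel, initial_θ, n_samples, adaptor, n_adapts)

The sketch above only recombines existing pieces; ideally I would like to know whether a fully custom kernel type could be dropped in the same way.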