ANN: DynamicHMC 2.0

Tamas_Papp · September 3, 2019, 7:09am

DynamicHMC 2.0 was just released. I wrote a blog post about the changes and the new features.

There are breaking API changes, so please read the documentation.

Feel free to ask support questions here.

falbarelli · September 4, 2019, 7:04pm

@Tamas_Papp thanks a lot for the updated version of your package!
I am a newbie in this field of MCMC for Bayesian estimation… I hope you don’t mind if I use this thread for a bit of help in understanding why my code is not working.

I have quite a complicated loglikelihood function, written in Julia, that can be correctly evaluated.
However, after following the steps in your worked example (transforming the parameters and setting up the gradient) I get an error when I run mcmc_with_warmup():

ArgumentError: nothing should not be printed; use show, repr, or custom output instead.

(This is the full stacktrace)

I wouldn’t be surprised if the problem was in the automatic differentiation, since the loglikelihood is quite an involved function, but unfortunately I’m not able to understand this error message…

jkbest2 · September 4, 2019, 7:17pm

Is there a print statement somewhere in your log likelihood function?
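
For context, here is a minimal sketch of how this message can arise in plain Julia 1.x, independent of DynamicHMC. The variable name `value` is hypothetical, and this is only one way to trigger the error; it does not diagnose the stacktrace above.

```julia
# `print` and `string` refuse to render `nothing`, so building a message
# from a value that happens to be `nothing` throws an ArgumentError.
value = nothing

err = try
    string("value = ", value)
catch e
    e
end

err isa ArgumentError   # the error reported above

# `repr` (or `show`) is the supported way to render such a value.
message = "value = " * repr(value)
```

Any code path that interpolates a possibly-`nothing` value into a string, including logging or progress reporting inside a library, can hit this.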

falbarelli · September 4, 2019, 7:32pm

@jkbest2 there are no `print` statements in the loglikelihood. From what I understand from the stacktrace, it seems to be something related to internal functions of the library.

Yifan_Liu · September 5, 2019, 1:02am

My knowledge of Bayesian statistics is really rusty and I plan to pick it up again in the near future. My old textbooks all use BUGS. I recently bought the Statistical Rethinking book, but it uses R. The Julia code for this book on GitHub still seems to be in progress; many chapters are missing.

Do you have any recommended readings for using your DynamicHMC package?

Tamas_Papp · September 5, 2019, 7:06am

I am happy to help you with this, but I need an MWE.

Tamas_Papp · September 5, 2019, 7:08am

Yes, there is a bibliography in the README. I would recommend starting with Bayesian Data Analysis. But Statistical Rethinking should also be fine — coding everything is part of the learning process.

Note that learning even the basics of Bayesian analysis is not a trivial undertaking, and can easily be equivalent to a full semester course.

falbarelli · September 5, 2019, 7:45am

Thanks, today I will try to strip down the loglikelihood function to the minimum that still produces the problem!

Tamas_Papp · September 5, 2019, 8:00am

Thanks. It would be best if you opened an issue on GitHub, with version information etc.

BTW, are you sure you are using 2.0.0 or master? There was an issue like this which was fixed by this line. Your line numbers in the stacktrace don’t match up with the current version, but that could be an unrelated thing.

falbarelli · September 5, 2019, 12:29pm

Hey, thanks a lot. I was not using the latest version, I think.
I have now updated to Julia 1.2 and got the packages from your GitHub repository, and now everything works.
Incidentally, for some reason I still get the same problem if I run the code in a Jupyter notebook instead of in a script…

Yifan_Liu · September 5, 2019, 2:08pm

Thanks. I will read these materials.

oschulz · September 8, 2019, 12:06pm

To my knowledge, we currently have two high-quality HMC implementations: DynamicHMC.jl and AdvancedHMC.jl. I haven’t used either intensely (but I definitely plan to), so I’m curious as to what the main differences between the two packages are.

Tamas_Papp · September 8, 2019, 3:18pm

This is a good question. AFAIK DynamicHMC.jl predates AdvancedHMC.jl, so it would be best to ask the authors of the latter about what, if anything, they were missing that made them start their own package.

oschulz · September 8, 2019, 3:21pm

@Kai_Xu, could you provide some information in this regard? I heard good things about both DynamicHMC.jl and AdvancedHMC.jl, and I guess they probably have their individual strengths. From a user perspective, it would be great to have some pointers on which might fit what kind of use cases better.

Tamas_Papp · September 8, 2019, 3:46pm

It occurred to me that @goedman (hope you don’t mind me pinging) coded up everything in

https://github.com/StatisticalRethinkingJulia

for all available options, so perhaps he is the best person to ask about a comparison.

Kai_Xu · September 8, 2019, 8:02pm

Thanks for the ping @oschulz. Here are some of my thoughts.

What’s AdvancedHMC.jl (AHMC) for

AHMC has two purposes.

1. Serve as the HMC backend of the Turing probabilistic programming language.
   - It was extracted from the HMC code in Turing.jl, written during my MPhil in 2016, and became a standalone package.
2. Serve as a research platform for developing novel variants of the HMC algorithm.
   - We made some efforts so that we can quickly develop new HMC algorithms using AHMC. You might see some novel samplers available in AHMC shortly.

Note that because of these two purposes, the direct use of AHMC requires some basic understanding of HMC algorithms, and might not be suitable for users of all levels. But you can achieve more stuff with it because of its flexibility and GPU support. E.g. you can sample from an energy-based model defined by a neural network using Flux.jl running on GPUs via CuArrays.jl.

See our poster at StanCon 2019 for more details on AHMC; in it we also statistically compare our no-U-turn sampler (NUTS) implementation to Stan’s using MCMCBenchmarks.jl.

The differences between AHMC and DynamicHMC.jl (DHMC)

The main difference is that AHMC aims to support a wide range of HMC variants, including both static and dynamic HMC algorithms, while DHMC, as the name indicates, implements a specific variant of the dynamic HMC algorithm: NUTS with multinomial sampling and the generalised no-U-turn criterion. AHMC also supports this variant. Again, our poster illustrates the variants that AHMC currently supports.

The Turing language and the differences in modelling pipeline

For end-users who focus on modelling, I highly recommend using Turing directly. The default HMC samplers are based on AHMC, but you can choose to use the NUTS implementation from DHMC as well. Plus there are many other inference methods available in Turing.jl, e.g. importance sampling, SMC, particle Gibbs, variational inference, etc. You will need some of those if you are working with discrete data.

PS: A compositional interface that combines different MCMC samplers in a Gibbs algorithm is also available (see our AISTATS paper).

Turing has its own modelling language (quite similar to Stan’s), which automatically uses Bijectors.jl (another library extracted from Turing.jl) to deal with constrained variables. The use of Turing.jl and Bijectors.jl (roughly) corresponds to the use of LogDensityProblems.jl and TransformVariables.jl.

For analysis, Turing provides MCMCChains.jl, which implements a unified and easy way to check a lot of MCMC statistics and to plot useful figures. This is also the default MCMC summary backend of CmdStan.jl.

Tamas_Papp · September 9, 2019, 4:56am

The only difference I see is that AdvancedHMC supports the slice sampler and the turning criterion of the original NUTS. Since both have been superseded (by the multinomial sampler and the generalized turning criterion), I don’t think this is relevant for most users: these are less efficient variants, not something that you would ever want to go back to.
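
To make the two turning criteria concrete, here is a rough sketch in plain Julia. It assumes an identity mass matrix (so velocity equals momentum), the function names are made up for illustration, and it is not the code of either package.

```julia
using LinearAlgebra: dot

# Original NUTS rule (Hoffman and Gelman): the trajectory is turning once the
# span between its end positions points against either end momentum.
function is_turning_original(q_minus, q_plus, p_minus, p_plus)
    span = q_plus - q_minus
    return dot(span, p_minus) < 0 || dot(span, p_plus) < 0
end

# Generalized rule (Betancourt): replace the span with ρ, the sum of the
# momenta along the trajectory. `momenta` is ordered from the backward end
# to the forward end.
function is_turning_generalized(momenta)
    ρ = sum(momenta)
    return dot(ρ, first(momenta)) <= 0 || dot(ρ, last(momenta)) <= 0
end

# A trajectory whose momentum has swung around: ρ = [-1.0, 0.0].
momenta = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [-1.0, -1.0]]
is_turning_generalized(momenta)   # true, ρ opposes the first momentum
```

With a non-identity mass matrix the momenta in the dot products would be replaced by the corresponding velocities.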

oschulz · September 9, 2019, 6:50am

Thanks @Tamas_Papp and @Kai_Xu!

oschulz · September 9, 2019, 6:54am

Regarding Turing, we’re planning to integrate DynamicHMC and/or AdvancedHMC into BAT.jl, which is focused on use cases with (often complex) user-written likelihoods. So I’ll need to use the lower-level interfaces, and I’m very happy that we now have two high-quality (as far as I can judge) standalone HMC packages that can be used as backends (without pulling in large frameworks that are geared towards specific use cases).

Tamas_Papp · September 9, 2019, 8:01am

I started DynamicHMC.jl specifically for supporting user-coded likelihoods, because the packages that were around at that time preferred to expose a higher-level framework with directed acyclic graphs, which are not really suitable for the models I am working with. I find that being able to debug, unit test, benchmark, and optimize my log posterior as a plain vanilla Julia function is really valuable.

If you are using (modern) NUTS, I would say that the sampler should be close to identical between AdvancedHMC.jl and DynamicHMC.jl, so perhaps you should just review the interface and use whatever you find convenient.
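
As an illustration of that workflow, here is a small sketch of a log posterior written as an ordinary function and checked with the standard `Test` library. The model (normal observations, flat priors, parameters `θ = (μ, log σ)`) and the name `log_posterior` are assumptions made up for this example, not something from the package.

```julia
using Test

# Hypothetical model: y[i] ~ Normal(μ, σ) with flat priors on μ and σ,
# parameterized on the unconstrained scale as θ = (μ, log σ).
function log_posterior(θ, y)
    μ, logσ = θ
    σ = exp(logσ)
    loglik = -length(y) * (logσ + 0.5 * log(2π)) - sum(abs2, y .- μ) / (2 * σ^2)
    return loglik + logσ   # logσ is the log Jacobian of σ = exp(logσ)
end

# Because it is just a function, it can be unit tested directly.
@test log_posterior((0.0, 0.0), [0.0]) ≈ -0.5 * log(2π)
@test log_posterior((1.0, 0.0), [1.0, 1.0]) > log_posterior((3.0, 0.0), [1.0, 1.0])
```

The same function can then be benchmarked or profiled on its own before it is handed to any sampler.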

For DynamicHMC.jl, the API for defining a log density is documented here in detail:

https://tamaspapp.eu/LogDensityProblems.jl/dev/

You can use this interface to perform AD for you with any of the supported AD packages (and it is easy to change later).

If you want to discuss something or just need a new feature, just open an issue or ping me on this forum.
