ANN: DynamicHMC 2.0

This is kind of implicit, but LogDensityProblems works with unconstrained densities on \mathbb{R}^n, and you transform these into the constrained space. See eg
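As a sketch of what that means (a toy example with made-up names, not the actual LogDensityProblems API): a log density for a positive parameter can be evaluated on all of \mathbb{R} by mapping x ↦ exp(x) and adding the log Jacobian of the transformation:

```julia
# Hypothetical illustration, not the LogDensityProblems API.
# Target: log density of σ ~ Exponential(1), defined only for σ > 0.
logdensity_constrained(σ) = -σ

# Unconstrained version on all of ℝ via σ = exp(x);
# the log Jacobian correction is log|dσ/dx| = x.
function logdensity_unconstrained(x)
    σ = exp(x)
    logdensity_constrained(σ) + x
end

logdensity_unconstrained(0.0)  # == logdensity_constrained(1.0) + 0 == -1.0
```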

Sure, just open an issue please.

I agree, I don’t think that the distribution types in Distributions are the right abstraction for MCMC. Technically, a log density of course defines a distribution, but I kind of think of those as having a closed form for moments, pdf and quantiles (when applicable), etc.

But I am not a big fan of building elaborate type hierarchies anyway unless I need them for something (usually dispatch).


Yes, I am aware of this technique, but the last time I tried it, it did not solve my problem (reparametrizing did). But this of course does not mean it cannot be useful.

Stepsize adaptation is still one of the trickiest issues of NUTS in practice. You may be interested in

(sorry if you have seen it already)

Done. :slight_smile:


But I am not a big fan of building elaborate type hierarchies anyway

I agree that the trait-based approach in LogDensityProblems is neat, since it allows other packages to adopt it without affecting their type hierarchy.

However, if we had a somewhat official/central package and API for this, I think it would be good to have an abstract type for densities (possibly even a supertype of Distribution). I do believe there would be many cases where it would be very helpful to be able to dispatch on it.


Ah, I get it - while ForwardDiff is an optional dependency, it’s kind of hard to make BenchmarkTools a second-order optional dependency? I wonder, though: can @require be nested? I never tried that.

I do hope we’ll get some language support for optional dependencies (possibly via optional submodules) at some point. We had a BoF session about that in Baltimore, with a lot of core devs in the room, so something might happen in that direction at some point.

I have two objections to this. The first is generic, that there is a cost to these things: people would assume that implementing the interface implies that something is a subtype, write functions like

function f(x::AbstractDensityType)
    ...
end

and then if the type of x wants to be something else too at the same time (also just to simplify dispatch), this becomes impossible.

The second one is specific to log densities: they are mostly useful when we can query the space they are defined on (scalars, vectors, matrices, but can be arbitrary manifolds), but currently there is no good generic API for this.

In principle, that’s doable, I will consider it:

I do believe there would be many cases where it would be very helpful to be able to dispatch on it.

there is a cost to these things: people would assume that implementing the interface implies that something is a subtype, write functions like […]

Yes, but that’s the point of being able to dispatch on it, right? Same as for Distribution and many other types. While dispatching on traits is possible, it’s a lot more involved. You could make the same argument for pretty much everything else that has an abstract supertype - IMHO the type system approach is one of the things that has made things mesh so well in Julia, so why avoid having a type for densities?

and then if x wants to be something else too at the same time (also just to simplify dispatch), this becomes impossible.

I guess the question is, would a density be a density primarily, and something else secondarily, or the other way round? My assumption was that being a density would be the primary role.

The second one is specific to log densities: they are mostly useful when we can query the space they are defined on (scalars, vectors, matrices, but can be arbitrary manifolds), but currently there is no good generic API for this.

True - but to make such an API, eventually, it could still be very helpful for densities to have an abstract type.

I think I am cautious about exposing abstract types in public APIs, especially where the user may be expected to define new types to represent problems, precisely because I don’t want to make this decision for the user.

You are of course right that traits are a bit more involved, especially if you want to dispatch on them (but note that you don’t need to do this always, eg for the LogDensityProblems API you can just query and check things without dispatching on traits). So I think it is worth it.
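For the record, trait-based dispatch in Julia is only slightly more verbose than subtyping; a minimal Holy-traits sketch (all names here are made up for illustration, not from LogDensityProblems):

```julia
# Minimal Holy-traits sketch (hypothetical names, not the LogDensityProblems API).
abstract type DensityOrder end
struct NoGradient  <: DensityOrder end
struct HasGradient <: DensityOrder end

# A package declares the trait for its own problem type
# without committing to any abstract supertype.
struct MyProblem end
density_order(::Type{MyProblem}) = HasGradient()

# Dispatch on the trait value instead of on a supertype.
sample(p) = sample(density_order(typeof(p)), p)
sample(::HasGradient, p) = "use gradient-based sampler"
sample(::NoGradient, p)  = "fall back to gradient-free sampler"

sample(MyProblem())  # "use gradient-based sampler"
```

The point is that `MyProblem` keeps its own (possibly unrelated) supertype, while samplers can still select methods based on the declared capability.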

The bottom line is that I consider abstract type hierarchies mostly an implementation detail, which is ideally not exposed. This of course may not be the majority opinion among Julia programmers, YMMV.

I don’t know where this belief comes from. AdvancedHMC is a major rewrite; it was simply renamed to AdvancedHMC from the NUTS package Kai worked on in 2016. According to the GitHub history, DynamicHMC is much younger (2017) and copied code from Kai, which is legitimate; we did the same with MCMCChains. But please don’t state misleading facts. Nobody is helped by that.

Thanks for the information, I was not aware of this NUTS package (in fact, I still could not find it).

Incidentally, I did not copy any code from either package — DynamicHMC is written from scratch (not even the original Hoffman & Gelman code was used, I wrote the tree building in functional style). Nevertheless, I am quite meticulous about attributing sources, so I am curious what gave you this idea — if possible, please link the relevant code here.

In turn, I am aware of AdvancedHMC.jl adapting some code from DynamicHMC.jl (eg #51 (see #17)) but of course I am perfectly fine with this (even somewhat flattered), this is how open source works.

In addition, a lot of the design in packages solving the same problem is convergent evolution.

I agree. Please try to follow your own advice, too.


It has been the same repo all along. Kai simply moved it at some point from his GitHub profile to TuringLang.

Apologies if I falsely claimed something. I remember that when we looked into your repo last year, as we linked it to Turing, your NUTS implementation was exactly the same as the one Kai wrote in 2016, including variable names and so on. But that might simply have been a coincidence.


And btw, congratulations on your 2.0 release!


I think you must have misunderstood something. I looked at the old code for AdvancedHMC, and variable names between the two packages were different from the very beginning, since I follow Betancourt’s “Conceptual Introduction” paper, with q, p instead of \theta, r, etc. Most NUTS implementations follow one or the other, though, since they are written based on the same papers.

The main difference I see between DynamicHMC and other implementations is the pervasive functional style (no mutable state) in the former. The motivation for this is that I find unit testing very difficult with a lot of mutable state (effectively, the internals of DynamicHMC are now an exercise in unit testing that has gone too far :wink:). I am not aware of other libraries, either in Julia or other languages, that implement NUTS this way (but I did not investigate).
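To illustrate the style (a toy sketch, not DynamicHMC’s actual internals): a functional leapfrog step returns a new phase point instead of mutating arrays in place, so each step can be unit-tested in isolation against known inputs:

```julia
# Toy functional leapfrog step (illustration only, not DynamicHMC's code).
# z = (q = position, p = momentum); ∇logp is the gradient of the log density.
function leapfrog(z, ∇logp, ϵ)
    p_half = z.p .+ (ϵ / 2) .* ∇logp(z.q)
    q_new  = z.q .+ ϵ .* p_half
    p_new  = p_half .+ (ϵ / 2) .* ∇logp(q_new)
    (q = q_new, p = p_new)           # a new value; the input z is untouched
end

∇logp(q) = -q                        # standard normal target
z0 = (q = [1.0], p = [0.0])
z1 = leapfrog(z0, ∇logp, 0.1)
z0.q == [1.0]                        # original state is unchanged
```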

In general I think the Julia ecosystem is at a stage when experimenting with parallel approaches is fine, even without a particular reason. In case anyone is curious about the motivation for DynamicHMC: I was working with Mamba.jl, and then had trouble coding up a custom likelihood. At that time, AFAIK there were no registered/stable packages that made this possible, and I was assuming that writing up my own NUTS implementation would be the easiest solution.

I find it nice that a lot of Julia packages are very modular, with very cooperative developers. I think that the most challenging part of NUTS & friends is having an efficient way to obtain gradients, for which we have a lot of useful approaches in Julia, so I could just build on that in DynamicHMC.
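For example, given any plain Julia function for the log density, an AD package such as ForwardDiff can supply the gradient a sampler needs (a minimal sketch, not tied to any particular package’s interface):

```julia
using ForwardDiff

# Log density of a standard multivariate normal (up to a constant).
logp(q) = -sum(abs2, q) / 2

# The gradient the sampler needs, obtained by automatic differentiation.
∇logp(q) = ForwardDiff.gradient(logp, q)

∇logp([1.0, 2.0])  # == [-1.0, -2.0]
```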

Given AD, the logic of NUTS is the relatively easy part. The other hard part is diagnostics, on the level of ShinyStan.


Is there any benchmark of DynamicHMC against Stan, Inla, AdvancedHMC, Turing…?

https://github.com/StatisticalRethinkingJulia/MCMCBenchmarks.jl

But I didn’t mean a package to do the benchmarks; I meant the results obtained.

The benchmarks you are looking for can be found in the docs.
