NLSolversBase is perhaps the one that is used most widely… with [13 direct dependents] […] Personally I find the interface a bit too convoluted
I fully agree with you, that’s why I wanted to go for something very simple.
As for buffers - I guess that approach isn’t so popular at the moment anyhow, because Zygote doesn’t support array mutation? The gradient may become part of a larger computation after all, so AD would still come into play there.
Making a common API for objective functions would require that we find some common ground though
I think your and Chris’ idea of having access functions - especially the three for value, gradient and aux data that you were also planning - would make for nice common ground. It’s actually very similar to what I recently started to implement in BAT.jl.
My suggestion to use NamedTuples (I think the access functions should still support them) was motivated by the fact that we wouldn’t even need a central package. But Chris is right, functions should be at the center.
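Just to make that concrete, here’s a rough sketch of what I’m picturing (all names below are made up, not an existing package):

```julia
# Minimal sketch of the accessor idea (hypothetical names, not any existing
# package). Plain functions stay at the center; the three accessors are the
# common ground that packages would call or add methods to.

density_logval(f, x)   = f(x)       # a plain function *is* its log-density
density_gradient(f, x) = nothing    # nothing => not provided, fall back to AD
density_auxinfo(f, x)  = nothing    # no auxiliary information by default

# A target that can do better just adds methods, e.g. a hypothetical
# wrapper that carries a hand-written gradient. (A NamedTuple-returning
# target could be supported the same way, with accessors that pick fields.)
struct WithGradient{F,G}
    f::F
    g::G
end
density_logval(d::WithGradient, x)   = d.f(x)
density_gradient(d::WithGradient, x) = d.g(x)

# usage:
loglik(x) = -sum(abs2, x) / 2
density_logval(loglik, [1.0, 2.0])       # -2.5, no gradient provided
d = WithGradient(loglik, x -> -x)
density_gradient(d, [1.0, 2.0])          # [-1.0, -2.0]
```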
And of course you’re right, it would only work if package authors are willing to adopt it.
at this stage of the ecosystem’s development I am not entirely sure this is a reasonable goal.
Well, don’t fault me for trying.
authors of Turing.jl wrote AdvancedHMC.jl as a backend for use in Turing.jl, while DynamicHMC.jl wants users to code their own log posteriors
Well, we had this discussion a while ago (ANN: DynamicHMC 2.0 - #14 by oschulz) - from what I understood, AdvancedHMC is definitely also meant to be used standalone.
It works fine that way: we actually added AdvancedHMC as a sampling backend to BAT.jl recently (BAT is about providing comfort for users who bring their own likelihood, so it’s very different from Turing), and I was thinking about adding DynamicHMC too. I can’t speak for the Turing authors, of course, but the API of AdvancedHMC is nicely decoupled from Turing itself.
Again, what I am not convinced about is the use case of switching between MCMC, optimization
Well, I often want to sample a posterior and then use an optimizer to find the global mode (e.g. starting from the max-posterior sample). Not to report MAP as the primary result - we usually absolutely want the credible region - but we want to know where the mode is, too. So here I want to run sampling and optimization on exactly the same target function. And then I may want to integrate the same function (its exp, since the posterior returns its log, of course) to get the evidence. So I run all three - sampling, optimization and integration - on the same function. Some, all, or none of them may need a gradient, and I may or may not want to use AD.
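To illustrate with something self-contained (the crude algorithms below are just stand-ins, not any package’s actual API), it really is the same plain function being handed to all three tasks:

```julia
# Toy version of the "one target function, three tasks" workflow, Base Julia only.

logposterior(x) = -x^2 / 2 - log(sqrt(2π))   # plain user function (1-d normal)

# 1. Sampling: a minimal Metropolis random walk over the same function.
function metropolis(logp, x0, n; scale = 1.0)
    xs = Vector{Float64}(undef, n)
    x = x0
    for i in 1:n
        y = x + scale * randn()
        if log(rand()) < logp(y) - logp(x)
            x = y
        end
        xs[i] = x
    end
    return xs
end

# 2. Optimization: refine the mode, starting from the best sample.
function refine_mode(logp, x0; steps = (0.1, 0.01, 0.001), tries = 200)
    best = x0
    for s in steps, _ in 1:tries
        cand = best + s * (rand() - 0.5)
        logp(cand) > logp(best) && (best = cand)
    end
    return best
end

# 3. Integration: evidence via a simple Riemann sum over exp(logposterior).
function integrate_density(logp; lo = -8.0, hi = 8.0, n = 2001)
    grid = range(lo, hi; length = n)
    return sum(exp.(logp.(grid))) * step(grid)
end

samples  = metropolis(logposterior, 0.0, 10_000)
mode     = refine_mode(logposterior, samples[argmax(logposterior.(samples))])
evidence = integrate_density(logposterior)   # ≈ 1 for a normalized density
```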
And sometimes, I may want to use more than one sampler (provided more than one can handle the problem) to sample a posterior, to ensure I can trust a result (before publication).
From a user’s point of view, it would be really nice to just write a plain function, and to have some standard mechanism to declare that it returns a log-space value and to optionally attach a gradient and aux info - all in a standardized, simple return value/structure. One shouldn’t have to implement custom methods, define custom types, etc. just to define a likelihood function (as one example). Of course complex cases may require something beyond that, but for many use cases it would be enough.
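Something along these lines is what I’d imagine the user-facing side looking like (again purely hypothetical, just a sketch of the return-value convention):

```julia
# Hypothetical convention: the user writes a plain function, and the
# NamedTuple it returns declares everything - log-space value, optional
# gradient, optional aux info.

# value only - gradient/aux simply absent, so the caller may use AD:
loglik_simple(x) = (logval = -sum(abs2, x) / 2,)

# with a hand-written gradient and some auxiliary diagnostics attached:
function loglik_full(x)
    lv = -sum(abs2, x) / 2
    return (logval = lv, gradient = -x, aux = (neval = 1,))
end

# a consumer would only need to check which fields are present:
r = loglik_full([1.0, 2.0])
r.logval                                       # -2.5
haskey(r, :gradient) ? r.gradient : nothing    # [-1.0, -2.0]
```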
I just had a feeling that maybe the time was right for a lightweight API or conventions for density/loss/target-functions, since they are such a central concept. But maybe the time isn’t right just yet?