Draw from density in DynamicHMC ecosystem

DynamicHMC.jl (from @Tamas_Papp) and related packages take a log-density callable and sample parameter values with it. One feature I’ve seen in some other programs (Stan, Turing) is the ability to simulate data given a parameter vector or samples of parameter vectors for the purpose of “predictive checks”. Is there a convenient way to do this using the DHMC ecosystem? The examples I’m seeing are all getting parameter values from data rather than data from parameters.

DynamicHMC does not care whether your log density is a (log) prior, posterior, or something else; it just samples from it. So you can code up the log prior and the log likelihood separately, and implement the posterior as the sum of the two.

For example:

function log_prior(params)
    ...
end

function log_likelihood(params)
    ...
end

log_posterior(params) = log_prior(params) + log_likelihood(params)

This example is conceptual; let me know if you need a more detailed one with parameter transformations, model specification, etc.

That separation is nice. I think my question is about a different issue, though; I'll try to explain.

The output I'm looking for here is not parameter estimates but generated data. Given a prior distribution, I want to draw parameters from it and generate data from the likelihood, to see whether the prior produces data that looks similar enough to my real data, which would indicate that the prior is sensible. Basically rand(my_likelihood).

The docs’ worked example specifies a likelihood function and can estimate parameter values

sample :: likelihood -> data -> params

I’m wondering if there is a way to go the other way

rand :: likelihood -> params -> data

or if I need to write an RNG/sampler for my likelihood manually.

Sorry, I misunderstood then.

No, DynamicHMC does not have facilities for simulation; it takes the posterior as a black box. You will either need to use other packages that walk the DAG, or just implement the simulation code yourself. Typically, that is much less algorithmically involved than MCMC, since most models are built from distributions for which efficient RNGs exist (e.g. normal).
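
For a simple normal model, for example, such a simulation is just a few lines with Distributions.jl. The priors and the simulate_one helper below are illustrative only, not anything provided by DynamicHMC:

using Distributions, Random

# Draw parameters from the prior, then data from the likelihood given those parameters.
function simulate_one(rng, n)
    μ = rand(rng, Normal(0, 10))
    σ = rand(rng, Exponential(1))
    rand(rng, Normal(μ, σ), n)
end

# A handful of prior predictive datasets to compare against the real data.
fake_datasets = [simulate_one(Random.default_rng(), 100) for _ in 1:20]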

Well, in principle you could use it to sample from different (conditional) densities. Here is a simple example:

# Assumes Distributions, DynamicHMC, Random, TransformVariables, and the packages
# providing TransformedLogDensity and ADgradient are already loaded.
function sampleit(prob, trans)
    P = TransformedLogDensity(trans, prob)   # log density on the transformed (unconstrained) space
    ∇P = ADgradient(:ForwardDiff, P)         # gradient via ForwardDiff
    mcmc_with_warmup(Random.default_rng(), ∇P, 1000, reporter = NoProgressReport())
end

# My problem, i.e., log p(x, μ, σ)
function logp(xs, μ, σ)
    logprior = logpdf(Normal(0, 10), μ) + logpdf(truncated(TDist(3); lower = 0), σ)
    loglikelihood = sum(logpdf.(Normal(μ, σ), xs))
    logprior + loglikelihood
end

# Some demo data
data = rand(Normal(3.2, 0.23), 100)

# Posterior sample p(μ, σ | x)
sampleit(θ -> logp(data, θ.μ, θ.σ), as((μ = asℝ, σ = asℝ₊)))

# Sampling distribution p(x | μ, σ)
sampleit(x -> logp(x, -1.2, 0.8), as(Vector, 10))

# Fancy conditional p(x', μ | x, σ)
sampleit(z -> logp(vcat(data, z.y), z.μ, 0.8), as((y = as(Vector, 2), μ = asℝ)))

I’m a huge fan of TransformVariables, LogDensityProblems, DynamicHMC, etc., as they are so composable and integrate easily with Optim, Flux, and the like.

On the other hand, as @Tamas_Papp has already said, there are more efficient ways to sample from p(x | μ, σ) or other simple conditional distributions. HMC is mainly useful to sample from the posterior when no (analytic) alternative is available. Furthermore, HMC does not handle discrete parameters.
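
As a concrete illustration of the simpler route for predictive checks: once you have posterior draws from sampleit above, you can simulate replicated data directly with rand. The sketch below assumes a recent DynamicHMC where the draws live in results.posterior_matrix (older 3.x versions return them as results.chain, a vector of vectors); transform is from TransformVariables, and logp and data are the ones defined above:

trans = as((μ = asℝ, σ = asℝ₊))
results = sampleit(θ -> logp(data, θ.μ, θ.σ), trans)

# NOTE: older DynamicHMC versions store the draws in results.chain instead of results.posterior_matrix.
# Map each unconstrained draw back to a NamedTuple (μ = ..., σ = ...),
# then draw one replicated dataset per posterior draw.
posterior = [transform(trans, x) for x in eachcol(results.posterior_matrix)]
y_rep = [rand(Normal(θ.μ, θ.σ), length(data)) for θ in posterior]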

Cool, that does exactly what I want, for small n. As you suggest, it seems to be too slow to use for n > 500 or so on that example.

Well, once you have more than a handful of parameters to sample, you want to switch to a reverse-mode AD backend instead of ForwardDiff. Fortunately, that is easy to do as well (just load ReverseDiff or Enzyme first so the corresponding backend is available):

function sampleit(prob, trans, ad; kwargs...)
    P = TransformedLogDensity(trans, prob)
    ∇P = ADgradient(ad, P; kwargs...)
    mcmc_with_warmup(Random.default_rng(), ∇P, 1000, reporter = NoProgressReport())
end

julia> using BenchmarkTools

julia> function demo(n)
           μ, σ = -1.2, 0.3
           fun(x) = logp(x, μ, σ)
           trans = as(Vector, n)
           @btime sampleit($fun, $trans, :ForwardDiff)
           @btime sampleit($fun, $trans, :ReverseDiff; compile = Val(true))
           @btime sampleit($fun, $trans, :Enzyme)
           @btime rand(Normal($μ, $σ), $n)
           :done
       end
demo (generic function with 1 method)

julia> demo(2)
  16.583 ms (238853 allocations: 11.03 MiB)
  12.809 ms (188462 allocations: 8.35 MiB)
  6.729 ms (253801 allocations: 10.90 MiB)
  34.819 ns (2 allocations: 80 bytes)
:done

julia> demo(20)
  102.721 ms (612501 allocations: 207.57 MiB)
  78.485 ms (426749 allocations: 44.70 MiB)
  28.214 ms (568791 allocations: 60.98 MiB)
  139.572 ns (2 allocations: 224 bytes)
:done

julia> demo(200)
  11.952 s (4924248 allocations: 26.96 GiB)
  1.461 s (863422 allocations: 570.96 MiB)
  392.569 ms (1106730 allocations: 797.97 MiB)
  580.409 ns (2 allocations: 1.62 KiB)
:done

Enzyme in particular is often really fast. But no matter what, HMC cannot compete with simply drawing from a normal distribution in such a simple case.