Hi everyone,
What is the recommended way to extract the log density (and its gradient) from a Turing model (DynamicPPL)?
Use cases: e.g., feeding it to something like Pathfinder.jl (if it didn't already have a Turing integration) or to novel samplers like ZigZagBoomerang.jl.
I have gone through the Turing tutorials, docs, and Discourse, but either there is no reference or the answers are outdated. I'm hoping someone can say "This is the way!"
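For concreteness, what I'm ultimately after is a pair of plain functions over a flat parameter vector, roughly like the sketch below (hypothetical; logp / ∇logp are just placeholders for whatever the recommended API provides):
# Desired interface (hypothetical): plain functions of a flat parameter vector x
# logp(x)  -> scalar log density of the conditioned model
# ∇logp(x) -> gradient vector of the same length as x
# so that an external sampler/optimizer can take, e.g., a gradient step:
# x_new = x .+ 0.01 .* ∇logp(x)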
Below are the best options I have found, but the trade-offs between them aren't clear to me:
#1 From Discourse via Turing.gradient_logp – Removed
## from https://discourse.julialang.org/t/gradient-of-latent-space-variable-in-turing/60941/2?u=svilupp
∇logp(x) = Turing.gradient_logp(Turing.ZygoteAD(), x, Turing.VarInfo(cond_model), cond_model) # not working
# removed in Aug-22, see https://github.com/TuringLang/Turing.jl/pull/1877
#2 À la TuringBenchmarking.jl with LogDensityProblems
# Code from Tor's benchmarking suite: https://github.com/torfjelde/TuringBenchmarking.jl
ad_backend = ForwardDiffAD{40}()
vi_orig = DynamicPPL.VarInfo(cond_model)
spl = DynamicPPL.SampleFromPrior()
vi_current = DynamicPPL.VarInfo(vi_orig, spl, vi_orig[spl])  # VarInfo holding the current parameter values
# DynamicPPL.link!(vi_current, spl)  # unclear when this should be called?
f = LogDensityProblems.ADgradient(
    ad_backend,
    Turing.LogDensityFunction(vi_current, cond_model, spl, DynamicPPL.DefaultContext()),
)
theta = vi_current[spl]
logp(x) = LogDensityProblems.logdensity_and_gradient(f, x)[1]  # computes the gradient too; LogDensityProblems.logdensity(f, x) would skip it
∇logp(x) = LogDensityProblems.logdensity_and_gradient(f, x)[2]
@info "LogDensityFunction way:" logp(theta) ∇logp(theta)
#3 Via a SciML OptimizationProblem
# From the Pathfinder.jl code base: https://github.com/mlcolab/Pathfinder.jl/blob/main/src/optimize.jl and https://github.com/mlcolab/Pathfinder.jl/blob/main/src/integration/turing.jl
# requires the AbstractDifferentiation package
prob = Turing.optim_problem(cond_model, Turing.MAP(); constrained=false, init_theta=theta)
# prob.prob is the underlying SciML OptimizationProblem; constrained=false means it works in unconstrained space
ad_backend = AD.ForwardDiffBackend()
logp(x) = -prob.prob.f.f(x, nothing)  # negate: the optimization problem minimises the negative log density
∇logp(x) = only(AD.gradient(ad_backend, logp, x))
@info "OptimizationProblem way:" logp(theta) ∇logp(theta)
Set up code
using Random
using Turing
using LazyArrays
using LogDensityProblems
using Turing.Essential: ForwardDiffAD
using AbstractDifferentiation # needed for opt #3
# Generate mock data
X = hcat(ones(200), randn(200, 3))
betas = randn(4)
@assert length(betas) == size(X, 2)
y = (X * betas) .+ randn(200)
# create a toy model and condition it
@model function linear_model(X)
    dim_X = size(X, 2)
    betas ~ filldist(Normal(), dim_X)
    sigma ~ Exponential(1)
    y ~ arraydist(LazyArray(@~ Normal.(X * betas, sigma)))
end;
cond_model = linear_model(X) | (; y);
theta = ones(5)  # initial parameter vector: 4 betas + 1 sigma (sigma must be positive in constrained space)