I just discovered that nesting several submodels in Turing.jl seems to considerably slow down sampling (I only tried Mooncake.jl so far).
The code below shows this with the eight schools dataset. I’ve deliberately made the sampling more expensive than it has to be to better show the speed discrepancy.
On my machine, the non-nested-submodel variant comes in at 4-5s, while the nested-submodel variant takes 30-35s. Not using any submodels is on par with the non-nested version.
I ran these on four threads, and took timings on the second (and later) runs.
Is this expected? I wanted to ask here before opening an issue against Turing.jl (maybe @penelopeysm knows?). I briefly searched Turing.jl and Mooncake.jl issues but found nothing related. Apologies if I missed something!
The example here is quite contrived of course, but in my actual modelling scenario the nested submodels would be a nice simplification.
# Example adapted from
# https://gist.github.com/penelopeysm/5656697ea20c94d80a285f5f6a69b8ab
using Turing, Mooncake, LinearAlgebra
# Eight schools data
J = 8
y = [28, 8, -3, 7, -1, 1, 18, 12]
sigma = [15, 10, 16, 11, 9, 11, 10, 18]
# Submodels
@model function tau_prior()
p ~ truncated(Cauchy(0, 5); lower = 0)
return (; p)
end
@model function mu_prior()
p ~ Normal(0, 5)
return (; p)
end
@model function z_prior(J)
p ~ MvNormal(zeros(J), I)
return (; p)
end
@model function theta_prior(J)
mu ~ to_submodel(mu_prior())
tau ~ to_submodel(tau_prior())
z ~ to_submodel(z_prior(J))
p := z.p .* tau.p .+ mu.p
return (; p)
end
# Nested submodel
@model function esc_subm_nested(J, y, sigma)
theta ~ to_submodel(theta_prior(J))
for i in 1:J
y[i] ~ Normal(theta.p[i], sigma[i])
end
end
# Same sampler for both models, deliberately high adapt_delta
sampler = NUTS(10000, 0.99; adtype=AutoMooncake())
model1 = esc_subm_nested(J, y, sigma)
chn1 = sample(model1, sampler, MCMCThreads(), 5000, 8)
# Single submodel
@model function esc_subm(J, y, sigma)
mu ~ to_submodel(mu_prior())
tau ~ to_submodel(tau_prior())
z ~ to_submodel(z_prior(J))
theta := z.p .* tau.p .+ mu.p
for i in 1:J
y[i] ~ Normal(theta[i], sigma[i])
end
end
model2 = esc_subm(J, y, sigma)
chn2 = sample(model2, sampler, MCMCThreads(), 5000, 8)
# No submodel
@model function esc(J, y, sigma)
mu ~ Normal(0, 5)
tau ~ truncated(Cauchy(0, 5); lower = 0)
z ~ MvNormal(zeros(J), I)
theta := z .* tau .+ mu
for i in 1:J
y[i] ~ Normal(theta[i], sigma[i])
end
end
model3 = esc(J, y, sigma)
chn3 = sample(model3, sampler, MCMCThreads(), 5000, 8)
These are the package versions used above:
(nested_submodels_mooncake) pkg> st
Status `~/Documents/Snippets/nested_submodels_mooncake/Project.toml`
[da2b9cff] Mooncake v0.5.32
[fce5fe82] Turing v0.45.0
[37e2e46d] LinearAlgebra v1.12.0