I’m testing out vectorization in Turing models. I’ve managed to get a model that too 195 seconds, down to 60 seconds. But I wonder if there is anything obvious I’m missing in order to get even further benefits?
using Turing, StatsPlots
n_chains = 8
# Control
num_errors = [15,10, 61,11, 60, 44, 63, 70, 57,11, 67, 21, 89,12, 63, 11, 96,10, 37,19, 44,18, 78, 27, 60,14]
nᶜ = [89,74,128,87,128,121,128,128,128,78,128,106,128,83,128,100,128,73,128,86,128,86,128,100,128,79]
kᶜ = nᶜ - num_errors
# ADHD
num_errors = [88, 50, 58,17, 40, 18,21, 50, 21, 69,19, 29,11, 76, 46, 36, 37, 72,27, 92,13, 39, 53, 31, 49, 57,17,10,12,21, 39, 43, 49,17, 39,13, 68, 24, 21,27, 48, 54, 41, 75, 38, 76,21, 41, 61,24, 28,21]
nᵃ = [128,128,128,86,128,117,89,128,110,128,93,107,87,128,128,113,128,128,98,128,93,116,128,116,128,128,93,86,86,96,128,128,128,86,128,78,128,111,100,95,128,128,128,128,128,128,98,127,128,93,110,96]
kᵃ = nᵃ - num_errors
Φ(x) = cdf(Normal(0, 1), x)
Original model with loops
@model function model1(kᶜ, kᵃ, nᶜ, nᵃ)
δ ~ Normal(0, 1)
μ ~ Normal(0, 1)
σ ~ Uniform(0, 10)
α = δ * σ
ϕᶜ = Vector(undef, length(kᶜ))
θᶜ = Vector(undef, length(kᶜ))
ϕᵃ = Vector(undef, length(kᵃ))
θᵃ = Vector(undef, length(kᵃ))
for i in eachindex(kᶜ)
ϕᶜ[i] ~ Normal(μ + (α / 2), σ)
θᶜ[i] = Φ(ϕᶜ[i])
kᶜ[i] ~ Binomial(nᶜ[i], θᶜ[i])
end
for j in eachindex(kᵃ)
ϕᵃ[j] ~ Normal(μ - (α / 2), σ)
θᵃ[j] = Φ(ϕᵃ[j])
kᵃ[j] ~ Binomial(nᵃ[j], θᵃ[j])
end
end
Vectorised model
@model function model2(kᶜ, kᵃ, nᶜ, nᵃ)
δ ~ Normal(0, 1)
μ ~ Normal(0, 1)
σ ~ Uniform(0, 10)
α = δ * σ
# loop over elements of kᶜ
ϕᶜ ~ filldist(Normal(μ + (α / 2), σ), length(kᶜ))
θᶜ = Φ.(ϕᶜ)
@. kᶜ ~ Binomial(nᶜ, θᶜ)
# loop over elements of kᵃ
ϕᵃ ~ filldist(Normal(μ - (α / 2), σ), length(kᵃ))
θᵃ = Φ.(ϕᵃ)
@. kᵃ ~ Binomial(nᵃ, θᵃ)
end
Timing
Timing on an 8 core iMac, with 5000 samples per chain
Original model
@time posterior_chain = sample(model1(kᶜ, kᵃ, nᶜ, nᵃ), HMC(0.01, 10), MCMCThreads(), 5_000, n_chains);
Takes about 195 seconds
Vectorized model
@time posterior_chain = sample(model2(kᶜ, kᵃ, nᶜ, nᵃ), HMC(0.01, 10), MCMCThreads(), 5_000, n_chains);
Takes about 60 seconds
Questions
- Does that vectorized model look about right, or are there obvious rooms for improvement?
Is this the correct set of heuristics?
- use
filldist
for priors - use
f.(x)
notation for deterministic variables - use
@.
notation for likelihoods
And just by way of commentary… after initial reluctance with vectorisation (I felt loops were more explicit) I’m convinced that the model specification is more concise and nicer.