A few things. First, your Stan and Turing models aren't the same: it looks like you're using different priors for the sigmas and the rhos than in the Stan model.
Then, as pointed out by @michielver, the loops can be eliminated by using `filldist`. Additionally, the computation of `logit_p` allocates 4 vectors, but it can be done with a single allocation.
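To see what `filldist` buys you outside of any model: a loop of scalar `~` statements evaluates one univariate distribution per element, while `filldist(d, n)` builds a single n-dimensional distribution with iid marginals `d`. A minimal sketch (using only `Turing`'s exported `filldist` and `Distributions`):

```julia
using Turing, Distributions

# One 4-dimensional distribution with iid Exponential(1) marginals,
# replacing four separate scalar `~` statements inside a model.
d = filldist(Exponential(1), 4)
x = rand(d)

# Its log density is the sum of the marginal log densities:
logpdf(d, x) ≈ sum(logpdf.(Exponential(1), x))
```

Fewer `~` statements also means fewer round trips through the AD machinery, which is where most of the savings come from.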
Lastly, you're passing a higher `adapt_delta` of 0.95 to the Turing model; higher values of `adapt_delta` force smaller step sizes and therefore slower exploration. A value that high is really conservative and usually only necessary when diagnosing sampling problems. Turing defaults to 0.65, which explores faster but at the risk of more divergences, while Stan defaults to 0.8. Probably anything between 0.7 and 0.8 is fine.
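For reference, in Turing the target acceptance rate (Stan's `adapt_delta`) is the second positional argument of `NUTS`, with the number of adaptation steps first:

```julia
using Turing

# Second positional argument of NUTS = target acceptance rate (adapt_delta).
NUTS(1_000, 0.95)  # very conservative: small steps, slow exploration
NUTS(1_000, 0.8)   # Stan's default target; a reasonable middle ground
NUTS(1_000, 0.65)  # Turing's default: faster, but more divergence risk
```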
I tested a few variants, and the example below, which incorporates the above suggestions, runs on my old-ish machine in ~15 minutes. That's still slower than Stan for this example, but within the same order of magnitude.
```julia
using CSV, TuringModels, DataFrames, Turing, LinearAlgebra, Random, ReverseDiff, Memoization

Turing.setadbackend(:reversediff)
Turing.setrdcache(true)

data_path = joinpath(TuringModels.project_root, "data", "chimpanzees.csv");
df = CSV.read(data_path, DataFrame; delim=';');
df.block_id = df.block;
df.treatment = 1 .+ df.prosoc_left .+ 2 * df.condition;

quad_form_diag(M, v) = Symmetric((v .* v') .* (M .+ M') ./ 2)

@model function gdemo(actor, block, treatment, pulled_left)
    σ_block ~ filldist(Exponential(1), 4)
    σ_actor ~ filldist(Exponential(1), 4)
    ρ_block ~ LKJ(4, 4)
    ρ_actor ~ LKJ(4, 4)
    g ~ MvNormal(4, 1)
    Σ_block = quad_form_diag(ρ_block, σ_block)
    Σ_actor = quad_form_diag(ρ_actor, σ_actor)
    β ~ filldist(MvNormal(Σ_block), 6)
    α ~ filldist(MvNormal(Σ_actor), 7)
    logit_p = broadcast(treatment, actor, block) do t, a, b
        return @inbounds g[t] + α[t, a] + β[t, b]
    end
    pulled_left .~ BinomialLogit.(1, logit_p)
end

rng = MersenneTwister(3702)
@time chns = sample(
    rng,
    gdemo(df.actor, df.block_id, df.treatment, df.pulled_left),
    NUTS(1_000, 0.9),
    1_000,
);
```
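Whatever target acceptance you settle on, it's worth confirming the faster settings didn't hurt sampling quality. A sketch using a hypothetical toy model (standing in for the chimpanzee model above) and the sampler statistics Turing stores in the chain:

```julia
using Turing, Random

# Hypothetical toy model, just to have a chain to inspect.
@model function coinflip(y)
    p ~ Beta(1, 1)
    y .~ Bernoulli(p)
end

chn = sample(MersenneTwister(1), coinflip(rand(Bool, 50)), NUTS(200, 0.8), 200; progress=false)

sum(chn[:numerical_error])  # count of divergent transitions; ideally ~0
describe(chn)               # effective sample size and rhat per parameter
```

If divergences climb or ess drops after lowering the target acceptance, that's the signal to nudge it back up.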