Chains vs iterations behavior in Turing sampler (conditional priors)

PeX · May 20, 2023, 3:47pm

I wrote a simple model that looks like this:

@model function stam()
    # Prior distributions
    op = [1.0, 2.0, 3.0, 4.0]
	α ~ DiscreteUniform(1, length(op))
	β ~ truncated(Normal(op[α] , 0.1); lower=0.01, upper=10.0)
    return nothing
end

Basically, I want each iteration of the sampler to “choose” from op the mean value of the parameter β. However when I ran it using:

model2 = stam()
chain2 = sample(model2, MH(), MCMCSerial(), 20_000, 3; progress=true)

The result is:

Is there a way to make the sampler choose a different α on each iteration? I don’t understand why it chooses a single α for each chain, instead of each iteration.

Thank you!

Christopher_Fisher · May 20, 2023, 9:07pm

There used to be an example of mixture models in the Turing tutorials, but the link no longer works. Unfortunately, I do not remember the approach they used. You could potentially use MCMCTempering.jl, which deals with multimodal distributions. I don’t know how well it is integrated with Turing however.

PeX · May 20, 2023, 11:08pm

Thank you! However, I still don’t understand what’s going on under the hood that α is not being sampled under each iteration. Any clarification on this will be useful

Dan · May 21, 2023, 12:12am

Possibly, each iteration the sampler either resamples β or α. Resampling β keeping the same α works as expected. But resampling α while keeping β has very low probability because SD 0.1 is small. Essentially, the state space is split into several weakly connected basins. A simple way to test this hypothesis is to increase SD to 0.5 and see the chains switch α values.

Christopher_Fisher · May 21, 2023, 12:31pm

I think Dan is probably right. The sampler seems to get stuck on one of the mixture components. If you are estimating parameters from data, you could marginalize the mixture components in this case. For example, the likelihood for one observation could be:

    y = 2.0

    μs = rand(Normal(0, 1), 4)

    αs = rand(Dirichlet(ones(4)))

    likelihoods = αs .* pdf.(Normal.(μs, 1), y)

You use logsumexp to change to a logpdf. Marginalizing is preferred when possible because it leads to more efficient sampling.

PeX · May 21, 2023, 3:24pm

Thanks, both of you! Very insightful comments!
Any chance you can expand the marginalization example you wrote a little? I never did that on Turing.jl and couldn’t find good examples. I’m not sure how to define the variables and the likelihood for the simple case I wrote.
Thank you!!

Christopher_Fisher · May 21, 2023, 4:12pm

No problem. I am working on a solution using MixtureModel which does not require you to handle the logpdf.

using Distributions 
using Turing
using Random 

Random.seed!(584)

# number of observations
n_obs = 100
# number of components
n_c = 4
# sample true μ for each component
μs = randn(n_c)
# true standard deviation of each component
σ = 1
# true component probabilities 
θs = rand(Dirichlet(ones(n_c)))
mixture = MixtureModel(Normal.(μs, σ), θs)
# generate some simulated data 
data = rand(mixture, n_obs)

@model function my_model(y, n_c, n_obs)
    μs ~ filldist(Normal(0, 1), n_c)
    θs ~ Dirichlet(ones(n_c))
    σ = 1.0
    y .~ MixtureModel(Normal.(μs, σ), θs)
end

chains = sample(my_model(data, n_c, n_obs), NUTS(1000, .65), MCMCThreads(), 1000, 4)

PeX · May 21, 2023, 7:06pm

Thank you!! it’s really helpful!

Christopher_Fisher · May 21, 2023, 9:02pm

No problem. The example above now works. Unfortunately, the error was quite stupid: I intended to pass data to the model, but I passed an old variable y, which was lurking in my session. I was discussing parameter recovery with someone on slack and they said that parameters are often difficult to recover in mixture models.

Evidently, the Turing website has changed. You can find some more info here:

You can also find information about identifiability here:

https://mc-stan.org/users/documentation/case-studies/identifying_mixture_models.html

Topic		Replies	Views
How to extract sampled values from a Turing model General Usage	1	329	December 27, 2022
[ANN] Turing.jl Breaking Changes Probabilistic programming	1	1394	September 12, 2019
Function for resuming mcmc chain Probabilistic programming question , turing , mcmc	13	846	September 12, 2024
Turing.jl Bayesian SDE tutorial doesn't sample posterior chain Probabilistic programming	0	98	April 6, 2024
Posterior predictive distributions in Turing General Usage	6	2831	September 29, 2022

Chains vs iterations behavior in Turing sampler (conditional priors)

Related topics