Running NUTS with a categorical prior?

Hello guys.

I’m new to Turing.jl and I’m having some trouble building my model. It is supposed to be a relatively simple model. I have a forward operator called “MT” that takes 3 vectors as input (d,rho,f) and outputs “expectedrho”. I am trying to do bayesian inference over the vector d. When I choose d to be a Uniform distribution, I can sample the posterior using NUTS for example. However, to make the code more efficient, I would like to pre-define the possible states of d. The closer that I’ve got is using a Caterorical Uniform distribution (check code below), but this code only runs with Particle Gibbs and MH. All the hamiltonian samplers fail.

Does anyone have experience sampling from categorical distributions? My problem comes down to: “The value can be anywhere between 1 and 5000, but for computational efficiency, only try with the integers”.

@model function mt_bayes_grandis(rhoref,d)
    #Prior distribution of the depth for the first layer
    f=10 .^range(-3,3,300)
    rho ~ filldist(Categorical(ones(5000) / 5000),k)
    expectedrho,expectedphi = MT_mod_wait_bayes_cells(d,rho,f)
    for i in eachindex(rhoref)
        rhoref[i] ~ Normal(expectedrho[i],10)
    return d

Thanks in advance for any comment you can share

Hamiltonian monte Carlo/NUTS requires continuous distributions, so this is not actually efficient.

No, HMC requires continuous parameters.

You can use discrete distributions in Turing.jl.
Check this tutorial on Categorical Regression: Bayesian Ordinal Regression

1 Like

Sorry I should have been more precise you’re absolutely right the parameters require continuous measures.

However in this case the OP is defining a parameter rho to have a categorical distribution and that’s why it won’t NUTS.

Oh I see, yes this won’t run in HMC/NUTS.
Either use another MCMC Sampler or marginalize out the discrete parameters, see Chapter 7 of the Stan’s User Guide.

I think actually it’s a misconception on the part of the OP that leads him to use the categorical parameter. He says he can get it to run using a Uniform continuous parameter, which makes sense to me. Then “to make the code more efficient” he wants to limit the prior to a discrete set of integers. This simply doesn’t make the code more efficient. so if there isn’t a logical reason that the parameter has to be discrete, it should just remain continuous.

1 Like

Thanks for your comments. My field of expertise is not computer science so I might use wrong words to explain myself.

The logic behind making the parameter discrete is that the value I am trying to invert can take any real value between 1 and 5000, however if I restrict the possible number of states to just integers, I would have only 5000 possible states, with little to no change change in the output of my forward operator.

But I don’t seem to find how.

If the number can be any possible real value, you should be using floats instead of integers. Using real numbers is almost always more computationally efficient than limiting yourself to integers, because you can’t take derivatives on the integers.

Sampling on discrete spaces is usually not very efficient boiling down to an essentially exhaustive combinatorial search. Well-behaved densities over continuous spaces often allow much more efficient methods, e.g., exploiting differential structure such as HMC. It’s a misconception that searching/sampling over a “smaller” space is more efficient, especially when the larger space has additional structure that is beneficial.

1 Like