Problem with a hypergeometric Turing model

Hi all,

I’m trying to get a bit more into probabilistic programming with Turing.jl. I want to build a simple hypergeometric model. Imagine you have a large box with Nred red and Nblue blue balls (makes N = Nred + Nblue balls in total) and you draw n=10 balls out of this box (four times) and get (3, 7), (4, 6), (4, 6) and (5, 5) red and blue balls in the samples. I want to sample the 3d posterior of (N, Nred, Nblue) (2d effectively, due to the constraint), given this data.

Please see my attempt of a model specification in Turing below. Unfortunately I get an error, because sometimes Nred is sampled >N, although it should be upper-bounded by N, leading to Nblue<0. Nblue<0 then causes the error in the hypergeometric distribution. These are typical outcomes of the print statements:

new:
83
26
57
new:
89
12
77
new:
11
18
-7

Does anyone know why this goes wrong?

Turing model:

using Turing
using StatsPlots

@model hypergeo(nred) = begin
    # fixed parameters
    n = 10 # number of balls in the drawn sample

    # priors
    N ~ DiscreteUniform(n, 100) # balls in the total population
    Nred ~ DiscreteUniform(0, N) # red balls in the total population
    Nblue = N - Nred # blue balls in the total population

    println("new:")
    println(N)
    println(Nred)
    println(Nblue)

    # observations
    @. nred ~ Hypergeometric(Nred, Nblue, n)
end

#  run sampler, collect results
chn = sample(hypergeo([3,4,4,5]), MH(), 5000)

# summarise results
describe(chn)

# plot results
p = plot(chn)

Hi!

Your problem seems similar to https://github.com/TuringLang/Turing.jl/issues/1270. Turing currently doesn’t support stochastic support for prior distributions but there is a neat workaround using a custom distribution like the example shown here https://github.com/TuringLang/Turing.jl/issues/1270#issuecomment-627400309. In your case, the custom distribution will be a subtype of DiscreteMultivariateDistribution and you won’t need a bijector. This should probably work.

using Random, Distributions

struct MyDist <: DiscreteMultivariateDistribution
    n::Int
end
function Base.rand(rng::Random.AbstractRNG, d::MyDist)
    N = rand(rng, DiscreteUniform(d.n, 100))
    Nred = rand(rng, DiscreteUniform(0, N))
    return [N, Nred]
end
function Distributions.logpdf(d::MyDist, x::Vector{Int})
    return logpdf(DiscreteUniform(d.n, 100), x[1]) + logpdf(DiscreteUniform(0, x[1]), x[2])
end
1 Like

thanks for your answer! But I’m a bit confused now. If I adapt the “Hello World” example from Turing a bit like this:

# Define a simple Normal model with unknown mean and variance.
@model gdemo(x, y) = begin
    s ~ InverseGamma(2, 3)
    m ~ Normal(s, 0.01)

    println("new:")
    println(s)
    println(m)
    
    x ~ Normal(m, sqrt(s))
    y ~ Normal(m, sqrt(s))
end

#  Run sampler, collect results
chn = sample(gdemo(1.5, 2), MH(), 5) # just 5 iteration for nice print output

I get the following output:

new:
2.326687447393811
2.330099688761129
new:
1.284186752463607
2.327211558566936
new:
0.9287803374710428
2.320031712933687
new:
2.334071342743244
2.3090054849030874
new:
0.8360725845595213
2.3227973835651685
new:
2.323124161080269
2.322218261627838

So basically m is sampled from a distribution with s's first realisation (2.326687447393811, i.e. m~N(2.326687447393811, 0.01) ) or what?

So generally, Turing does currently not support priors that depend on any other random variable?

No, prior distributions can depend on other random variables so long as the support of the distribution doesn’t. In your case, the support of the second distribution, i.e. the values that the distribution can sample, depends on the first random variable. In the tutorial, the support of the normal distriution [-Inf, Inf] doesn’t change when s changes.

Edit: the support of DiscreteUniform(0, N) is the integers between 0 and N.

yes, but what confuses me that even in my second example—where the support does not change since m ~ Normal(s, 0.01) will always have support [-Inf, Inf]—I think the samples for m and s still look weird.

In the second iteration we have

s = 1.284186752463607
m = 2.327211558566936

so s changes, but m still seems to be sampled from N(2.326687447393811, 0.01) (with the first realisation of s as the mean).

Well, you are not printing the samples for each iteration here. You print the draws from the proposal distribution. You might want to look at the returned chain object instead.

1 Like

Also can’t you write your initial problem in terms of Bernoulli draws?

Alright, got it! When I looked at chn.value, m seems to “adapt” to s as you would expect. Thanks.

So as a conclusion, I will then just keep this in mind when I’m effecting the support of my priors.

You mean modelling this situation with replacement (binomial/Bernoulli) instead of without replacement (hypergeometric)?

For this I am not so sure because I’m actually interested in these quantities in the total population like N, Nred, and without replacement might be a more appropriate model in my specific case.

edit:

But thanks a lot so far. I will think about this and also try to understand these custom distr approaches.

All you need is:

N_Nred ~ MyDist(n)
N = N_Nred[1]
Nred = N_Nred[2]