I’m trying to get a bit more into probabilistic programming with Turing.jl. I want to build a simple hypergeometric model. Imagine you have a large box with
Nred red and
Nblue blue balls (makes
N = Nred + Nblue balls in total) and you draw
n=10 balls out of this box (four times) and get (3, 7), (4, 6), (4, 6) and (5, 5) red and blue balls in the samples. I want to sample the 3d posterior of (N, Nred, Nblue) (2d effectively, due to the constraint), given this data.
Please see my attempt of a model specification in Turing below. Unfortunately I get an error, because sometimes Nred is sampled >N, although it should be upper-bounded by N, leading to Nblue<0. Nblue<0 then causes the error in the hypergeometric distribution. These are typical outcomes of the print statements:
new: 83 26 57 new: 89 12 77 new: 11 18 -7
Does anyone know why this goes wrong?
using Turing using StatsPlots @model hypergeo(nred) = begin # fixed parameters n = 10 # number of balls in the drawn sample # priors N ~ DiscreteUniform(n, 100) # balls in the total population Nred ~ DiscreteUniform(0, N) # red balls in the total population Nblue = N - Nred # blue balls in the total population println("new:") println(N) println(Nred) println(Nblue) # observations @. nred ~ Hypergeometric(Nred, Nblue, n) end # run sampler, collect results chn = sample(hypergeo([3,4,4,5]), MH(), 5000) # summarise results describe(chn) # plot results p = plot(chn)