Trouble translating a simple PyMC3 model to Turing

I keep getting a segfault:


signal (11): Segmentation fault
in expression starting at REPL[6]:1
getproperty at ./task.jl:170 [inlined]
produce at /home/dlakelan/.julia/packages/Libtask/8YyQv/src/ctask.jl:154
#1 at /home/dlakelan/.julia/packages/AdvancedPS/31ABw/src/container.jl:12 [inlined]
#8 at /home/dlakelan/.julia/packages/Libtask/8YyQv/src/ctask.jl:85
unknown function (ip: 0x7ffb9cc85dbf)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2245 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2427
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1787 [inlined]
start_task at /buildworker/worker/package_linux64/build/src/task.c:878
Allocations: 120671065 (Pool: 120653739; Big: 17326); GC: 188
Segmentation fault

I’m running some version of 1.7; I should maybe update it to a more recent one.

Still getting a segfault with 1.7.0-rc3 and the latest Turing etc. Seems to be a Libtask issue, ugh!

Turing doesn’t work with 1.7.0 yet, unfortunately.

Well, “doesn’t work” seems a bit of an overstatement. I’ve been using it successfully for months, but yeah, it seems like some functionality isn’t 100%.

It’s not just that – Turing will behave erratically and force-downgrade to version 0.15 IIRC. (Unfortunately, 0.15.0 did not set an upper bound on the newest possible version of Julia, and this can’t really be fixed.)

In reality, I’m kind of shocked it’s working at all, and this might explain some of your bugs. 1.6.4 is the newest non-beta version of Julia, and the newest version supported by Turing. (We hope to fix this in the future by removing our dependence on Libtask, which is tied to specific Julia versions.)

Gotcha. Thanks for that warning. Unfortunately 1.6 is unusable on my computer due to the old way of doing Registry handling. I pinned Libtask to #master and Turing to 0.19.0 and that allowed the install, it runs and gives meaningful inference in many cases… but I’ll be sure not to do anything important and final until a later version is available.
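For reference, the pinning described above can be done roughly like this (a sketch using the standard Pkg API; the branch and version are just the ones mentioned in this thread):

```julia
using Pkg

# track Libtask's master branch instead of a registered release
Pkg.add(PackageSpec(name="Libtask", rev="master"))

# pin Turing at 0.19.0 so the resolver doesn't force-downgrade it
Pkg.pin(PackageSpec(name="Turing", version="0.19.0"))
```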

What exactly does Libtask do for Turing? And what would it take to remove that dependency?

Libtask is necessary for task-copying, which we use in our SMC/Particle Gibbs implementations; this does explain why PG+SMC are broken. If Turing is pinned to 0.19.0, any non-particle samplers should work.

If SMC/PG are broken for you, you can try Metropolis-Hastings, which should work better.

That explains it then; this is the first time I’ve used PG or SMC. Most of what I’ve done is NUTS with a little MH thrown in.

How does MH handle discrete parameters?

Ok, on the basis of this discussion and some searching of the Turing manual pages, it seems like this should work:

trace9 = sample(model9(k01, k10, k11), Gibbs(NUTS(1000,.8,:p0,:p1),MH(:N => RandomWalkProposal(DiscreteUniform(-2,+2)))),1000; init_params=[.6,.2,134],thinning=100)

and yet it’s stuck and doesn’t move at all. Am I doing something wrong? @ParadaCarleton

First, I’d disable thinning; thinning only helps if you’re constrained by memory requirements, and I doubt you have a chain that’s long enough that you can’t store it in your computer’s memory. This is also some very aggressive thinning; taking only 1 out of every 100 samples is pretty crazy. This should only leave you with 10 samples at the end.

Also, you should probably be using RWMH (random-walk Metropolis-Hastings) and not MH.

That doesn’t seem to be the way thinning works: it runs 100000 samples and takes every 100th one to give me a chain with 1000 samples. The goal was to have it spend a lot of time trying to actually do something useful, but not have to see 100k samples of nothing. In the end, I got 1000 samples of nothing; it never moved.

On the other hand, this does work:


@model function test2(Nobs)
    p ~ Beta(10,10)
    N ~ DiscreteUniform(Nobs,200)
    Nobs  ~ Binomial(N,p)
end

ch = sample(test2(22),Gibbs(NUTS(1000,.8,:p),MH(:N => RandomWalkProposal(DiscreteUniform(-1,1)))),1000;init_params=[.4,21],thinning=40)

plot(ch)

If I use RWMH(:N) it fails with:

julia> ch = sample(test2(22),Gibbs(NUTS(1000,.8,:p),RWMH(:N)),1000;init_params=[.4,21],thinning=40)
ERROR: MethodError: no method matching getspace(::MetropolisHastings{RandomWalkProposal{false, Symbol}})
Closest candidates are:
  getspace(::Union{DynamicPPL.SampleFromPrior, DynamicPPL.SampleFromUniform}) at ~/.julia/packages/DynamicPPL/RcfQU/src/sampler.jl:9
  getspace(::HMCDA{<:Any, space}) where space at ~/.julia/packages/Turing/nfMhU/src/inference/Inference.jl:440
  getspace(::MH{space}) where space at ~/.julia/packages/Turing/nfMhU/src/inference/Inference.jl:437
  ...

Oh I see, sorry about that; I must have confused Turing with some other PPL’s thinning.

Was the problem with the initialization, then? It looks like you passed 2 parameters in this case, instead of 3.

This is a different, simpler model, with one continuous and one discrete parameter. I don’t know about that other model proposed above, or why it won’t move; it could be legitimately nearly impossible to sample from, or it could be a software issue.

Just a sanity check:
are we sure that the parameterization from PyMC3 pm.HyperGeometric('y', N=N, k=k, n=n, observed=x) is equivalent to Turing’s Hypergeometric(k, N - k, n)?

It should be.
The PyMC3 documentation says:

The probability of x successes in a sequence of n bernoulli trials taken without replacement from a population of N objects, containing k good (or successful or Type I) objects.

while Julia’s Distribution docs say:

A Hypergeometric distribution describes the number of successes in n draws without replacement from a finite population containing s successes and f failures

In other words, N = s + f. In Julia we pass the successes s and failures f, while in Python we pass the population size N and the successes k (which plays the role of s).
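As a quick sanity check, the two parameterizations can be compared directly from the hypergeometric pmf written out with binomial coefficients (the names hyper_sf and hyper_Nk are just illustrative; big integers and rationals keep the arithmetic exact):

```julia
# pmf in Julia's (s, f, n) parameterization: s successes, f failures, n draws
hyper_sf(s, f, n, x) = binomial(big(s), x) * binomial(big(f), n - x) // binomial(big(s + f), n)

# pmf in PyMC3's (N, k, n) parameterization: population N, k successes, n draws
hyper_Nk(N, k, n, x) = binomial(big(k), x) * binomial(big(N - k), n - x) // binomial(big(N), n)

# with s = k and f = N - k the two agree, e.g. for values like those in this thread
N, k, n, x = 173, 23, 19, 4
hyper_sf(k, N - k, n, x) == hyper_Nk(N, k, n, x)  # true
```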

Ok, in this multinomial model I think I’ve discovered one problem:

@model function model9(k01, k10, k11)
    p0 ~ Beta(1, 1)
    p1 ~ Beta(1, 1)
    num_seen = k01 + k10 + k11
    N ~ DiscreteUniform(num_seen, 350)
    
    q0 = 1-p0
    q1 = 1-p1
    ps = [q0*q1, q0*p1, p0*q1, p0*p1]
    
    y = [N - num_seen, k01, k10, k11]
    y ~ Multinomial(N, ps)
end

by looking at the chain output, I see that y is NOT treated as data; instead y is overwritten with samples from the Multinomial. This is because y doesn’t appear in the arguments to the model function (and it obviously can’t, since it depends on N).

So I modified it as follows:

@model function model9(k01, k10, k11)
    p0 ~ Beta(1, 1)
    p1 ~ Beta(1, 1)
    num_seen = k01 + k10 + k11
    N ~ DiscreteUniform(num_seen, 350)
    
    q0 = 1-p0
    q1 = 1-p1
    ps = [q0*q1, q0*p1, p0*q1, p0*p1]
    
    y = [N - num_seen, k01, k10, k11]
    Turing.@addlogprob!(logpdf(Multinomial(N, ps),y))
end

and now using:

trace9 = sample(model9(k01, k10, k11), Gibbs(MH(:N => RandomWalkProposal(DiscreteUniform(-1,1))),NUTS(1000,.8,:p0,:p1)),1000; init_params=[.09,.04,152],thinning=10)

it samples as follows:

Or with a bit more samples and thinning = 100

Also for the original model, the following seems to work:


k = 23
n = 19
x = 4

@model function model6(y,k,n)
    N ~ DiscreteUniform(50, 500)
    y ~ Hypergeometric(k, N - k, n)
end


trace6 = sample(model6(x,k,n), MH(:N => AdvancedMH.RandomWalkProposal(DiscreteUniform(-1,1))), 1000; thinning=1000);

plot(trace6)

Chains MCMC chain (1000×2×1 Array{Float64, 3}):

Iterations        = 1:1000:999001
Number of chains  = 1
Samples per chain = 1000
Wall duration     = 2.79 seconds
Compute duration  = 2.79 seconds
parameters        = N
internals         = lp

Summary Statistics
  parameters       mean       std   naive_se      mcse       ess      rhat   ess_per_sec 
      Symbol    Float64   Float64    Float64   Float64   Float64   Float64       Float64 

           N   193.0430   94.2292     2.9798   14.4653   24.1587    1.0606        8.6466

Quantiles
  parameters      2.5%      25.0%      50.0%      75.0%      97.5% 
      Symbol   Float64    Float64    Float64    Float64    Float64 

           N   70.9750   125.0000   170.0000   235.2500   426.0500

Or using a wider proposal, for much more effective sampling:


trace6 = sample(model6(x,k,n), MH(:N => AdvancedMH.RandomWalkProposal(DiscreteUniform(-10,10))), 1000; thinning=100);

plot(trace6)
trace6

Chains MCMC chain (1000×2×1 Array{Float64, 3}):

Iterations        = 1:100:99901
Number of chains  = 1
Samples per chain = 1000
Wall duration     = 0.29 seconds
Compute duration  = 0.29 seconds
parameters        = N
internals         = lp

Summary Statistics
  parameters       mean       std   naive_se      mcse        ess      rhat   ess_per_sec 
      Symbol    Float64   Float64    Float64   Float64    Float64   Float64       Float64 

           N   165.4380   80.1469     2.5345    7.3142   114.7284    1.0002      391.5646

Quantiles
  parameters      2.5%      25.0%      50.0%      75.0%      97.5% 
      Symbol   Float64    Float64    Float64    Float64    Float64 

           N   66.9750   109.0000   144.0000   200.2500   386.0500




I’m glad this works! That’s definitely some very weird and unexpected behavior.

@yebai do you know what’s up with this? Looks like a bug with how @model handles ~.

I’m not 100% sure, but I think this is a feature, or an intentional design decision. In general, the left-hand side of ~ is considered a parameter unless it appears in the arguments of the model function.

I’m not clear on what the intended way to handle your example problem is, but @addlogprob! is a general-purpose way to handle stuff like this.

~ should work even if it’s not in the arguments – actually, we’re starting to work on moving data out of model arguments. If y is already defined, it seems very weird to me for y to be overwritten by a sampled variable.

Where would the data go instead?
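For what it’s worth, later versions of Turing/DynamicPPL added a conditioning mechanism for exactly this, so observations no longer have to come in through the model arguments. Roughly (a sketch; the `|` conditioning syntax may not exist in the versions discussed in this thread, and model6b is just an illustrative name):

```julia
using Turing

# same hypergeometric model as above, but without y in the argument list
@model function model6b(k, n)
    N ~ DiscreteUniform(50, 500)
    y ~ Hypergeometric(k, N - k, n)
end

# condition on the observed value instead of passing it as an argument;
# y is then treated as data rather than overwritten by samples
conditioned = model6b(23, 19) | (y = 4,)
trace = sample(conditioned, MH(:N => RandomWalkProposal(DiscreteUniform(-10, 10))), 1000)
```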