Numerical errors in logit normal model using Turing.jl

Alternatively you can do:

@model normal_logit(n, x) = begin

    n_obs = length(n)

    ## hyperparameters for varying effects
    ā ~ Normal(0, 0.5)
    σ_a ~ Truncated(Exponential(3), 0, Inf)

    # varying intercepts for each observation
    a ~ MvNormal(fill(ā, n_obs), σ_a)

    x ~ VecBinomialLogit(n, a)
end

which will likely be faster. You can get some more performance by using FillArrays.jl.

using FillArrays
@model normal_logit(n, x) = begin

    n_obs = length(n)

    ## hyperparameters for varying effects
    ā ~ Normal(0, 0.5)
    σ_a ~ Truncated(Exponential(3), 0, Inf)

    # varying intercepts for each observation
    a ~ MvNormal(Fill(ā, n_obs), σ_a)

    x ~ VecBinomialLogit(n, a)
end

which will be more memory efficient due to fewer allocations.
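The reason is that `Fill` from FillArrays is a lazy array that stores just the value and the length, whereas `fill` allocates a dense vector. A minimal illustration:

using FillArrays

v_lazy  = Fill(0.5, 1000)   # lazy: stores one value plus a length, O(1) memory
v_dense = fill(0.5, 1000)   # eager: allocates a 1000-element Vector{Float64}

# Both behave like read-only vectors with the same contents:
v_lazy[17] == v_dense[17]   # true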

It’s a bit inconvenient that we call the multivariate version of BinomialLogit a VecBinomialLogit. Besides, the docstring is pretty bad. I’ll open a PR fixing those issues.

It does! So the problem was actually with trying to make the model type stable in the first place? Are there any guidelines for when that is a good idea? That is something I only learned about here, not in the docs.

The model is not very exotic; it is sometimes called a “logit normal” model, where the parameter of the Binomial distribution is modelled as a constant average plus some normally distributed error on the logit scale:

x_i ~ Binomial(n_i, logistic(a_i)),  a_i ~ Normal(ā, σ_a)

Indeed, sampling works fine (without errors) for me, but seems to take two hours. In Stan I was able to draw three times as many samples in less than 5 minutes.

Try the last example I posted. It takes around 10 seconds on my 10-year-old MacBook. :smiley:

Thank you very much! I admit I tried hard to use VecBinomialLogit and did not succeed.
Here I am getting an error very similar to my previous one:

MethodError: no method matching VecBinomialLogit(::Array{Int64,1}, ::TrackedArray{…,SubArray{Float64,1,Array{Float64,1},Tuple{UnitRange{Int64}},true}})

The output of `st` is:

    Status `~/Documents/projects/MREs/Turing_binomial/Project.toml`
  [31c24e10]   Distributions v0.21.6
  [1a297f60] + FillArrays v0.8.0
    Status `~/Documents/projects/MREs/Turing_binomial/Manifest.toml`
  [1a297f60] + FillArrays v0.8.0

This, and that you called the sampling with the `xs` and `ps` instead of the `ns` and `xs`.
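In other words, the model expects the trial counts first and the observed counts second, so the corrected call looks like this (the sampler settings here are just placeholders):

normal_logit_samples = sample(normal_logit(ns, xs), NUTS(200, 0.65), 1000)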

Type stability always helps make execution of a Julia program faster; this is not PPL-specific. If the model is not type-stable, all the type inference during sampling makes the computation expensive.
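A minimal illustration outside of any PPL context (toy functions, nothing Turing-specific):

# Type-unstable: the return type depends on a runtime value,
# so the compiler infers Union{Int64, Float64}.
unstable(flag) = flag ? 1 : 1.0

# Type-stable: always returns a Float64.
stable(flag) = flag ? 1.0 : 2.0

# @code_warntype unstable(true) highlights the Union in red;
# @code_warntype stable(true) infers a concrete Float64.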

A performance tips page is still on our to-do list.
@mohamed82008, are you still working on it? Do you need some help?

Some general tips:

- Try to use multivariate distributions where possible, as this reduces allocation and computation costs. FillArrays often helps in situations like your model.
- If you cannot express it using a multivariate distribution, use the typed model syntax I proposed in one of the first posts (see the sketch after this list).
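Roughly, that typed syntax looks like the sketch below (a reconstruction, not the exact earlier post): passing a `::Type{TV}` argument lets the model allocate a concretely typed container for the intercepts.

@model normal_logit(n, x, ::Type{TV}=Vector{Float64}) where {TV} = begin
    n_obs = length(n)

    ā ~ Normal(0, 0.5)
    σ_a ~ Truncated(Exponential(3), 0, Inf)

    # a concretely typed container keeps the model type-stable
    a = TV(undef, n_obs)
    for i in 1:n_obs
        a[i] ~ Normal(ā, σ_a)
    end

    for i in 1:n_obs
        x[i] ~ BinomialLogit(n[i], a[i])
    end
end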

Please do the following for now. I’ll include this in the PR. Sorry for the inconvenience!

using FillArrays

"""
    MvBinomialLogit(n::Vector{<:Int}, logitp::Vector{<:Real})

A multivariate binomial logit distribution with `n` trials and logit values of `logitp`.
"""
struct MvBinomialLogit{T<:Real, I<:Int} <: DiscreteMultivariateDistribution
    n::Vector{I}
    logitp::Vector{T}
end

Distributions.length(d::MvBinomialLogit) = length(d.n)

function Distributions.logpdf(d::MvBinomialLogit{<:Real}, ks::Vector{<:Integer})
    return sum(Turing.logpdf_binomial_logit.(d.n, d.logitp, ks))
end

@model normal_logit(n, x) = begin

    n_obs = length(n)

    ## hyperparameters for varying effects
    ā ~ Normal(0, 0.5)
    σ_a ~ Truncated(Exponential(3), 0, Inf)

    # varying intercepts for each observation
    a ~ MvNormal(Fill(ā, n_obs), σ_a)

    x ~ MvBinomialLogit(n, a)
end
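As a quick sanity check that the custom distribution wires into the Distributions interface, you can evaluate its log density by hand (the numbers are made up):

d = MvBinomialLogit([10, 10], [0.0, 1.0])
logpdf(d, [5, 7])   # sum of the two independent binomial-logit terms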

@trappmartin thank you so much for your help so far. However, I can’t replicate your success! When I try with default settings, I get a warning, then a long pause, followed by a progress bar that predicts 24 hours.

This is the warning:

`lbeta(x::Real, w::Real)` is deprecated, use `(logabsbeta(x, w))[1]` instead.
│   caller = logpdf_binomial_logit(::Int64, [etc]

If I set Turing.setadbackend(:reverse_diff), I get

Not implemented: convert tracked Tracker.TrackedReal{Float64} to tracked Float64
error(::String) at error.jl:33
convert(::Type{Tracker.TrackedReal{Float64}}, ::Tracker.TrackedReal{Tracker.TrackedReal{Float64}}) at real.jl:39

Could it be the versions of the packages involved?

Yes, you should use reverse mode AD. Forward mode is slow if you have many parameters to infer.
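The intuition: reverse mode yields the full gradient of the scalar log density in one backward pass, while forward mode needs on the order of one pass per parameter. Switching backends is one line (`:forward_diff` shown for contrast):

# One backward pass gives the whole gradient, regardless of dimension:
Turing.setadbackend(:reverse_diff)

# Forward mode differentiates parameter-by-parameter (in chunks), which
# hurts with ~300 varying intercepts:
# Turing.setadbackend(:forward_diff)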

I’ll check later. I’m not on my computer atm.

Yes! Please open a PR.


I have also experienced poor performance with reverse diff. Would you mind sharing a full working version of your model once the issues are resolved?

@Christopher_Fisher Certainly! In fact, the various incarnations of this model are all here in the same GitHub repo. Here is the latest one using FillArrays.


Hi, here is an implementation that works fine on #master and should work on the latest release too. I had to remove the FillArrays bit again; I don’t know why it made trouble now but worked in the first place.

@mohamed82008 I think this should work with FillArrays too? Not sure why this is problematic.

using Turing
using StatsFuns

ns = fill(400, 300)

# variation in probabilities
ps = logistic.(rand(Normal(-2,0.8), 300))

xs = rand.(Binomial.(ns, ps))

struct MvBinomialLogit{T<:AbstractVector{<:Real}, I<:AbstractVector{<:Int}} <: DiscreteMultivariateDistribution
    n::I
    logitp::T
end

Distributions.length(d::MvBinomialLogit) = length(d.n)

function Distributions.logpdf(d::MvBinomialLogit, ks::AbstractVector{<:Int})
    return sum(Turing.logpdf_binomial_logit.(d.n, d.logitp, ks))
end

@model normal_logit(n, x) = begin

    n_obs = length(n)

    ## hyperparameters for varying effects
    b ~ Normal(0, 0.5)
    σ_a ~ Truncated(Exponential(3), 0, Inf)

    # varying intercepts for each observation
    a ~ MvNormal(fill(b, n_obs), σ_a)

    x ~ MvBinomialLogit(n, a)
end

Turing.setadbackend(:reverse_diff)
normal_logit_samples = sample(normal_logit(ns, xs), NUTS(1000, 0.8), 2000);
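To get a quick summary of the posterior after sampling, something like this should work (MCMCChains is pulled in by Turing):

describe(normal_logit_samples)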

Note that I had to increase the number of adaptation iterations to prevent NUTS from diverging after the adaptation. I guess 1000 is unnecessary and fewer would work too; I didn’t play around with it much.

Please let me know if this works for you.

How long did it take to run on your system? After running about 2%, the ETA is around 11.5 hours.

It took a few minutes. Hours sounds strange.

Could you please post your setup? Are you on master?

I am using Turing v0.7.1 because Zygote is still causing problems for me.


(v1.2) pkg> st Turing
    Status `~/.julia/environments/v1.2/Project.toml`
  [0bf59076] AdvancedHMC v0.2.7
  [31c24e10] Distributions v0.21.6
  [c7f686f2] MCMCChains v0.3.14
  [276daf66] SpecialFunctions v0.8.0
  [4c63d2b9] StatsFuns v0.9.0
  [fce5fe82] Turing v0.7.1
  [e88e6eb3] Zygote v0.4.1

I was able to switch to master without encountering a problem with Zygote, and the model ran in about one minute. I’m guessing that this is still much slower than Stan, but much better than 11.5 hours. What would cause a nearly 700-fold improvement?

I’m not 100% sure, but it sounds like the gradient computation fell back to computing each gradient independently instead of vectorising the computation. But this should actually not happen anymore.