Loglogistic function robustness in Turing.jl

I realized in a Turing.jl model that the log-logistic function was more robust (less rejection) when written as:

x^β/(α^β + x^β)

rather than:

1/(1+(x/α)^-β)

What are the reason? How could I check the robustness of such mathematical functions? Because it may help me to better write some other part of the model.

1 Like

A more detailed answer might require more information, but I suspect that certain reparameterizations might be more efficient.

Thank you. I had a look to the link and made several implementation.

loglogistic(x, α, β) = 1 /(1 +(x/α)^(-β))
loglogistic2(x, α, β) = x^β /(α^β + x^β)
log1p_exp(x) = log1p(exp(x))
loglogistic_lcdf(x, α, β) = -log1p_exp(-β * (log(x) - log(α)))
loglogistic_lcdf2(x, α, β) =  β*log(x) - β*log(α) - log1p_exp(β*log(x)-β*log(α))
loglogistic3(x, α, β) = exp(loglogistic_lpdf(x,α,β))
loglogistic4(x, α, β) = exp(loglogistic_lcdf2(x,α,β))

All return the same (approx at 10^-14)

julia> x = -0:0.1:10
0.0:0.1:10.0
julia> using Test
julia> @test loglogistic.(x,1,2) ≈ loglogistic2.(x,1,2) atol=1e-14
Test Passed
julia> @test loglogistic.(x,1,2) ≈ loglogistic3.(x,1,2) atol=1e-14
Test Passed
julia> @test loglogistic.(x,1,2) ≈ loglogistic4.(x,1,2) atol=1e-14
Test Passed

and speed is better for x^β /(α^β + x^β):

julia> @btime loglogistic.(x,1,2);
  8.500 μs (3 allocations: 1.03 KiB)
julia> @btime loglogistic2.(x,1,2);
  722.908 ns (3 allocations: 1.03 KiB)
julia> @btime loglogistic3.(x,1,2);
  8.766 μs (3 allocations: 1.03 KiB)
julia> @btime loglogistic4.(x,1,2);
  8.733 μs (3 allocations: 1.03 KiB)

I feel like loglogistic2is the best in term of speed and robustness, and exponential of loglogistic_lcdf seems to be the best to use from Bayesian blog posts…

Floating-point arithmetic might be part of the reason, because although both formulas are strictly equivalent from a mathematical viewpoint (with (x,\alpha,\beta)\in\mathbb{R}^3), they don’t behave in the same way when computed with floats.

A tool that I like to quickly investigate such issues is herbie. In this case, it doesn’t seem to be able to propose useful ways to more accurately rewrite your expression, but at least it allows to quickly assess the accuracy of the expressions you input:

loglogistic : error in bits as a function of \beta. Errors are not great in the \beta\in[-1,1] region, but what might be even more worrying is the fact that there are a number of points with very large errors (50 to 60 inaccurate bits in double precision, where 64 would mean that the result is complete garbage) scattered everywhere.

loglogistic2 : this is much cleaner, with large errors only when \beta\approx1 (and probably for very specific values of x and \alpha).

I’m not sure what the algorithms in Turing.jl do exactly, but depending on the (x,\alpha,\beta) values for which the loglogistic function is evaluated, this might account for large differences in the overall stability of the algorithm.

In any case, I’d be very interested if you could provide an example exhibiting such differences in stability.

Fantastic. Thank you François.

Here is a MWE (well not that minimal I tried my best). Second model is almost always rejected.

using Turing
using DifferentialEquations

time=0.0:4.0
exposure=fill(0.0,5)
survival=[20,19,19,19,18]

loglogistic1(x, α, β) = x^β /(α^β + x^β)
loglogistic2(x,α, β) = 1/(1+(x/α)^(-β))

Turing.@model MWE_ll1(time, exposure, survival) = begin
    # 1. SET PRIORS
    kd_log10 ~ Normal(-1.38, 1.11)
    hb_log10 ~ Normal(-1.88,0.86)
    α_log10 ~ Normal(1.23, 0.16)
    β_log10 ~ Uniform(-2.0, 2.0) 
    kd = 10 ^kd_log10
    hb = 10 ^hb_log10
    α = 10 ^α_log10
    β = 10 ^β_log10
    # 2. MECHANISTIC MODELS
    prob = ODEProblem(odeTK, eltype(kd).([0.0]),(minimum(time), maximum(time)),[time, exposure, kd])
    _saveat = time === nothing ? Float64[] : time
    sol_tmp = solve(prob; saveat = _saveat)
    Dmax_tmp = accumulate(max, sol_tmp.u)
    N = length(sol_tmp)
    pSurv = Vector{Any}(undef, N)
    pSurv[1] = 1.0
    for i in 2:N
        pSurv[i] = exp(-hb*time[i])*(1-loglogistic1(Dmax_tmp[i][1], α,β))
    end
    # 3. INFERENCE
    ratioSurv = Vector{Real}(undef, N)
    ratioSurv[1] = 1.0
    for i in 2:N
        ratioSurv[i] = pSurv[i] / pSurv[i-1]
        survival[i] ~ Binomial(survival[i-1], ratioSurv[i])
    end
end

model_ll1 = MWE_ll1(time, exposure, survival) 
chn_ll1 = sample(model_ll1, HMC(0.01, 10), 10)

Turing.@model MWE_ll2(time, exposure, survival) = begin
    # 1. SET PRIORS
    kd_log10 ~ Normal(-1.38, 1.11)
    hb_log10 ~ Normal(-1.88,0.86)
    α_log10 ~ Normal(1.23, 0.16)
    β_log10 ~ Uniform(-2.0, 2.0) 
    kd = 10 ^kd_log10
    hb = 10 ^hb_log10
    α = 10 ^α_log10
    β = 10 ^β_log10
    # 2. MECHANISTIC MODELS
    prob = ODEProblem(odeTK, eltype(kd).([0.0]),(minimum(time), maximum(time)),[time, exposure, kd])
    _saveat = time === nothing ? Float64[] : time
    sol_tmp = solve(prob; saveat = _saveat)
    Dmax_tmp = accumulate(max, sol_tmp.u)
    N = length(sol_tmp)
    pSurv = Vector{Any}(undef, N)
    pSurv[1] = 1.0
    for i in 2:N
        pSurv[i] = exp(-hb*time[i])*(1-loglogistic2(Dmax_tmp[i][1], α,β))
    end
    # 3. INFERENCE
    ratioSurv = Vector{Real}(undef, N)
    ratioSurv[1] = 1.0
    for i in 2:N
        ratioSurv[i] = pSurv[i] / pSurv[i-1]
        survival[i] ~ Binomial(survival[i-1], ratioSurv[i])
    end
end

model_ll2 = MWE_ll2(time, exposure, survival) 
chn_ll2 = sample(model_ll2, HMC(0.01, 10), 10)