Maximum_a_posteriori not honoring abstol and reltol? Turing.jl

So, my student has a problem where he’s just been letting maximum_a_posteriori run for as long as it needs, which is taking 2000+ seconds. I suggested trying abstol or reltol to make sure it wasn’t spending all its time refining the last representable decimal place. Sure enough, using

maxtime=200:
203.660970 seconds (10.85 M allocations: 6.851 GiB, 0.69% gc time)
ModeResult with maximized lp of -22226.09

With maxtime=100 we get:

103.657475 seconds (7.72 M allocations: 5.732 GiB, 0.38% gc time)
ModeResult with maximized lp of -22228.50

With maxtime=60:

63.676032 seconds (6.35 M allocations: 5.243 GiB, 0.57% gc time)
ModeResult with maximized lp of -22237.03

Obviously, in these problems running for more than 100-ish seconds has sharply diminishing returns for the log probability density: going from 100 to 200 seconds only improves the lp from -22228.50 to -22226.09. But there are many of these problems on different data sets, and rather than guessing a sufficient time, it’d be better to say: stop when the lp doesn’t improve by more than, say, 0.5 (i.e. abstol=0.5).

But when I set abstol it ALWAYS runs to the end of maxtime; it never stops early, even with abstol=50 or more. This is using the LBFGS() algorithm. Same with reltol.
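To be concrete about the stopping rule I have in mind, here is a minimal sketch in plain Julia. It is a toy gradient ascent, not what Optim or Turing does internally, and the names `ascend`, `step`, `toy_lp` and `toy_grad` are made up for illustration. The only point is the check "stop once the objective changes by less than abstol between iterations".

```julia
# Toy gradient ascent that stops once the objective improves by less than `abstol`.
# Hypothetical illustration only; this is not the Optim.jl or Turing.jl implementation.
function ascend(f, grad, x0; abstol=0.5, maxiters=1_000, step=0.1)
    x = copy(x0)
    fx = f(x)
    for iter in 1:maxiters
        xnew = x .+ step .* grad(x)   # plain fixed-step ascent
        fnew = f(xnew)
        improvement = fnew - fx
        x, fx = xnew, fnew
        # Absolute tolerance on the change in the objective value.
        if abs(improvement) < abstol
            return (x=x, lp=fx, iters=iter, converged=true)
        end
    end
    return (x=x, lp=fx, iters=maxiters, converged=false)
end

# Concave toy "log density" with its maximum at the origin.
toy_lp(x) = -0.5 * sum(abs2, x)
toy_grad(x) = -x

loose = ascend(toy_lp, toy_grad, fill(1.0, 20); abstol=0.5)
tight = ascend(toy_lp, toy_grad, fill(1.0, 20); abstol=1e-8)

println("loose: iters=", loose.iters, " lp=", loose.lp)
println("tight: iters=", tight.iters, " lp=", tight.lp)
```

With a rule like this, the loose tolerance should stop after far fewer iterations and leave a slightly worse lp, which is the trade-off I’d expect abstol to give us in the real problem.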

Are abstol and reltol not honored? Or maybe just not with this algorithm? Or what?

Thought this might be a known issue so someone would just pipe up, but apparently not?

Anyway, I’ll work on an MWE and post it later today, so we can see whether it is in fact a problem or just something specific to our code.


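using Turing, Distributions, OptimizationOptimJL, PDMats
using LinearAlgebra  # for norm

# Toy model: a 20-dimensional zero-mean Gaussian, so the MAP is at the origin.
@model function mwe(mat)
    a ~ MvNormal(fill(0.0, 20), mat)
end

covmat = PDiagMat(rand(Uniform(0.1, 10), 20))

initval = fill(1.0, 20)

# Same optimization twice: default tolerances, then a loose abstol.
# (Renamed from vec/diff to avoid shadowing Base.vec and Base.diff.)
res = maximum_a_posteriori(mwe(covmat), Optim.LBFGS(); initial_params=initval, maxiters=50)
res2 = maximum_a_posteriori(mwe(covmat), Optim.LBFGS(); initial_params=initval, abstol=12.0, maxiters=50)

# Distance of each estimate from the true mode (the zero vector).
norm(res.values) |> display
norm(res2.values) |> display

# Exact maximum of the log density, attained at the mode.
lpmax = logpdf(MvNormal(fill(0.0, 20), covmat), fill(0.0, 20))

# Shortfall in lp for each run.
lpdiff = lpmax - res.lp
lpdiff2 = lpmax - res2.lp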

OK, so in this MWE the second optimization does stop earlier, with a larger error in the lp value. So evidently abstol DID work here. Now I’ll show this to my student and see if there’s something we can do to make abstol work for us as well.
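In case it helps anyone hitting the same thing, here is a rough sketch of how one might check why a run stopped. I’m assuming the returned ModeResult exposes the underlying Optimization.jl solution as `optim_result`, with a `retcode` and the raw Optim.jl result in `original`. That is how I understand the current layout, but field names may differ between Turing versions, so treat it as an assumption.

```julia
using Turing, Distributions, OptimizationOptimJL, PDMats

@model function mwe(mat)
    a ~ MvNormal(fill(0.0, 20), mat)
end

covmat = PDiagMat(rand(Uniform(0.1, 10), 20))

res = maximum_a_posteriori(mwe(covmat), Optim.LBFGS();
                           initial_params=fill(1.0, 20), abstol=12.0, maxiters=50)

# Assumption: `optim_result` holds the Optimization.jl solution object.
sol = res.optim_result

# Return code reported by Optimization.jl for this solve.
@show sol.retcode

# Assumption: `original` is the raw Optim.jl result, so Optim's accessors apply.
@show Optim.iterations(sol.original)
@show Optim.converged(sol.original)
```

If the iteration count comes back well below maxiters, the tolerance is what ended the run; if it always equals maxiters (or the run always hits maxtime), the tolerance never triggered.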

I’ll come back and report what we find.
