DynamicHMC: Reached maximum number of iterations while bisecting interval for epsilon

@Tamas_Papp

I saw there is a previous thread discussing this problem, but I don’t quite get the solution so I’m starting a new one.

“Reached maximum number of iterations while bisecting interval for ϵ.”

I’ve seen this error message before, and in that case it turned out that my gradients were wrong. For this new problem, however, I’ve triple-checked my log density and gradients and can’t find anything wrong with them. Moreover, the parameter values from these iterations are quite close to the truth, which suggests the sampler mostly works, yet I still get this error message. I also noticed that one of two things happens to the parameter values, the log density, and the gradients: either they stop changing after the first few iterations, or they oscillate between two sets of values. After many iterations, the error message above pops up.
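(For reference, the kind of check I did looks roughly like this, comparing the analytic gradient against ForwardDiff at random points; logdens and analytic_grad are just illustrative stand-ins, not my actual code.)

using ForwardDiff

# illustrative stand-ins for the actual log density and its analytic gradient
logdens(x) = -0.5 * sum(abs2, x)
analytic_grad(x) = -x

# compare the analytic gradient to the AD gradient at random points
for _ in 1:10
    x = randn(5)
    g_ad = ForwardDiff.gradient(logdens, x)
    @assert isapprox(analytic_grad(x), g_ad; rtol = 1e-8)
end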

I wonder if you have suggestions on what the issue might be?

It would be much easier to investigate what happens with a self-contained example I can run.

Unfortunately much of the code is unpublished work, but I’ll try to come up with a small example.

You can send it privately and I will treat it as confidential.

Thank you for offering to help; I really appreciate it.

I found the problem: I need to create a fresh gradient vector every time logdensity_and_gradient is called.

Previously I defined the problem as

struct MyProblem{T <: Real}
    ...
    ∇ :: Vector{T} # gradient
end

and calculated the gradient using

function LogDensityProblems.logdensity_and_gradient(problem::MyProblem{T}, pars) where T <: Real
    @unpack ..., ∇ = problem
    ...
    logl = ...
    copyto!(∇, ...) # writes into the buffer stored in the problem, shared across calls
    logl, ∇
end

But this won’t work, presumably because DynamicHMC holds on to the returned gradient vector, so overwriting the shared buffer on the next call corrupts values it has already stored. Instead, I need to use

function LogDensityProblems.logdensity_and_gradient(problem::MyProblem{T}, pars) where T <: Real
    ...
    logl = ...
    ∇ = fill(T(0), p) # allocate a fresh gradient vector on every call
    copyto!(∇, ...)
    logl, ∇
end
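For completeness, here is a minimal self-contained sketch of the same pattern on a toy standard-normal problem (ToyProblem and the dimensions are illustrative, not my actual model; mcmc_with_warmup is the usual DynamicHMC entry point):

using LogDensityProblems, DynamicHMC, Random

# toy problem: p independent standard normals
struct ToyProblem
    p::Int
end

LogDensityProblems.dimension(problem::ToyProblem) = problem.p
LogDensityProblems.capabilities(::Type{ToyProblem}) = LogDensityProblems.LogDensityOrder{1}()
LogDensityProblems.logdensity(problem::ToyProblem, x) = -0.5 * sum(abs2, x)

function LogDensityProblems.logdensity_and_gradient(problem::ToyProblem, x)
    logl = -0.5 * sum(abs2, x)
    ∇ = -x   # fresh vector on every call (unary minus on an array allocates a new array)
    logl, ∇
end

results = mcmc_with_warmup(Random.default_rng(), ToyProblem(10), 1000)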

Although it runs now, I wonder if there is a way to avoid repeatedly allocating the gradient vector?

Not really, the API expects a fresh one. This keeps the DynamicHMC code really clean in a functional style, at a small cost (for small dimensions you can use an SVector, as in the sketch below; for large ones, the cost of AD should dominate except for trivial problems).

That said, this was not explicitly documented, so I did.
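For the small-dimension case mentioned above, a sketch of what returning an SVector could look like (TinyProblem is a hypothetical 2-parameter example, not from this thread):

using LogDensityProblems, StaticArrays

# hypothetical problem with two independent standard normal parameters
struct TinyProblem end

LogDensityProblems.dimension(::TinyProblem) = 2
LogDensityProblems.capabilities(::Type{TinyProblem}) = LogDensityProblems.LogDensityOrder{1}()

function LogDensityProblems.logdensity_and_gradient(::TinyProblem, x)
    logl = -0.5 * (x[1]^2 + x[2]^2)
    ∇ = SVector(-x[1], -x[2])   # stack-allocated; returning a "fresh" gradient costs nothing here
    logl, ∇
end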

Got it. Thank you very much for the explanation.


To see how much speed-up I could get by avoiding the repeated allocation (problem dimension p = 500,000), I tried defining the gradient vector as a global variable (global ∇ = zeros(p)) and updating it in the logdensity_and_gradient function. However, the gradient doesn’t seem to be updated correctly.

Is this expected?

Possibly, but hard to say anything concrete without an MWE.

That said, with a dimension that high, GC may be the least of your worries; NUTS itself may run into issues at that dimension (except in some special cases, such as independence or near-normality).
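As a rough way to check whether the allocation actually matters at p = 500,000, one can time the allocation of the gradient vector against a gradient-like pass over the data (a sketch with BenchmarkTools; the copyto! is only a stand-in for a very cheap analytic gradient):

using BenchmarkTools

p = 500_000
x = randn(p)
buf = similar(x)

# cost of allocating a fresh gradient vector of this size
@btime zeros($p);

# stand-in for a very cheap gradient written into a preallocated buffer;
# a realistic model's gradient will usually dominate the allocation above
@btime copyto!($buf, $x);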