I saw there is a previous thread discussing this problem, but I don’t quite get the solution so I’m starting a new one.

“Reached maximum number of iterations while bisecting interval for ϵ.”

I’ve seen this error message before and for that problem it turned out that my gradients were wrong. However, for this new problem, I’ve triple-checked my log density and gradients and don’t find any problem with them. Moreover, the parameter values from these iterations are quite close to the truth, which suggests to me that it kinda works, but I still get this error message. I also noticed that two things may happen to the parameter values, the log density, and the gradients: they either do not change after the first few iterations, or they oscillate between two sets of values. Then after many iterations this error message pops up.

I wonder if you have suggestions on what the issue might be?

function LogDensityProblems.logdensity_and_gradient(problem::MyProblem{T}, pars) where T <: Real
@unpack ..., ∇ = problem
...
logl = ...
copyto!(∇, ...)
logl, ∇
end

But this won’t work. Instead, I need to use

function LogDensityProblems.logdensity_and_gradient(problem::MyProblem{T}, pars) where T <: Real
...
logl = ...
∇ = fill(T(0), p)
copyto!(∇, ...)
logl, ∇
end

Although it runs now, I wonder if there is a way to avoid repeatedly allocating the gradient vector?

Not really, the API expects a fresh one. This keeps the code of DynamicHMC really clean in a functional style, at a small cost (for small dimensions, you can use SVector, for large ones, the cost of AD should dominate except for trivial problems).

That said, this was not explicitly documented, so I did

To see how much speed-up I could get by avoiding the repeated allocation (problem dimension p=500,000), I tried defining the gradient vector as a global variable (global \nabla = zeros(p)) and updating it in the logdensity_and_gradient function. However, the gradient doesn’t seem to be updated correctly.

Possibly, but hard to say anything concrete without an MWE.

That said, with a dimension that high GC may be the least of your worries; a dimension that high may run into issues with NUTS (except for some special cases, independence or near-normality).