Connection between BFGS and ROOT.Minuit. Stopping criteria

Dear experts on optimization problems,

I am trying to establish a connection between the common tools in High Energy Physics (HEP) and Julia ecosystem.

In HEP, the main tool for optimization is Minuit library. Nowadays, the original Fortran code is literally converted to C++ and used entirely for all problems just because it is so reliable. (here are Yggdrasil binaries thanks to @jstrube, @giordano)

On the non-HEP side, one of the most reliable and widely used optimizers seems to be BFGS, particularly in Julia in NLopt or Optim.

Actually, it might turn that the two libraries are implementing the same algorithm with different stopping conditions.

Here is a quote from [the original Minuit paper] on the implemented algorithm (https://www.sciencedirect.com/science/article/pii/0010465575900399).
image

and the stopping criteria

So,

  • Is Minuit doing BFGS or its ancestor?
  • From my experience, EDM is a good indicator of convergence. Is there anything similar in the current implementation of optimizers?
  • Is there a way to compute EDM with Julia, e.g. in Optim?

Thanks

(pin @pkofod, @andreasnoack, @anriseth from Optim/bfgs.jl)

1 Like

It’s doing something very similar to BFGS, look up variable metric or quasi-Newton methods. The stopping criterion is a test of a “hypothesis” that the gradient is zero that only makes sense for statistical problems. No this is not implemented in Optim or NLopt but you could do it with a callback I think. Many of these methods (DFP, BFGS, …) come from the econometrics/statistics literature so you will sometimes find such stopping criterions in the original papers, but they’re rarely used in general purpose software.

2 Likes

Many thanks for the reply.

What should I look for? Could you expand, please?

Yes, I would like to try something in this spirit. What do you have in mind for a callback?
EDM requires hessian that is very expensive unless it is a by-product of minimization.

the objective stopping condition looks like a great thing to have. With a numerical threshold g_tol I have the impression that sometimes it takes ages to reach 1e-8. However, it is not clear how to adjust it, i.e. if it is safe to do