BFGS very slow compared to BlackBoxOptim - How to improve performance

A few suggestions:

  1. Make sure the gradient is correct. In many of the cases where I have struggled with gradient-based optimisation algorithms, the gradient turned out to be wrong. So define the cost function and check that its gradient is correct by comparing an AD package against finite differences. There might even be an AD bug; unlikely, but not impossible.
  2. Try algorithms other than BFGS. If your cost function’s curvature changes often, BFGS is likely a bad choice here because it tries to capture “global curvature information” in the approximate inverse Hessian, which can be complete gibberish if the curvature changes too quickly. GradientDescent and ConjugateGradient are two alternatives I would try.
  3. Benchmark your function and its gradient, and check for type instabilities with both Float64 inputs and ForwardDiff.Dual inputs. It’s possible that your function is type-stable when run with one input type but not when run with another.
  4. Consider using reverse-mode AD to define the gradient if computing it is too slow. You can pass the gradient function explicitly to Optim.
  5. Loosen the tolerance, as Chris suggests above, and see how loose it can be while still converging to a reasonable solution.
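For point 1, here is a minimal sketch of a gradient check using central finite differences. The cost function (a 2-D Rosenbrock) and its hand-written gradient are illustrative stand-ins for your own; in practice you would also compare against `ForwardDiff.gradient` or another AD package:

```julia
# Illustrative cost function (2-D Rosenbrock) and a hand-written gradient.
f(x) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
function g!(G, x)
    G[1] = -2 * (1 - x[1]) - 400 * x[1] * (x[2] - x[1]^2)
    G[2] = 200 * (x[2] - x[1]^2)
    return G
end

# Central finite-difference gradient, one coordinate at a time.
function fd_gradient(f, x; h = 1e-6)
    G = similar(x)
    for i in eachindex(x)
        e = zeros(length(x)); e[i] = h
        G[i] = (f(x + e) - f(x - e)) / (2h)
    end
    return G
end

x = [0.3, -1.2]
G = similar(x)
g!(G, x)
maximum(abs.(G .- fd_gradient(f, x)))  # should be tiny, ~1e-6 or smaller
```

If the maximum discrepancy is large, the analytic (or AD) gradient is almost certainly wrong, and no choice of optimiser will fix that.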
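For point 3, a hypothetical example of the kind of instability to look for: accumulating into a `Float64` literal makes a function type-stable for `Float64` input but unstable for other element types (such as `ForwardDiff.Dual`); initialising with `zero(eltype(x))` fixes it. `BigFloat` stands in below for a Dual type so the sketch needs no packages:

```julia
using Test  # for @inferred

# Unstable: `s` starts as Float64; for non-Float64 inputs its type
# changes inside the loop, so the return type is a Union.
function cost_unstable(x)
    s = 0.0
    for xi in x
        s += xi^2
    end
    return s
end

# Stable: the accumulator has the element type of `x` from the start.
function cost_stable(x)
    s = zero(eltype(x))
    for xi in x
        s += xi^2
    end
    return s
end

# With Float64 both infer fine; with BigFloat (standing in for
# ForwardDiff.Dual) only the stable version infers a concrete type.
@inferred cost_stable(BigFloat.([1, 2, 3]))
# @inferred cost_unstable(BigFloat.([1, 2, 3]))  # fails inference
```

`@code_warntype` on your actual cost function, called once with a `Float64` vector and once with a Dual vector, will surface the same issue.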
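For points 4 and 5, passing an explicit in-place gradient to Optim (instead of letting it fall back to finite differences) looks roughly like this. The `f` and `g!` here are placeholders for your cost function and its (e.g. reverse-mode AD) gradient, and `g_tol` is the tolerance you would experiment with loosening:

```julia
using Optim

# Placeholder cost function and in-place gradient (g! writes into G);
# substitute your own f and an AD-generated gradient here.
f(x) = sum(abs2, x .- 1)
g!(G, x) = (G .= 2 .* (x .- 1))

x0 = zeros(3)
# Passing g! explicitly means Optim skips its default finite-difference
# gradient; Optim.Options lets you loosen g_tol as in point 5.
res = optimize(f, g!, x0, BFGS(), Optim.Options(g_tol = 1e-6))
Optim.minimizer(res)
```

For this toy quadratic the minimiser is at `[1, 1, 1]`; with your real problem, compare wall time and the converged objective as you relax `g_tol`.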