I am a very frequent user of the Nelder-Mead optimisation routine (of the excellent Optim.jl package) which I find very effective for problems with a handful of free parameters to tune. I have very little knowledge of how the algorithm works, but it seems to also do well also with problems that may be discontinuous or slightly noisy too.
In some cases, I have noticed that the algorithm gets “stuck” and I have to wait for the maximum number of iterations to be reached before I get my solution (the solution can be good or bad independently of Nelder-Mead getting stuck). Now, by “stuck” I mean that the function value does not seem to be improving (at least judging by the output trace in the terminal) and the value for √(Σ(yᵢ-ȳ)²)/n also seems to be stuck at a certain value.
In such a case, when Nelder-Mead gets “stuck”, it would be nice if it would stop the optimisation and return the result. I have tried playing around with x_tol and f_tol but Nelder-Mead still displays the same behaviour of getting “stuck”.
In order to be absolutely clear, here is an example of the output trace when getting “stuck”:
Iter Function value √(Σ(yᵢ-ȳ)²)/n
------ -------------- --------------
...
4200 -5.535884e+01 2.449928e-02
4300 -5.535884e+01 2.449928e-02
4400 -5.535884e+01 2.449928e-02
...
Does somebody have some advice perhaps on how I could tell Nelder-Mead to promptly stop the optimisation when it enters this behaviour?
Thanks very much for taking the time to read this.
It sounds like you’re up against a very similar problem to the one I had, which led me to writing my own Nelder Mead package. In my case it was down to a floating point edgecase causing the simplex to flip back and forth in an endless loop. I added a catch for this endless loop and moved on. Give it a go and let me know if it helps?
Thanks for your reply and sharing your code. Floating point edge case might make sense, I often deal with numerically sensitive objectives. I will try it out and get back, cheers.
I am aware of your paper (Iterative Method for Optimization), and it had occured to me before to use it as the basis of a new implementation, but thanks for bringing it back to my memory. After trying out @jcook’s implementation, which seems to work better on the one problem I have been recently dealing with, I am not that motivated to write something from scratch. ON the other hand, I would like to carry on using Optim (because I am used to it and ingrained in many of codes), so currently I am looking to see how I can abort the optimisation when no (or insufficient) progress has been made in the last X iterations.
Have you tried other derivative-free algorithms? I wish there was a full comparison between different optimization algos. Anyways, here is a link where the author recommends the Subplex method over Nelder Mead (@stevengj is the author of NLopt, and a Julia user who is often present around here). For looking at the internals with some Julia code you may find useful section 7.5 of Algorithms for Optimization, available here.
And btw, there is a package that aims to provide a common interface to all Julia optimizers: GalacticOptim.jl.
Thanks for your input. I did try the SPBLX algorithm in NLOpt.jl (btw amazing package, I use it all the time on other things) and this also seems to get stuck. Problem is NLOpt does not produce any output in the terminal when one of its optimisation algorithms is running, which makes it difficult for me to understand what is going on.
Thanks for pointing out GalacticOptim.jl, I was not aware of this.
My wish is to continue using Optim.jl (force of habit) though @jcook code detects the problem. After lots of experimenting, my current approach is to use a callback function in Optim.jl that keeps track of the number of iterations during which insufficient progress has been made (in terms of objective value) and abort when a certain number of iterations is reached. I will post this once it properly works in case anyone else is interested.
I might be wrong, but think the problem is that NLOpt.jl calls the routines in NLOpt which themselves do not provide any information on their internal optimisation state to the external caller.