For a biology model I had Fortran 90 code that worked fine, but to make it more readable for future readers of the corresponding paper, I translated it into Julia. In the Fortran code, I determined the model parameters with Nelder-Mead and Powell minimization routines from the 1960s and 70s (Powell was actually the better of the two).

But the corresponding routines from your package Optim.jl disappointed me:

the Nelder-Mead routine is fast enough, but it does not check whether the output is a true minimum, so it sometimes gave incorrect results;

the Powell-like (gradient-free) minimization routines in Julia are at least 100 times slower than the Fortran 90 version and were too slow to be useful for me.

Here are my questions:

Why did you not include a check of the minimum property, as in early Fortran Nelder-Mead codes? The Nelder-Mead code was quite popular in the past, most likely because it keeps searching, within prescribed limits, until a true minimum is found.

Why are your Powell-like, gradient-free algorithms so slow?

Searching for model parameters by minimizing the norm of the discrepancy between theory and experiment is a standard procedure, and I suspect that other people who have tried Julia have been put off by the difficulties I encountered.

The real question is, why include Nelder–Mead? Why use it? It's an obsolete algorithm that is not guaranteed to converge, and there are plenty of other derivative-free local-optimization algorithms these days. There isn't an easy check that ensures that it is globally convergent (i.e. always converges to some local minimum, not necessarily to a global minimum). There are globally convergent Nelder–Mead variants, but they involve much more substantial changes IIRC.

I included it in NLopt, but really only for comparison purposes, and I recommend that people only use it for this purpose. I don't know why Optim.jl uses it by default for gradient-free problems.

What "Powell-like" routine are you referring to? I don't see one in Optim.jl. (When you say "100 times slower", do you mean that it converges in 100x more iterations, or that each iteration is 100x slower?)
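One way to separate the two cases is to count objective evaluations and divide the wall-clock time by that count. The following is only a sketch: `CountedObjective` is a hypothetical helper (not part of Optim.jl), `rosenbrock` is a stand-in objective, and the loop stands in for whatever solver is being timed.

```julia
# Callable wrapper that counts how often the wrapped objective is evaluated.
# `CountedObjective` is a hypothetical helper, not an Optim.jl type.
mutable struct CountedObjective{F}
    f::F
    calls::Int
end
CountedObjective(f) = CountedObjective(f, 0)
(c::CountedObjective)(x) = (c.calls += 1; c.f(x))

# Stand-in objective; replace with the real model misfit.
rosenbrock(x) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2

counted = CountedObjective(rosenbrock)

# Stand-in for a solver run: a real test would pass `counted` to the solver.
elapsed = @elapsed for _ in 1:1_000
    counted([0.5, 0.5])
end

println("evaluations: ", counted.calls)
println("seconds per evaluation: ", elapsed / counted.calls)
```

If the evaluation counts of the Fortran and Julia runs are similar but the time per evaluation differs by 100x, the objective function is the likely culprit rather than the algorithm.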

I'm curious what the Nelder-Mead Fortran library you refer to is actually doing when it "check[s] whether the output is a true minimum." For nonconvex optimization, finding (and verifying) global optimality is a difficult problem. I suspect that what the code is actually doing is checking whether it is at a local minimum.

Even that is not so easy in a derivative-free algorithm. You could use finite differences, but that involves 2n more function evaluations in n dimensions, and its reliability is limited by a choice of step size.
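A minimal sketch of such a check, assuming it simply probes the objective at a step `h` in each direction along every coordinate axis (2n extra evaluations). The function name `looks_like_local_min` and the default `h` are illustrative choices, not taken from any particular Fortran code or from Optim.jl.

```julia
# Probe f at x ± h along each coordinate axis (2n extra evaluations).
# Returns true when no probe is lower than f(x) by more than `tol`.
# This is a necessary test only: it can miss descent directions that are
# not axis-aligned, and the verdict depends on the step size h.
function looks_like_local_min(f, x::AbstractVector{<:Real}; h::Real = 1e-4, tol::Real = 0.0)
    fx = f(x)
    for i in eachindex(x)
        for s in (-1, 1)
            xp = float.(x)        # fresh floating-point copy of x
            xp[i] += s * h
            f(xp) < fx - tol && return false
        end
    end
    return true
end

bowl(x) = sum(abs2, x)            # true minimum at the origin
saddle(x) = x[1]^2 - x[2]^2       # stationary point at the origin, not a minimum

looks_like_local_min(bowl, [0.0, 0.0])     # true
looks_like_local_min(saddle, [0.0, 0.0])   # false
looks_like_local_min(bowl, [0.1, 0.0])     # false
```

A solver wrapper could restart from the returned point whenever this test fails, which is presumably what "keeps searching" refers to, but that is a guess about the Fortran code in question.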

The reason why you might want such a check in Nelder–Mead is that the Nelder–Mead algorithm is known to be flawed: it might not converge to a local minimum. That's why I say it is obsolete and should typically be avoided.

Yes, sure, but the Nelder Mead result is also restricted to a standard implementation, as you mentioned above (I don't know what exactly is implemented in Optim.jl). I was just pointing out that such theoretical results may not matter much in practice, either because the failure cases are exceptional or because actual implementations are already variants that deal with such issues.

(And by that I don't mean that the OP should be using it instead of a better method)

Yes, of course. But Nelder Mead variants with convergence guarantees also exist.

(The Nelder Mead implemented in Optim.jl is indeed a more modern improvement. It doesn't seem to come with convergence results, but it behaves better than the classical method.)

I think we should go back to this question, because it seems very likely that someone translating code from Fortran has carried over some Fortran-style habits that would absolutely hammer Julia performance, for example using global variables.
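To illustrate the kind of pattern meant here (the original code has not been posted, so everything below is hypothetical), compare an objective that reads its data from untyped globals with one that receives the data as arguments and is wrapped in a closure for the solver. The names `t_obs`, `y_obs`, `misfit_global`, `misfit` and `make_objective`, and the exponential model, are made up for this sketch.

```julia
# Fortran-style translation: the data live in untyped (non-const) globals.
t_obs = collect(0.0:0.1:1.0)
y_obs = 2.0 .* exp.(-1.5 .* t_obs)

function misfit_global(p)
    s = 0.0
    for i in eachindex(t_obs)     # t_obs and y_obs are non-const globals here,
        s += (p[1] * exp(-p[2] * t_obs[i]) - y_obs[i])^2   # so their types are unknown to the compiler
    end
    return s
end

# Same objective with the data passed in as arguments (types are known).
function misfit(p, t, y)
    s = 0.0
    for i in eachindex(t, y)
        s += (p[1] * exp(-p[2] * t[i]) - y[i])^2
    end
    return s
end

# Closure capturing the data; this is what would be handed to a solver.
make_objective(t, y) = p -> misfit(p, t, y)
objective = make_objective(t_obs, y_obs)

println(misfit_global([1.0, 1.0]) ≈ objective([1.0, 1.0]))
```

Both versions compute the same value; the difference is only in how fast each call runs. Declaring the globals `const` is another common fix. Timing both functions on the real model would show whether this accounts for any of the reported slowdown.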