I have a (somewhat expensive to calculate) loss function `f(x)` for which I can compute exact gradients using `Zygote.jl`, and I wish to minimise it. So far I have been using the LBFGS implementation in `NLopt.jl` to do that, but there appear to be some problems with the loss function that cause NLopt to abort the optimisation in some cases and return the code `:FORCED_STOP` (this happens in roughly one third of cases). Because `NLopt.jl` calls into C code, it is very hard to debug what exactly is going wrong, so I wanted to switch to the pure-Julia implementation of the LBFGS method in `Optim.jl`. However, I found that it was about 3 times slower than the NLopt implementation:

```
using NLopt

opt = NLopt.Opt(:LD_LBFGS, nparams)
# loss_fun.for_nlopt(x, grad) writes the gradient at x to grad and returns f(x)
objective = loss_fun.for_nlopt
opt.min_objective = objective
opt.xtol_rel = 1e-8
opt.ftol_rel = 1e-8
opt.maxeval = 5000
@time ret_nlopt = NLopt.optimize(opt, x0)
@show opt.numevals
@show ret_nlopt[1];
@show ret_nlopt[3]
#= output
1.534183 seconds (2.96 M allocations: 975.899 MiB, 4.01% gc time)
opt.numevals = 493
ret_nlopt[1] = -16.68266898954586
ret_nlopt[3] = :FTOL_REACHED
=#
```
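(For context: `loss_fun.for_nlopt` is not shown above. A minimal sketch of how such a wrapper could be built on top of `Zygote.jl`, where `make_nlopt_objective` and `f` are hypothetical names rather than anything from my actual code, might look like this:)

```
using Zygote

# Hypothetical sketch, not the actual wrapper used above. NLopt expects
# an objective obj(x, grad) that fills `grad` in place whenever `grad`
# is non-empty and returns the objective value f(x).
function make_nlopt_objective(f)
    function objective(x, grad)
        if length(grad) > 0
            # One reverse pass yields both the value and the gradient.
            res = Zygote.withgradient(f, x)
            grad .= res.grad[1]
            return res.val
        else
            return f(x)
        end
    end
    return objective
end
```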

In `Optim.jl`, by contrast:

```
using Optim, LineSearches

options = Optim.Options(x_reltol=1e-8, f_reltol=1e-8, iterations=5000)
method = Optim.LBFGS(linesearch=LineSearches.BackTracking())
# loss_fun(x) returns f(x)
# loss_fun.g!(grad, x) writes the gradient at x into grad
# loss_fun.fg!(grad, x) writes the gradient at x into grad and returns f(x)
once_diff = Optim.OnceDifferentiable(loss_fun, loss_fun.g!, loss_fun.fg!, x0)
@time ret_optim = Optim.optimize(once_diff, x0, method, options)
@show ret_optim
#= output
3.373001 seconds (7.98 M allocations: 2.412 GiB, 3.59% gc time)
ret_optim =  * Status: success

 * Candidate solution
    Final objective value:     -1.677771e+01

 * Found with
    Algorithm:     L-BFGS

 * Convergence measures
    |x - x'|               = 3.12e-04 ≰ 0.0e+00
    |x - x'|/|x'|          = 7.54e-05 ≰ 1.0e-08
    |f(x) - f(x')|         = 1.55e-07 ≰ 0.0e+00
    |f(x) - f(x')|/|f(x')| = 9.21e-09 ≤ 1.0e-08
    |g(x)|                 = 3.24e-04 ≰ 1.0e-08

 * Work counters
    Seconds run:   3  (vs limit Inf)
    Iterations:    967
    f(x) calls:    998
    ∇f(x) calls:   968
=#
```
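(Similarly, `loss_fun.g!` and `loss_fun.fg!` are not shown above; sketches of how such wrappers could be implemented with `Zygote.jl`, again using hypothetical names, are:)

```
using Zygote

# Hypothetical sketches, not the actual definitions used above, assuming
# `f(x)` is the plain loss function in scope. Optim's OnceDifferentiable
# expects g!(grad, x) and fg!(grad, x); both write the gradient into
# `grad` in place, and fg! additionally returns f(x).
function g!(grad, x)
    grad .= Zygote.gradient(f, x)[1]
    return nothing
end

function fg!(grad, x)
    res = Zygote.withgradient(f, x)
    grad .= res.grad[1]
    return res.val
end
```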

Is there a way to configure LBFGS in `Optim.jl` with exactly the same settings as in `NLopt.jl`?

I have already looked online for a solution, but none of the tips I found helped.