I’m using `Optim.jl` to solve an unconstrained minimization problem.

In this particular problem the objective is a *black-box* function, and a single evaluation can take a long time. I don’t have access to gradient information, and although I have tried automatic differentiation, there are some parts of the code that the differentiator cannot handle, so it throws errors.

Nevertheless, I’m using the L-BFGS implementation from `Optim.jl` with a finite-difference approximation of the gradient, and it seems to work fine. Unfortunately, I cannot show the full code here because it is not mine to share, but I can at least show the snippet that performs the optimization call:

```julia
closure(x) = sqerror_model(x; phi=phi, rho=rho, model=model)  # The objective function, a black box
initial_x = randn(dim)
optim_res = Optim.optimize(
    closure, initial_x, Optim.LBFGS(; m=15),
    Optim.Options(iterations=200, show_trace=true),
)
```

With this snippet I’m trying to point out that I request 200 iterations and that I want to see the trace of each iteration. I also changed the `m` parameter of the method (the number of past updates L-BFGS keeps in memory) to get a little more precision.
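
Since I cannot share `sqerror_model`, here is a minimal self-contained sketch of the same call with a dummy quadratic standing in for the black box (the quadratic and `target` are placeholders, not my real code; `dim = 3001` matches my actual problem size). As far as I understand the docs, passing only the objective makes Optim fall back to finite differences anyway; the `OnceDifferentiable` wrapper below just makes that explicit:

```julia
using Optim

# Hypothetical stand-in for the proprietary black-box objective.
dim = 3001
target = randn(dim)
closure(x) = sum(abs2, x .- target)

initial_x = randn(dim)

# Make the finite-difference gradient explicit; to my understanding this is
# also what Optim uses when no gradient function is supplied.
od = OnceDifferentiable(closure, initial_x; autodiff = :finite)

optim_res = Optim.optimize(
    od, initial_x, Optim.LBFGS(; m=15),
    Optim.Options(iterations=200, show_trace=true),
)
```

This is only meant to show the call pattern; the behavior I describe below comes from my real objective.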

What I get in return is a trace that stops right after the zeroth iteration: the gradient norm is reported as zero, and the convergence measures are all either 0 or NaN.

Here, `dimension=3001` is the size of my parameter vector; this is the expected length of my minimizer.

I have some questions about this:

- What does it mean that the gradient norm equals zero?
- I specified 200 iterations in the optimization call; why do I only see one iteration, namely the zeroth?
- Why are all the convergence measures zero? And what does the NaN in these convergence measures mean?
- At the end, it says that only one call each to the function and to the gradient was made. Does this mean that only one iteration was performed?
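
For reference, this is how I have been reading these numbers off the result object, plus a manual central-difference check of one gradient component (the accessors are from the Optim.jl docs, as far as I can tell; `closure` and `initial_x` are as in the snippet above):

```julia
# Accessors on the result object, per the Optim.jl docs.
println(Optim.iterations(optim_res))   # iterations actually performed
println(Optim.f_calls(optim_res))      # number of objective evaluations
println(Optim.g_calls(optim_res))      # number of gradient evaluations
println(Optim.converged(optim_res))    # overall convergence flag
println(Optim.g_converged(optim_res))  # the gradient-norm criterion in particular

# Manual central-difference estimate of the first gradient component,
# to check whether the objective really is flat at initial_x.
h = 1e-6
e1 = zeros(length(initial_x)); e1[1] = 1.0
g1 = (closure(initial_x .+ h .* e1) - closure(initial_x .- h .* e1)) / (2h)
println(g1)
```

If that manual estimate also comes out as exactly zero, I suppose the objective really could be flat at my starting point, but I would like to understand how the solver decides this.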