Plot Optim trace/error convergence

Hi, all.

I’m using Optim.jl to solve an optimization problem and I want to plot its trace/error convergence. However, I’ve run into some issues and couldn’t find much information on this so far.

Just to make things clearer, I want something like this:

image

So my first problem is: I’m dealing with a box-constrained optimization problem. I’m calling the optimize function as follows

res = opt.optimize(od, lb, ub, p0, opt.Fminbox(inner_optimizer), Grad_options)

and it seems that the store_trace parameter is not available for this kind of optimization. Is there any way around this?


The other thing is, if I get rid of the box constraints I can do something along the lines of

import Optim
using Plots  # assumed plotting package; anything that provides plot() would do
const opt = Optim

res = opt.optimize((x-> sum(x.^2)), [100.0, 200.0], store_trace=true)
trace = opt.trace(res)

# Parse the function value and the elapsed time out of each printed trace entry
trace_err = Float64[]
trace_time = Float64[]
for i in 1:length(trace)
    fields = split(string(trace[i]))
    push!(trace_err, parse(Float64, fields[2]))
    push!(trace_time, parse(Float64, fields[end]))
end
plot(log10.(trace_time), log10.(trace_err/trace_err[end]))

and then I get back something close to what I want.

image

But it really feels like I’m hacking my way through this, and not in a good way. Is there any standard way of plotting the trace?
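
For comparison, here is a minimal sketch of reading the same numbers without going through string parsing. It assumes that each trace entry exposes a `value` field and a `metadata` dictionary with a "time" key (which is what the printed " * time:" lines suggest), and that `Optim.f_trace` returns the function values; please check these against the Optim version you have installed.

```julia
import Optim

# Same toy problem as above, with the trace stored on the result object.
res = Optim.optimize(x -> sum(abs2, x), [100.0, 200.0],
                     Optim.NelderMead(),
                     Optim.Options(store_trace = true))

tr = Optim.trace(res)

# Function value at every stored iteration.
trace_err = Optim.f_trace(res)

# Elapsed time per iteration; the "time" key is an assumption based on the
# printed trace and may not exist in every Optim version.
trace_time = [state.metadata["time"] for state in tr]
```

The two vectors can then go straight into whatever plotting call you prefer, for example the log10 plot from the snippet above.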

Have you tried with Optim.Options?

Oh, thanks @antoine-levitt, this totally went over my head!

I’m just left looking for a better way to plot these results.

Actually, now I’m not so sure I understand the trace’s output.

I’ve just got this back when printing the trace:

Iter     Function value   Gradient norm 
------   --------------   --------------
     0     2.260522e-01     3.247800e-01
 * time: 1.1920928955078125e-6
     1     8.766649e-02     2.633301e-01
 * time: 0.01271200180053711
     2     1.753852e-02     1.377736e-01
 * time: 0.025365114212036133
     3     3.277411e-03     1.703786e-02
 * time: 0.03435707092285156
     4     1.347663e-03     1.195144e-02
 * time: 0.04666304588317871
     5     1.280119e-03     1.866732e-04
 * time: 0.05624198913574219
     6     1.279841e-03     7.308482e-06
 * time: 0.0635230541229248
     7     1.279841e-03     4.129017e-06
 * time: 0.07249212265014648
     0     2.816947e-03     2.614618e-04
 * time: 9.5367431640625e-7
     1     2.816918e-03     6.837274e-06
 * time: 0.0038449764251708984
     2     2.816917e-03     6.098678e-06
 * time: 0.015560150146484375
     3     2.816917e-03     2.112449e-06
 * time: 0.02004694938659668
     4     2.816917e-03     2.664522e-06
 * time: 0.023893117904663086
     0     2.818454e-03     2.366669e-06
 * time: 1.9073486328125e-6
     1     2.818454e-03     6.654087e-07
 * time: 0.004698991775512695

Why does it go back to iteration 0 after iteration 7, and then back to 0 again after iteration 4 of the second run?

Can you show how you are using Optim.Options()?

Sure, here it goes:

Grad_options = opt.Options(x_tol=1e-6, f_tol=1e-12, iterations=10^4, store_trace=true)

It’s because you’re seeing the trace of the inner optimizer: Fminbox runs it several times, and each run restarts its iteration count at 0. I have never gotten around to it, but you have to be aware that what you’re seeing is not really the objective itself for box-constrained optimization; it includes a penalty term. Is your objective expensive to evaluate?
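
To make the "inner trace" point concrete, here is a small sketch that splits the printed trace into its separate inner runs by looking for the places where the iteration counter resets to 0. `split_runs` is a hypothetical helper written for this post, not part of Optim, and the numbers are copied from the trace printed above.

```julia
# Hypothetical helper (not part of Optim): return one index range per inner
# run, starting a new run each time the iteration counter resets to 0.
function split_runs(iterations::AbstractVector{<:Integer})
    runs = UnitRange{Int}[]
    start = 1
    for i in 2:length(iterations)
        if iterations[i] == 0
            push!(runs, start:i-1)
            start = i
        end
    end
    isempty(iterations) || push!(runs, start:length(iterations))
    return runs
end

# "Iter" and "Function value" columns copied from the printed trace.
iters = [0:7; 0:4; 0:1]
fvals = [2.260522e-01, 8.766649e-02, 1.753852e-02, 3.277411e-03,
         1.347663e-03, 1.280119e-03, 1.279841e-03, 1.279841e-03,
         2.816947e-03, 2.816918e-03, 2.816917e-03, 2.816917e-03,
         2.816917e-03, 2.818454e-03, 2.818454e-03]

runs = split_runs(iters)

# Last function value of each inner run.
final_per_run = [fvals[last(r)] for r in runs]
```

Note that the final value changes from one run to the next even though each run has converged, which fits with the traced quantity being the objective plus a penalty term rather than the objective alone.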

Yes, it can get a bit expensive.

But I just came across this tip using a callback function, maybe I could do something with that.

But then I guess I’d have to use some global variables to store the time and function evaluation results for each iteration, right?
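
On the global variables question, one option is to let the callback be a closure over local vectors. The sketch below is only an illustration on an unconstrained toy problem: it assumes that, with `store_trace` left off, the callback is called with a single state object that has a `value` field, and that returning `false` tells Optim to keep going. `run_with_history` and `record` are hypothetical names, and I have not checked how the callback interacts with Fminbox.

```julia
import Optim

# Hypothetical wrapper: runs the optimizer and returns the result together
# with the per-iteration function values and elapsed times.
function run_with_history(f, x0)
    values = Float64[]
    times = Float64[]
    t0 = time()

    # The callback closes over `values`, `times` and `t0`, so no globals are needed.
    function record(state)
        push!(values, state.value)   # assumes the state has a `value` field
        push!(times, time() - t0)
        return false                 # false means "do not stop"
    end

    options = Optim.Options(callback = record, iterations = 200)
    res = Optim.optimize(f, x0, Optim.NelderMead(), options)
    return res, values, times
end

res, values, times = run_with_history(x -> sum(abs2, x), [100.0, 200.0])
```

Inside `record` you are free to call your own objective on the current point instead of reading `state.value`, at the cost of one extra evaluation per iteration.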

For an example of using a callback for plotting:
TensorBoardLogger.jl has an example for Optim

https://philipvinc.github.io/TensorBoardLogger.jl/dev/examples/optim/

If you want to use TensorBoard (which is not a bad choice for this kind of thing),
you could adapt it directly.

Oh I hadn’t heard of TensorBoardLogger before and this might be what I’m looking for.
Thanks for pointing it out, I’ll give it a try! :slight_smile: