I'm using Optim.jl to solve an optimization problem, and I want to plot its trace/error convergence. However, I have some issues and couldn't find much info on this so far.
Just to make things clearer, I want something like this:
So my first problem is: I’m dealing with a box-constrained optimization problem. I’m calling the
optimize function as follows:

```julia
res = opt.optimize(od, lb, ub, p0, opt.Fminbox(inner_optimizer), Grad_options)
```
and it seems that the `store_trace` option is not available for this kind of optimization. Is there any way around this?
The other thing is, if I get rid of the box constraints I can do something along the lines of
```julia
const opt = Optim

res = opt.optimize(x -> sum(x.^2), [100.0, 200.0], store_trace=true)
trace = opt.trace(res)

trace_err = Float64[]
trace_time = Float64[]
for i in 1:length(trace)
    # field 2 of the printed trace line is the function value, the last field is the time
    push!(trace_err, parse(Float64, split(string(trace[i]))[2]))
    push!(trace_time, parse(Float64, split(string(trace[i]))[end]))
end
```
and then I get back something close to what I want.
But it really feels like I’m hacking my way through this, and not in a good way. Is there any standard way of plotting the trace?
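For what it's worth, the string parsing could probably be avoided: each trace entry is an `OptimizationState` with a `value` field and a `metadata` dict, and Optim also ships accessors such as `Optim.trace`. A sketch (the field names and the `"time"` metadata key are assumptions; check them against your Optim.jl version):

```julia
using Optim, Plots  # Plots assumed here only for the final convergence plot

res = optimize(x -> sum(x.^2), [100.0, 200.0], store_trace=true)

# Pull values straight out of the trace states instead of parsing strings
trace_err  = [state.value for state in Optim.trace(res)]
trace_time = [state.metadata["time"] for state in Optim.trace(res)]

plot(trace_time, trace_err, yscale=:log10,
     xlabel="time (s)", ylabel="objective value")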
Have you tried with
@antoine-levitt, this totally went over my head!
I’m just left looking for a better way to plot these results.
Actually, now I'm not so sure about the trace's output.
I’ve just got this back when printing the trace:
```
Iter Function value Gradient norm
------ -------------- --------------
0 2.260522e-01 3.247800e-01
* time: 1.1920928955078125e-6
1 8.766649e-02 2.633301e-01
* time: 0.01271200180053711
2 1.753852e-02 1.377736e-01
* time: 0.025365114212036133
3 3.277411e-03 1.703786e-02
* time: 0.03435707092285156
4 1.347663e-03 1.195144e-02
* time: 0.04666304588317871
5 1.280119e-03 1.866732e-04
* time: 0.05624198913574219
6 1.279841e-03 7.308482e-06
* time: 0.0635230541229248
7 1.279841e-03 4.129017e-06
* time: 0.07249212265014648
0 2.816947e-03 2.614618e-04
* time: 9.5367431640625e-7
1 2.816918e-03 6.837274e-06
* time: 0.0038449764251708984
2 2.816917e-03 6.098678e-06
* time: 0.015560150146484375
3 2.816917e-03 2.112449e-06
* time: 0.02004694938659668
4 2.816917e-03 2.664522e-06
* time: 0.023893117904663086
0 2.818454e-03 2.366669e-06
* time: 1.9073486328125e-6
1 2.818454e-03 6.654087e-07
* time: 0.004698991775512695
```
Why does it go back to iteration 0 after iteration 7, and then back to zero again after the second 4th iteration?
Can you show how you are setting up the options?
Sure, here it goes:
```julia
Grad_options = opt.Options(x_tol=1e-6, f_tol=1e-12, iterations=10^4, store_trace=true)
```

(Note: `1e-6` rather than `10^-6`, since raising an integer to a negative integer power throws a `DomainError` in Julia.)
It’s because you’re seeing the output of the inner trace. I have never gotten around to it, but you have to be aware that what you’re seeing is not really the objective itself for box constrained optimization, it includes a penalty term. Is your objective expensive to evaluate?
Yes, it can get a bit expensive.
But I just came across this tip about using a callback function; maybe I could do something with that.
## Dealing with constant parameters
In many applications, there may be factors that are relevant to the function evaluations,
but are fixed throughout the optimization. An obvious example is using data in a
likelihood function, but it could also be parameters we wish to hold constant.
Consider a squared error loss function that depends on some data `x` and `y`,
and parameters `betas`. As far as the solver is concerned, there should only be one
input argument to the function we want to minimize, call it `sqerror`.
The problem is that we want to optimize a function `sqerror` that really depends
on three inputs, and two of them are constant throughout the optimization procedure.
To do this, we first define the data `x` and `y`:

```julia
x = [1.0, 2.0, 3.0]
y = 1.0 .+ 2.0 .* x .+ [-0.3, 0.3, -0.1]
```
We then simply define a function of three variables:

```julia
function sqerror(betas, X, Y)
    err = 0.0
    for i in 1:length(X)
        pred_i = betas[1] + betas[2] * X[i]
        err += (Y[i] - pred_i)^2
    end
    return err
end
```
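To actually run the optimization, the usual pattern is to fix `x` and `y` with an anonymous function, so the solver sees only a single argument (starting point `[0.0, 0.0]` is just an illustration):

```julia
using Optim

# Closure: the solver minimizes over betas only; x and y are captured from scope
res = optimize(b -> sqerror(b, x, y), [0.0, 0.0])
```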
But then I guess I’d have to use some global variables to store the time and function evaluation results for each iteration, right?
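Not necessarily globals: wrapping everything in a function and closing over local arrays works too. A minimal sketch, assuming the callback receives the current `OptimizationState` (field names should be double-checked against the Optim.jl docs):

```julia
using Optim

function traced_optimize(f, x0)
    f_vals = Float64[]
    times  = Float64[]
    t0 = time()
    # Returning false tells Optim to keep iterating
    cb = state -> begin
        push!(f_vals, state.value)
        push!(times, time() - t0)
        false
    end
    res = optimize(f, x0, Optim.Options(store_trace=true, callback=cb))
    return res, f_vals, times
end

res, f_vals, times = traced_optimize(x -> sum(x.^2), [100.0, 200.0])
```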
For an example of using a callback for plotting:
TensorBoardLogger.jl has an example for Optim
If you wanted to use TensorBoard (not a bad choice for this kind of thing),
you could adapt that example directly.
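An untested sketch of what that adaptation might look like (the `log_value` call follows the TensorBoardLogger.jl README; the `state` field names and log directory are assumptions):

```julia
using Optim, TensorBoardLogger

lg = TBLogger("optim_logs")  # view with: tensorboard --logdir optim_logs

tb_cb = state -> begin
    log_value(lg, "objective", state.value; step=state.iteration)
    false  # keep iterating
end

optimize(x -> sum(x.^2), [100.0, 200.0],
         Optim.Options(store_trace=true, callback=tb_cb))
```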
Oh I hadn’t heard of TensorBoardLogger before and this might be what I’m looking for.
Thanks for pointing it out, I’ll give it a try!