How to correctly track optimization loss value with callback using particle swarm?

I’d like to track loss values in a particle swarm optimization with a callback, but I’m not sure where exactly the loss numbers being tracked in the callback are coming from.

Comparing the trace printed by the optimizer to the loss kept through the callback, the minimum callback-tracked loss does not match the minimum reported by the optimizer.

Here’s an adaptation of this example.

using ModelingToolkit, Optimization, OptimizationOptimJL

@variables x y
@parameters a b

loss = (a - x)^2 + b * (y - x^2)^2

loss_list = Float64[]
cb = function(x,loss)
    push!(loss_list,loss)  # store the loss value passed in by the optimizer
    false                  # returning false lets the solver continue
end

@named sys = OptimizationSystem(loss,[x,y],[a,b])

guess = [
    x=>rand(1e-3 : 1e-3 : 1e1)
    y=>rand(1e-3 : 1e-3 : 1e1)
]
p = [
    a => 6.0
    b => 7.0
]

lower = [  0.0,   0.0]
upper = [100.0, 100.0]

prob = OptimizationProblem(sys, guess, p, lb=lower, ub=upper)
# prob2 = OptimizationProblem(sys,u0,p,grad=true,hess=true)

swarm_population = 100
max_iters = 100
sol = solve(prob,ParticleSwarm(n_particles=swarm_population),maxiters=max_iters,callback=cb, show_trace=true)
# sol2 = solve(prob2,Newton(),show_trace=true,callback=cb)

(opt_loss_val, opt_loss_idx) = findmin(loss_list)

println("Min loss in callback: $(opt_loss_val), found in iteration $(opt_loss_idx)")

...
    98     5.241673e-05              NaN
 * time: 0.34200000762939453
 * x: [6.007239919639608, 36.086924970333264]
    99     2.757162e-05              NaN
 * time: 0.3430001735687256
 * x: [6.0046714866642885, 36.056985898084605]
   100     2.757162e-05              NaN
 * time: 0.3450000286102295
 * x: [6.0046714866642885, 36.056985898084605]
Min loss in callback: 5.236272445010206e-5, found in iteration 99

Hi Matthew, my guess is that the discrepancy between the objective (loss) value printed in the trace and the value passed to the callback (the one being appended to loss_list) comes from the specifics of the ParticleSwarm implementation, where the objective value reported at each iterate seems to be tied to the best_score of the swarm.

You can modify the callback so that it evaluates the loss directly at the current point, like this:

# Here I turned a and b into Float64 constants to simplify: 
a0 = 6.0; b0 = 7.0;
loss = (a0 - x)^2 + b0 * (y - x^2)^2
loss_func = eval(build_function(loss, [x,y]))

loss_list = Float64[]

# Ignore the value passed in by the optimizer and recompute the loss at x
function cb(x,args...)
    loss_value = loss_func(x)
    push!(loss_list,loss_value)
    return false
end
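
If you would rather avoid eval and build_function, the same idea can be sketched in plain Julia. This is only a sketch under a few assumptions: plain_loss, track_loss and loss_history are names I made up for illustration, a and b are hard-coded to the 6.0 and 7.0 from the question, and the first callback argument is assumed to be the current point as a vector, as in the callback above. The two calls at the end feed in the last two points printed in the trace, so the recomputed values should come out close to the 5.241673e-05 and 2.757162e-05 shown there.

```julia
# Plain-Julia copy of the symbolic loss with a = 6.0 and b = 7.0 as defaults.
# (plain_loss is a hypothetical helper name, not part of any package.)
plain_loss(u; a = 6.0, b = 7.0) = (a - u[1])^2 + b * (u[2] - u[1]^2)^2

loss_history = Float64[]

# Same shape as the callback above: ignore the value handed over by the
# optimizer and recompute the loss at the current point instead.
function track_loss(u, args...)
    push!(loss_history, plain_loss(u))
    return false   # returning false lets the solver continue
end

# Sanity check using the last two points printed in the trace.
track_loss([6.007239919639608, 36.086924970333264])
track_loss([6.0046714866642885, 36.056985898084605])

# Smallest recorded loss and its position in the history.
best_val, best_idx = findmin(loss_history)

# Best-so-far curve, which is often easier to compare against a trace.
running_best = accumulate(min, loss_history)
```

In the actual run you would pass callback=track_loss to solve instead of calling it by hand; the manual calls are only there to check the loss function against the printed trace.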