Negative loss when doing parameter estimation of ODEs

When doing parameter estimation of ODEs using Optimization.jl (with the Optim.jl LBFGS backend), I'm seeing negative losses returned by the optimizer.

I set up a simple example which reproduces it, and I can’t see anything strange I’m doing that might be causing this. Is this intended behavior? I cannot find anything about negative losses in the documentation.

A problem that I assume is related: after enough iterations the loss suddenly jumps up to a higher value and stays there. Example below:

[plot: example loss history, showing the jump to a higher value]

Any advice would be appreciated.

using Logging
using DifferentialEquations, Optimization, OptimizationOptimJL, SciMLSensitivity
using Plots

# This is a minimal example of a problem I'm having where, during parameter
# estimation, the optimizer returns negative loss values, which should be impossible.

function test_ode!(du, u, p, t)
    ϵ, μ = p
    S, E = u
    # dS/dt
    du[1] = -μ*S + ϵ*E
    # dE/dt
    du[2] = μ*S - ϵ*E
end
function estimate_parameters()
    # Data for comparison
    E_data = [836154, 776521, 769818, 743232, 728789]

    u0_initial = [3_100_000.0, 500_000.0]
    u0_bounds = [(3_000_000, 4_000_000), (100_000, 1_000_000)]
    params_initial = [0.1, 0.05]
    params_bounds = [(0, 1), (0, 1)]
    p_bounds = vcat(u0_bounds, params_bounds)
    tspan = (0.0, 70.0)
    prob = ODEProblem(test_ode!, u0_initial, tspan, params_initial)

    function loss_function(p, prob)
        u0_current = vcat(p[1:length(u0_initial)], 0)
        params_current = p[length(u0_initial)+1:end]
        _prob = remake(prob, u0 = u0_current, p = params_current)
        sol = solve(_prob)
        E_sol = sol([0, 12, 24, 36, 48])[2, :]
        return sum(abs2, E_sol .- E_data)
    end

    loss_history = Float64[]
    function callback(p, l)
        @show l
        push!(loss_history, l)
        display(plot(loss_history))
        return false
    end
    # 3. Set up optimization
    opt_f = OptimizationFunction(loss_function, Optimization.AutoForwardDiff())
    p_init = vcat(u0_initial, params_initial)
    opt_prob = OptimizationProblem(opt_f, p_init, prob; lb = first.(p_bounds), ub = last.(p_bounds))
    return solve(opt_prob, LBFGS(); callback = callback)
end
estimate_parameters()

Try @show p, sum(abs2, E_sol .- E_data) in the loss function and see if you can find the point that is odd, and then use that to isolate the problem.
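Something like this, for example (a sketch that just adds the @show lines to the loss function from your post):

function loss_function(p, prob)
    u0_current = vcat(p[1:length(u0_initial)], 0)
    params_current = p[length(u0_initial)+1:end]
    _prob = remake(prob, u0 = u0_current, p = params_current)
    sol = solve(_prob)
    E_sol = sol([0, 12, 24, 36, 48])[2, :]
    # print the candidate parameters and the raw loss on every objective evaluation
    @show p
    @show sum(abs2, E_sol .- E_data)
    return sum(abs2, E_sol .- E_data)
end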

Here's some of the output of showing p and sum(abs2, E_sol .- E_data) in the loss function. When a negative loss is detected in the callback, I also @show p and @show l; the callback with that check is sketched just below, and the output follows it.
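A rough sketch of the instrumented callback (it's just the callback from the original example with the if block added; the @warn is what produces the warning in the output):

loss_history = Float64[]
function callback(p, l)
    push!(loss_history, l)
    if l < 0
        # the loss is a sum of squares, so a negative value should be impossible
        @warn "Negative loss encountered!"
        @show p
        @show l
    end
    display(plot(loss_history))
    return false
end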

p = [3.4604209116536845e6, 830465.4297554487, 0.6664641536533832, 0.14220388808557272]
sum(abs2, E_sol .- E_data) = 1.5369843438188548e9
p = ForwardDiff.Dual{ForwardDiff.Tag{var"#loss_function#19"{Vector{Float64}, Vector{Int64}}, Float64}, Float64, 4}[Dual{ForwardDiff.Tag{var"#loss_function#19"{Vector{Float64}, Vector{Int64}}, Float64}}(3.4604209116536845e6,1.0,0.0,0.0,0.0), Dual{ForwardDiff.Tag{var"#loss_function#19"{Vector{Float64}, Vector{Int64}}, Float64}}(830465.4297554487,0.0,1.0,0.0,0.0), Dual{ForwardDiff.Tag{var"#loss_function#19"{Vector{Float64}, Vector{Int64}}, Float64}}(0.6664641536533831,0.0,0.0,1.0,0.0), Dual{ForwardDiff.Tag{var"#loss_function#19"{Vector{Float64}, Vector{Int64}}, Float64}}(0.14220388808557272,0.0,0.0,0.0,1.0)]
sum(abs2, E_sol .- E_data) = Dual{ForwardDiff.Tag{var"#loss_function#19"{Vector{Float64}, Vector{Int64}}, Float64}}(1.5430743229291186e9,-56.35917163754493,-11391.753197848067,4.808266626457977e8,-1.2197653749892578e9)
p = [3.4604209116536845e6, 830465.4297554487, 0.6664641536533831, 0.14220388808557272]
sum(abs2, E_sol .- E_data) = 1.536984373201887e9
p = ForwardDiff.Dual{ForwardDiff.Tag{var"#loss_function#19"{Vector{Float64}, Vector{Int64}}, Float64}, Float64, 4}[Dual{ForwardDiff.Tag{var"#loss_function#19"{Vector{Float64}, Vector{Int64}}, Float64}}(3.4604209116536845e6,1.0,0.0,0.0,0.0), Dual{ForwardDiff.Tag{var"#loss_function#19"{Vector{Float64}, Vector{Int64}}, Float64}}(830465.4297554487,0.0,1.0,0.0,0.0), Dual{ForwardDiff.Tag{var"#loss_function#19"{Vector{Float64}, Vector{Int64}}, Float64}}(0.6664641536533832,0.0,0.0,1.0,0.0), Dual{ForwardDiff.Tag{var"#loss_function#19"{Vector{Float64}, Vector{Int64}}, Float64}}(0.14220388808557272,0.0,0.0,0.0,1.0)]
sum(abs2, E_sol .- E_data) = Dual{ForwardDiff.Tag{var"#loss_function#19"{Vector{Float64}, Vector{Int64}}, Float64}}(1.5430743229312682e9,-56.35917163425438,-11391.753197842045,4.808266627301178e8,-1.2197653747655334e9)
p = [3.4604209116536845e6, 830465.4297554487, 0.6664641536533832, 0.14220388808557272]
sum(abs2, E_sol .- E_data) = 1.5369843438188548e9
┌ Warning: Negative loss encountered!
└ @ Main ~/Projects/Academics/DrugEpidemiologyProject/src/ParameterEstimation/paraesttest.jl:42
p = Optimization.OptimizationState{Vector{Float64}, Float64, Vector{Float64}, Nothing, OptimizationState{Float64, LBFGS{Optim.InverseDiagonal, LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Optim.var"#56#57"{Vector{Int64}, Vector{Int64}, Fminbox{LBFGS{Nothing, LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Returns{Nothing}}, Float64, Optim.var"#39#41"}, Optim.BarrierWrapper{OnceDifferentiable{Float64, Vector{Float64}, Vector{Float64}}, Optim.BoxBarrier{Vector{Int64}, Vector{Int64}}, Float64, Float64, Vector{Float64}}}}}, Nothing}(27, [3.4604209116536845e6, 830465.4297554487, 0.6664641536533832, 0.14220388808557272], -1.1195267115541757e11, [-807.0098955840394, -720.7887640640856, 4.009274384657078e9, -1.5040220472508835e10], nothing,     27    -1.119527e+11     1.504022e+10
 * Current step size: 0.0017956504787880272
 * time: 8.30139684677124
 * g(x): [-807.0098955840394, -720.7887640640856, 4.009274384657078e9, -1.5040220472508835e10]
 * x: [3.4604209116536845e6, 830465.4297554487, 0.6664641536533832, 0.14220388808557272]
, nothing)
l = -1.1195267115541757e11
p = ForwardDiff.Dual{ForwardDiff.Tag{var"#loss_function#19"{Vector{Float64}, Vector{Int64}}, Float64}, Float64, 4}[Dual{ForwardDiff.Tag{var"#loss_function#19"{Vector{Float64}, Vector{Int64}}, Float64}}(3.4673319406535e6,1.0,0.0,0.0,0.0), Dual{ForwardDiff.Tag{var"#loss_function#19"{Vector{Float64}, Vector{Int64}}, Float64}}(830125.9104395622,0.0,1.0,0.0,0.0), Dual{ForwardDiff.Tag{var"#loss_function#19"{Vector{Float64}, Vector{Int64}}, Float64}}(0.6232680306192389,0.0,0.0,1.0,0.0), Dual{ForwardDiff.Tag{var"#loss_function#19"{Vector{Float64}, Vector{Int64}}, Float64}}(0.13276837272742242,0.0,0.0,0.0,1.0)]
sum(abs2, E_sol .- E_data) = Dual{ForwardDiff.Tag{var"#loss_function#19"{Vector{Float64}, Vector{Int64}}, Float64}}(1.5436740138958683e9,145.3223823686585,-11914.167805943318,-8.751885839238205e8,3.828618506728241e9)
p = [3.4673319406535e6, 830125.9104395622, 0.6232680306192389, 0.13276837272742242]
sum(abs2, E_sol .- E_data) = 1.5379024280406404e9

So you can see that the loss and parameters shown from inside the loss function are fine, but the loss that's passed to the callback is the negative value. In the OptimizationState there are also a few negative values printed, e.g. (27, [3.4604209116536845e6, 830465.4297554487, 0.6664641536533832, 0.14220388808557272], -1.1195267115541757e11, [-807.0098955840394, -720.7887640640856, 4.009274384657078e9, -1.5040220472508835e10], nothing, 27 -1.119527e+11 1.504022e+10), but I'm not quite sure what those numbers actually are. I took a look at the source here on line 132, which I believe is passing Optimization.OptimizationState(u = θ, objective = x[1], p = cache.p, iter = iter_count[]) to the callback function, which is subsequently printed out here. I'm not seeing the relationship between that and the output, though.

My guess is that those negative values are a possible next step in the optimization that is out of bounds? I'm having trouble navigating the source code to verify this, though. I'll also add that solving the same problem without bounds doesn't give this negative loss.
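(By "without bounds" I just mean building the same OptimizationProblem without the lb/ub keywords, roughly like this; the opt_prob_unbounded name is only for illustration:)

# same setup as the example above, just without lb/ub
opt_prob_unbounded = OptimizationProblem(opt_f, p_init, prob)
solve(opt_prob_unbounded, LBFGS(); callback = callback)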

Looking at line 132 again in Optimization.jl/src/lbfgsb.jl is even more confusing.

_loss = function (θ)
            x = cache.f(θ, cache.p)
            iter_count[] += 1
            cons_tmp .= zero(eltype(θ))
            cache.f.cons(cons_tmp, θ)
            cons_tmp[eq_inds] .= cons_tmp[eq_inds] - cache.lcons[eq_inds]
            cons_tmp[ineq_inds] .= cons_tmp[ineq_inds] .- cache.ucons[ineq_inds]
            opt_state = Optimization.OptimizationState(
                u = θ, objective = x[1], p = cache.p, iter = iter_count[])
            if cache.callback(opt_state, x...)
                error("Optimization halted by callback.")
            end
            return x[1] + sum(@. λ * cons_tmp[eq_inds] + ρ / 2 * (cons_tmp[eq_inds] .^ 2)) + 1 / (2 * ρ) * sum((max.(Ref(0.0), μ .+ (ρ .* cons_tmp[ineq_inds]))) .^ 2)
        end

The objective that is returned to the callback should just be the output of my loss_function. I don’t see anywhere it could be modified to return a negative value.

Is this only an issue with LBFGSB?

It seems to be both LBFGSB and BFGSB. Using simulated annealing or CMA-ES doesn't give the negative loss, at least in this example.

I recommend using OptimizationNLopt's LBFGS or OptimizationOptimJL's LBFGS then.
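For example, something along these lines (an untested sketch, reusing opt_prob from the example above):

# NLopt's L-BFGS via the OptimizationNLopt wrapper; the bounds are taken from opt_prob
using OptimizationNLopt
sol = solve(opt_prob, NLopt.LD_LBFGS())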

Open an issue with this? It does seem related to domain handling.