Slow decrease of loss and an error when fitting an ODE to data

Hello,
I have been trying to fit one parameter of an ODE to experimental data. I used the following example to write the code.

The problem is that the MSE loss decreases very slowly even though the initial guess I have given is pretty good. The starting loss was 156.88371364301955, and even after 500 iterations the ending loss was only 156.49626077372287.
After 500 iterations the code ended with the following error, which I couldn’t understand.


┌ Warning: At t=5840.613462152258, dt was forced below floating point epsilon 9.094947017729282e-13, and step error estimate = NaN. Aborting. There is either an error in your model specification or the true solution is unstable (or the true solution can not be represented in the precision of ForwardDiff.Dual{ForwardDiff.Tag{DifferentiationInterface.FixTail{var"#15#16", Tuple{SciMLBase.NullParameters}}, Float64}, Float64, 1}).
└ @ SciMLBase C:\Users\Kalath_A\.julia\packages\SciMLBase\YTOjh\src\integrator_interface.jl:623     
ERROR: AssertionError: isfinite(phi_d) && isfinite(gphi)
Stacktrace:
  [1] bisect!(ϕdϕ::LineSearches.var"#ϕdϕ#6"{…}, alphas::Vector{…}, values::Vector{…}, slopes::Vector{…}, ia::Int64, ib::Int64, phi_lim::Float64, display::Int64)
    @ LineSearches C:\Users\Kalath_A\.julia\packages\LineSearches\jgnxK\src\hagerzhang.jl:510
  [2] (::LineSearches.HagerZhang{…})(ϕ::Function, ϕdϕ::LineSearches.var"#ϕdϕ#6"{…}, c::Float64, phi_0::Float64, dphi_0::Float64)
    @ LineSearches C:\Users\Kalath_A\.julia\packages\LineSearches\jgnxK\src\hagerzhang.jl:208       
  [3] HagerZhang
    @ C:\Users\Kalath_A\.julia\packages\LineSearches\jgnxK\src\hagerzhang.jl:102 [inlined]
  [4] perform_linesearch!(state::Optim.BFGSState{…}, method::Optim.BFGS{…}, d::Optim.ManifoldObjective{…})
    @ Optim C:\Users\Kalath_A\.julia\packages\Optim\HvjCd\src\utilities\perform_linesearch.jl:58    
  [5] update_state!(d::NLSolversBase.TwiceDifferentiable{…}, state::Optim.BFGSState{…}, method::Optim.BFGS{…})
    @ Optim C:\Users\Kalath_A\.julia\packages\Optim\HvjCd\src\multivariate\solvers\first_order\bfgs.jl:139
  [6] optimize(d::NLSolversBase.TwiceDifferentiable{…}, initial_x::Vector{…}, method::Optim.BFGS{…}, options::Optim.Options{…}, state::Optim.BFGSState{…})
    @ Optim C:\Users\Kalath_A\.julia\packages\Optim\HvjCd\src\multivariate\optimize\optimize.jl:54  
  [7] optimize(d::NLSolversBase.TwiceDifferentiable{…}, initial_x::Vector{…}, method::Optim.BFGS{…}, options::Optim.Options{…})
    @ Optim C:\Users\Kalath_A\.julia\packages\Optim\HvjCd\src\multivariate\optimize\optimize.jl:36  
  [8] __solve(cache::OptimizationCache{…})
    @ OptimizationOptimJL C:\Users\Kalath_A\.julia\packages\OptimizationOptimJL\e3bUa\src\OptimizationOptimJL.jl:218
  [9] solve!(cache::OptimizationCache{…})
    @ SciMLBase C:\Users\Kalath_A\.julia\packages\SciMLBase\YTOjh\src\solve.jl:187
 [10] solve(::OptimizationProblem{…}, ::Optim.BFGS{…}; kwargs::@Kwargs{…})
    @ SciMLBase C:\Users\Kalath_A\.julia\packages\SciMLBase\YTOjh\src\solve.jl:95
 [11] __solve(::OptimizationProblem{…}, ::PolyOpt; maxiters::Int64, kwargs::@Kwargs{…})
    @ OptimizationPolyalgorithms C:\Users\Kalath_A\.julia\packages\OptimizationPolyalgorithms\5lhpw\src\OptimizationPolyalgorithms.jl:34
 [12] __solve
    @ C:\Users\Kalath_A\.julia\packages\OptimizationPolyalgorithms\5lhpw\src\OptimizationPolyalgorithms.jl:11 [inlined]        
 [13] solve(::OptimizationProblem{…}, ::PolyOpt; kwargs::@Kwargs{…})
    @ SciMLBase C:\Users\Kalath_A\.julia\packages\SciMLBase\YTOjh\src\solve.jl:98
 [14] top-level scope
    @ e:\PhD Ashima\Neural ODE\Julia\IQgen_Tmixed\Curve fitting\ode_fit_julia.jl:103
Some type information was truncated. Use `show(err)` to see complete types.

The following is my code. extract_index is a function I defined to extract data from my raw experimental data; I haven't included it here to keep the post from getting too long.

using DifferentialEquations, Optimization, OptimizationPolyalgorithms, SciMLSensitivity
using ForwardDiff, Plots
using JLD2
using Statistics

#fitting conditions
Crate,Temp = "2C",0


# Loading data
data_file = load("unique_Datasets.jld2")["Datasets"]
data  = data_file["$(Crate)_T$(Temp)"] 

n1,n2 = extract_index(data["current"]) 
t,T,T∞ = data["time"][n1:n2],data["temperature"][n1:n2],data["temperature"][1]

# Plot
plot(t,T,label = "Temperature")

# Define the ODE
function ode_model!(du,u,p,t,T∞)
    T = u[1]
    C₁ = p[1]
    du[1] = -(T-T∞)/C₁
end

# Initial condition
u0 = [T[1]]

# Simulation interval
tspan = (t[1],t[end])

# Parameters guess
pguess = [1000]

# Set up ODE problem with guessed parameters
ode_model1!(du,u,p,t) = ode_model!(du,u,p,t,T∞)
prob = ODEProblem(ode_model1!,u0,tspan,pguess)

# Solve ode problem with guessed parameters
initial_sol = solve(prob,saveat = t)

# view guessed solution
plt = plot(initial_sol,label = ["T pred"])
scatter!(t,T,label = ["T"])

# Define cost function
function loss(newp)
    newprob = remake(prob,p = newp)
    sol = solve(newprob,saveat = t)
    loss = mean(abs2,sol.-T)
    return loss
end


# Callback function to monitor the optimization
function callback(state,l)
    display(l)
    newprob = remake(prob,p = state.u)
    sol = solve(newprob,saveat = t)
    plt = plot(sol,ylim = (240,360),label = ["T current pred"])
    scatter!(plt,t,T,label=["T actual"])
    display(plt)
    return false
end

# Optimize the parameters
adtype = AutoForwardDiff()
pguess = [1000.0]
optf = OptimizationFunction((x,_) -> loss(x),adtype)
optprob = OptimizationProblem(optf,pguess)
pfinal = solve(optprob,PolyOpt(),callback = callback,maxiters = 500)

It requires the data set to run. Can anyone please help me with this? This should be a simple cooling problem to solve, but somehow it takes a lot of time.

Is there any way you could create an MWE that’s runnable? My first suspicion would be that a sub-optimal ODE solver is being used, but that’s a bit difficult to test without an MWE.

Or just the solver tolerance is too high.
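
For example, you could explicitly pick a solver and tighten the tolerances inside the loss (Tsit5 and 1e-8 here are just placeholders, pick whatever suits your problem):

# Inside loss(newp): explicitly choose the solver and set tighter tolerances
sol = solve(newprob, Tsit5(), abstol = 1e-8, reltol = 1e-8, saveat = t)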

Thank you for all your replies. I was able to debug the code; the error was due to a dimension mismatch when calculating the loss. The loss should be loss = mean(abs2,sol[1,:].-T).
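
For reference, the corrected cost function (same prob, t and T as in my original code). As far as I can tell, sol .- T was broadcasting the 1×N solution array against the length-N data vector and producing an N×N matrix, whereas sol[1,:] extracts the first state as a plain vector:

# Corrected cost function: sol[1,:] is a Vector of the first (and only) state,
# so the subtraction is element-wise against the data vector T
function loss(newp)
    newprob = remake(prob, p = newp)
    sol = solve(newprob, saveat = t)
    return mean(abs2, sol[1,:] .- T)
end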

I am getting a good fit now, but during the initial iterations the loss still decreases slowly. In the final few iterations the loss drops drastically and gives the solution. Is this normal behaviour? This is how my loss goes:

8.823401982574351
8.822957812054376
8.822513653748809
8.822069507105471
8.821625372889988
8.821181250556224
8.82073714046958
8.820293042239316
8.819848956197449
8.819404882133725
8.819404882133725
8.819533712044315
0.9350706276822296
0.18154377871977903
0.17983946840351075
0.17983946632392875
retcode: Success

I tried reducing the tolerances. My new solver settings are:

Solver = Tsit5()
sol = solve(newprob,Solver,abstol = 1e-6,reltol = 1e-9,saveat = tdata)

Still, the loss decreases slowly:

8.83070912956643
8.83026475925042
8.82982040099848
8.829376054809883
8.828931720683078
8.828487398616993
8.828043088610425
8.827598790661725
8.827154504770306
8.826710230934976
8.826265969153756
8.825821719426086
8.825377481750598
8.824933256125755
8.824489042550749
8.82404484102433
8.823600651544869
8.823156474111531
8.82271230872314
8.82226815537842
8.821824014075686
8.821379884814268
8.820935767592546
8.820491662410072
8.820047569264565
8.820047569264565
8.820047580535817
0.9202558030635122
0.18140240144957576
0.17976875876525386
0.1797687582869255

The following is my MWE

using DifferentialEquations, Optimization, OptimizationPolyalgorithms, SciMLSensitivity
using ForwardDiff, Plots
using JLD2
using Statistics


# Define the ODE
function ode_model!(du,u,p,t,T∞)
    T = u[1]
    C₁ = p[1]
    du[1] = -(T-T∞)/C₁
end

# Generate data
tdata = 1650:1:8850
T∞ = 273.15
C₁_actual = 650.0
u0 = [315.15]
p_actual = [C₁_actual]
ode_actual(du,u,p,t) = ode_model!(du,u,p,t,T∞)
prob_actual = ODEProblem(ode_actual,u0,(tdata[1],tdata[end]),p_actual)
yactual = solve(prob_actual,saveat=tdata)

# Plotting actual values
plot(tdata,yactual[1,:],label="Actual data",xlabel="Time",ylabel="Temperature",title="Actual data")


# Parameter estimation
pguess = [1000.0]
ode_model1!(du,u,p,t) = ode_model!(du,u,p,t,T∞)
prob = ODEProblem(ode_model1!,u0,(tdata[1],tdata[end]),pguess)
initial_sol = solve(prob ,saveat = tdata)

# view guessed solution
plt = plot(tdata,initial_sol[1,:],label = ["T predicted"],color = :red)
plot!(tdata,yactual[1,:],label = ["T actual"],color = :black)

# Define cost function
function loss(newp)
    newprob = remake(prob,p = newp)
    sol = solve(newprob,saveat = tdata)
    loss = mean(abs2,sol[1,:].-yactual[1,:])
    return loss
end

function callback(state,l)
    display(l)
    newprob = remake(prob,p = state.u)
    sol = solve(newprob,saveat = tdata)
    plt = plot(sol[1,:],ylim = (240,360),label = ["T current prediction"],color = :red)
    plot!(plt,tdata,yactual[1,:],label=["T actual"],color = :black)
    display(plt)
    return false
end

adtype = AutoForwardDiff()
pguess = [1000.0]
optf = OptimizationFunction((x,_) -> loss(x),adtype)
optprob = OptimizationProblem(optf,pguess)
pfinal = solve(optprob,PolyOpt(),callback = callback,maxiters = 500)

I am getting a really good fit with this method. I just want to know more about this behaviour, so if you have any insights please do share. Thank you.

It’s possible that the optimizer hit a local minimum or a relatively flat spot on the loss surface, and then “broke out of it” to find a lower minimum. Since you’re only optimizing one parameter in this case, it’s pretty easy to visualize your loss surface: plot the loss function for a range of parameter values and you might be able to see whether there is a local minimum or a flat region.
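
For example, something along these lines (reusing the loss function and Plots from your MWE; the parameter range is just a guess):

# Evaluate the MWE loss over a grid of C₁ values and plot it on a log scale
C₁_range = 100.0:10.0:2000.0
losses = [loss([C₁]) for C₁ in C₁_range]
plot(C₁_range, losses, xlabel = "C₁", ylabel = "MSE loss", yscale = :log10, label = "loss")
vline!([C₁_actual], label = "true C₁")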