Issues in solving Parameter Estimation Optimization with SciMLSensitivity.jl

Hello everybody,

I have already asked a question on the same topic (ODEs parameter estimation) but unfortunately I am experiencing some early user issues which I cannot understand how to solve.

I am currently using the SciMLSensitivity.jl package to solve a parameter estimation problem with a specific cost function (Not the standard L2Loss).
I have two questions on this matter:

1) When running the following code

function SIDTHEfun(du, u, p, t)
    s, i, d, t, h, e = u 
    α, δ, γ, σ, τ = p
    λ = 0.1     # coefficient Lambda (fixed)
    du[1] = -s * i * α                         # dS/dt
    du[2] = s * i * α - (γ + λ) * i            # dI/dt
    du[3] = i * γ - d * (λ + δ)                # dD/dt
    du[4] = δ * d - (σ + τ) * t                # dT/dt
    du[5] = (i + d) * λ + t * σ                # dH/dt
    du[6] = t * τ                              # dE/dt

u0 = [S_data[1]; I_data[1]; D_data[1]; T_data[1]; H_data[1]; E_data[1]]
α0 = 0.25 
γ0 = 0.12
δ0 = 0.01
σ0 = 0.02
τ0 = 0.02
p0 = [α0, γ0, δ0, σ0, τ0]

# First simulation with initial condition parameters
tspan = (1.0, N_mhe)
tsteps = collect(range(1, stop = N_mhe, length = N_mhe))
prob = ODEProblem(SIDTHEfun, u0, tspan, p0)

function simulate(p)
    newprob = remake(prob, p = p)
    solsim = solve(newprob, Tsit5(), saveat = tsteps, abstol = 1e-8, reltol = 1e-8)
    return reduce(hcat,solsim.u)

ymeas = measure_mat[:, 1:N_mhe] #available measurement for each time window

function loss(p)
    sol = simulate(p)
    loss = abs2.( (sol .- ymeas) ./ std.(eachrow(ymeas)))
    return sum(loss)

adtype = Optimization.AutoForwardDiff()
optf = Optimization.OptimizationFunction((x, p) -> loss(x), adtype)
optprob = Optimization.OptimizationProblem(optf, p0) # Initial guess for parameter estimation
optires = Optimization.solve(optprob, PolyOpt(), maxiters = 5000)

I obtain a Instability detected. Aborting warning, after which a ERROR: DimensionMismatch: arrays could not be broadcast to a common size; got a dimension with lengths 5 and 21 appears.
I guess there is a mismatch somewhere, but after having checked, I cannot find the actual error. Both sol and ymeas variables in the cost function are 6x21 matrices, while std.(eachrow(ymeas)) it’s a 6x1 column vector.

2) I want to include (together with the parameters) as optimization variables the initial condition of the set of the equation. Do you know if there are any example on this topic? I cannot find anythig unfortunately (or maybe i have not searched well enough :pensive:) .

Thanks in advance for the help !!

Are the initial conditions actually stable? Can you share your validation code?

There are quite a few documentation examples which do that. there’s nothing special to do here: just remake u0 and p.

Hello Chris, thanks for your response.
I should have fixed the first point by loading the package

using OrdinaryDiffEq

instead of

using DifferentialEquations

However, I do not understand how this could affect the optimization.
Moreover, the optimized parameters resulting are way off the initial guess, and simulating the results with the new parameters with:

opti_odesol = solve(remake(prob, p = optires.u), Tsit5(), saveat = tsteps)

outputs a: Retcode: Unstable which make sense, cause the parameters are way off the expecter result (compared to guessed initial condition)

p0 = [0.25, 0.12, 0.01, 0.02, 0.02] #guessed parameters
optires.u = [1632.645478148677, -4.061757175874248, 18.43235880201721, 168.85367122262738, -2573.2484650399556]

In addition to this, I found out that if I use:

adtype = Optimization.AutoForwardDiff()

Then the solver Optimization.solve(...) works, while if I use:

adtype = Optimization.AutoZygote()

Then the solver gives me the same error as before (ERROR: DimensionMismatch: arrays could not be broadcast to a common size; got a dimension with lengths 5 and 21), which I do not understand why it does this.

About the second question, thanks for the help, I will try and let you know as soon I fix the previous issues.

That just means that during the optimization some intermediate set of parameters (and initial conditions?) are making your system unstable this page Handling Divergent and Unstable Trajectories · SciMLSensitivity.jl covers how to effectively handle this

I found out that avoiding the use of:

abstol = 1e-8, reltol = 1e-8

in the ODE solver improved the code, and I did not encounter any instability issues anymore.
Do you think this is plausible or am I making errors in some other parts of the code?

Are you using 32-bit floats?

No, I am using float64.
Should I use 32?

Sorry Chris, but from Optimization.jl and SciMLSensitivity.jl tutorials I cannot find anything which does what I need. All the examples I see are on the Rosenbrock function, can you point out an ODE optimization problem where the initial conditions are free optimization variables and equality constraints are applied to the states u ?

So sorry to bother you, but I am very new to the language and I’m struggling a bit :frowning: :sweat:

This docs page shows how to do equality constraints. There’s literally no difference when doing it on an ODE optimization. What did you try?

Hi Chris,

Thanks for the help. Rn I am trying to make my MHE code work. I want to impose a constraint on the states of my ODE system.
I will give a quick background on the model: I have six states

s, i, d, t, h, e = u

which in every point of the ODE solution must be under the constraint:

s + i + d + t + h + e == 1 # for every t step in time

My issue is that I don’t think I can set the constraint as:

cons(res, u, p) = (res .= [ u[1] + u[2] + u[3] + u[4] + u[5] + u[6] ])

Because u is not an optimization variable, thus it does not make sense to me. I tried to impose that:

cons(res, u, p) = (res .= [ sol[1,:] + sol[2,:] + sol[3,:] + sol[4,:] + sol[5,:] + sol[6,:] ])

where sol is the 6xN matrix of the ODE results, given a set of parameters and initial conditions (see the original code), and imposing in the optimization part:

cons = [ones(Int, 1, N)], ucons = [ones(Int, 1, N)]

where N is the number of the solution steps, but I receive errors and it is clearly not the way to do this.

If you want to give a look at the code I am currently working with, I link here my GitHub repository.

That’s not an optimization constraint. You have a DAE. Just impose that directly in the form of the equations?

Thanks for the help Chris. I checked out the system better and the diff. equations sum to zero, so I should not need to add this equation to ensure that the states must sum to 1.
I have many other questions on how to correctly impose the cost function for my parameter estimation code but I think I’ll create another post.
Thank you so much!!