Differential Equation Parameter Estimation Using Only A Subset of Variables

Hi,

I have a DAE problem with 37 variables, but I only have data for 12 of them.
I would like to run parameter estimation with an L2Loss function based only on the 12 variables for which I actually have data. The examples in the DifferentialEquations.jl tutorials are very clear, but they seem to assume that data is available for all variables (Lotka-Volterra simulates data for two variables, Lorenz for three), and the optimization takes the problem and solver as its input. So my question is:
Is there a direct way to specify the subset of the solution to optimize the parameters against?
My only thought has been to simulate data from the equations and set the weight to 0 in the cost function for all variables that don't have data, but maybe there is a better way?

Thanks,

A


If you use the save_idxs argument, the ODESolution only contains the indices you chose, and thus the loss is only calculated with those.
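For example, a minimal sketch assuming the measured states are indices 1:12 and data is a 12 × length(t) array (save_idxs is simply forwarded to solve, and Rodas5 here stands in for whatever mass-matrix-capable solver you use):

using DifferentialEquations, DiffEqParamEstim

# Only states 1:12 are saved, so L2Loss compares against exactly those rows
obj = build_loss_objective(prob, Rodas5(), L2Loss(t, data),
                           save_idxs = 1:12, maxiters = 10000)

# The same keyword works on solve directly:
sol = DifferentialEquations.solve(prob, Rodas5(), save_idxs = 1:12)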

Awesome - that's a great solution. Thanks for the idea/tip 🙂

Hi Chris -
Thanks for all the help.
My DAE system is non-linear, so I am assuming the best package to use for parameter estimation is NLopt. However, when I try to follow your tutorial I get an error about dt being set to NaN.

using Distributed
@everywhere using DifferentialEquations
using NLopt
using DiffEqParamEstim
using RecursiveArrayTools

function f(du,u,p,t)
  du[1] = p[1]*u[1] - u[1]*u[2]   # du must be mutated in place
  du[2] = -3*u[2] + u[1]*u[2]
end

u0 = [1.0;1.0]
tspan = (0.0,10.0)
p = [1.5]
prob = ODEProblem(f,u0,tspan,p)
sol =  DifferentialEquations.solve(prob,Tsit5())

t = collect(range(0,stop=10,length=200))
randomized = VectorOfArray([(sol(t[i]) + .01randn(2)) for i in 1:length(t)])
data = convert(Array,randomized)

obj = build_loss_objective(prob,Tsit5(),L2Loss(t,data),maxiters=10000)


opt = Opt(:GN_ESCH, 1)
min_objective!(opt, obj)
lower_bounds!(opt,[0.0])
upper_bounds!(opt,[5.0])
xtol_rel!(opt,1e-3)
maxeval!(opt, 100000)
(minf,minx,ret) = NLopt.optimize(opt,[1.3])

The repeated error is:

Warning: Automatic dt set the starting dt as NaN, causing instability.
└ @ OrdinaryDiffEq /opt/julia/packages/OrdinaryDiffEq/KqOpn/src/solve.jl:376
┌ Warning: Instability detected. Aborting
└ @ DiffEqBase /opt/julia/packages/DiffEqBase/3gIPB/src/integrator_interface.jl:162

and the result is

(0.037901689333110125, [1.3], :MAXEVAL_REACHED)

Any idea why this happens?
I also tried to use the JuMP wrapper, but apparently JuMP 0.19 is not compatible with NLopt.

thanks,

A

Bad parameters can cause a divergence, which gives an Inf loss, and the optimizer will keep going. So that repeated error isn’t harmful. Increase the number of evals to keep optimizing more.
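For example, bumping the evaluation budget in the NLopt setup above:

maxeval!(opt, 1_000_000)   # allow more evaluations before :MAXEVAL_REACHED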

Thanks, I’ll try that -
A

The other question, on the same general subject: if I have a DAE system with callbacks,
would something along these general lines work?

# Example callbacks
# Dosing events: Dose is assumed to be defined elsewhere
condition(u,t,integrator) = t in [21, 42, 63, 84]

function affect!(integrator)
    integrator.u[1] += Dose
    integrator.u[11] = (4*Dose+integrator.u[11]*(integrator.u[1]+integrator.u[2]+(integrator.u[3]+
                integrator.u[4]+integrator.u[5])*integrator.u[16]))/(Dose+(integrator.u[1]+
            integrator.u[2]+(integrator.u[3]+integrator.u[4]+integrator.u[5])*integrator.u[16]))
end



condition1(u,t,integrator) = u[19] < 8   # stop once u[19] drops below 8
affect1!(integrator) = terminate!(integrator)
  
cb = DiscreteCallback(condition,affect!)
cb1 = DiscreteCallback(condition1, affect1!)
cbset = CallbackSet(cb1,cb)



# Example model
# p is my parameter vector of 36 parameters
MM = zeros(21,21)
for i in 1:15
    MM[i,i] = 1.0   # zero rows of the mass matrix correspond to the algebraic equations
end
t = 0.0:1.0:100.0
tspan = (0.0, 100.0)
# PKPD4 is the model RHS (defined elsewhere); the mass matrix puts the system in DAE form
ODESystem = ODEFunction(PKPD4; mass_matrix=MM)
model_ode(p_) = ODEProblem(ODESystem, IC, tspan, p_)
solve_model(mp_) = DifferentialEquations.solve(model_ode(mp_), Rodas5(), saveat=1, callback=cbset, tstops=[21,42,63,84])
mock_data = Array(solve_model(p))


### Building the loss function (here is where I am most confused)
### Is this where the callbacks fit into the loss objective?
### I am having trouble figuring this out

loss_objective(mp_, dat) = build_loss_objective(model_ode(mp_), Rodas5(), L2Loss(t, dat), callback=cbset, tstops=[21,42,63,84], maxiters=100000)
juobj(args...) = loss_objective(args, mock_data)(args)



# JuMP model to get parameters
### What algorithm do you recommend for large non-linear problems? I am using :GN_ISRES, but maybe there are better options?

jumodel = Model()
JuMP.register(jumodel, :juobj, 36, juobj, autodiff=true)
@variables jumodel begin
    #p1 to 36
end
@NLobjective(jumodel, Min, juobj(CL_adc, CLD_adc, V1_adc, V2_adc, Kdec, e_adc, P_adc, D_adc,
    Kon_adc, Koff_adc, Agtot, Kint_ag, Kdeg, KDiff_drug_Tumor, Kout_drug,
    Kon_tub, Tub, Koff_tub, e_drug, P_drug, D_drug,
    CL_drug, CLD_drug, V1_drug, V2_drug, 
    Vmax, τ, Kgex, Kglin, ψ, KKillmax, KC50, R_cap,R_kro, c, RCell))
setsolver(jumodel, NLoptSolver(algorithm=:GN_ISRES))

From what you see in this sort of pseudo-code, does it make sense to you? I've been trying to test it, but it takes a long time to run, so I am honestly not sure if my code is failing or just slow.
I did not use the callbacks in my test, to keep things simpler on a first pass.
The code structure seems to make sense (at least to me), but I am concerned that I am not passing the mass matrix or callbacks properly into the loss function.

Thanks,

A

Yes, it makes sense. However, global optimizers are quite inefficient in how they search the space, so while it will give you good parameters, it's going to have to solve that DAE a few hundred thousand times.

I see-

Do you have any recommendations for doing this another way? My fear is that if I use a non-global optimizer to narrow the parameter space, all I'll achieve is finding a local minimum. I read in your tutorial that using the two-stage solution method before building the loss function can save time. Will this method still get me close to optimal parameters for a nonlinear system, or will it just stop at a local minimum?

Also, how do you pick the most appropriate algorithm for NLopt? I have gone through the NLopt Julia docs as well as NLopt proper, but I can't find recommendations on when to choose different algorithms (except for yours recommending :GN_ISRES as the most robust to initial conditions, but it also seems to be the slowest). Are there algorithms more appropriate for, say, stiff problems versus shallower ones?

Thanks so much!

A

That method is specific to ODEs though. A DAE version would be good to add.
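For an ODE it looks roughly like this (a sketch against the Lotka-Volterra prob, t, and data from earlier; two_stage_method builds a collocation-based objective that smooths the data, estimates derivatives, and fits the right-hand side against them, so no differential equation solves happen inside the optimization loop):

using DiffEqParamEstim, Optim

# Collocation objective: cheap to evaluate but only approximate,
# so refine the result with the usual build_loss_objective afterwards
two_stage_obj = two_stage_method(prob, t, data)
result = Optim.optimize(two_stage_obj, [1.3], Optim.BFGS())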

Global optimization is just a slow process. You can keep increasing the speed of your DAE, tighten the point where you declare divergence to drop bad parameters more quickly, put a smaller bounding box on the possible parameters, etc., to increase the speed. There's a ton you can do for a specific problem, but it does require work to find out what is best.

What do you mean by the point where you declare divergence? Would that be the tolerance?
How do you adjust something like that?

I've been trying to bound the parameters, but 36 is still a lot.

Thanks!

A

You might want to look into parallelizing this search somehow.
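For instance, a crude multistart sketch on top of the Distributed setup loaded earlier (run_opt is a hypothetical helper, and it assumes obj plus length-36 bound vectors lb and ub are defined on every worker):

using Distributed
@everywhere using NLopt

# Hypothetical helper: one independent NLopt run from a given start point
@everywhere function run_opt(x0, obj, lb, ub)
    opt = Opt(:LN_BOBYQA, length(x0))   # cheap local algorithm per start
    min_objective!(opt, obj)
    lower_bounds!(opt, lb)
    upper_bounds!(opt, ub)
    maxeval!(opt, 10_000)
    NLopt.optimize(opt, x0)
end

# Many independent searches in parallel; keep the best (minf, minx, ret)
starts = [lb .+ rand(length(lb)) .* (ub .- lb) for _ in 1:100]
results = pmap(x0 -> run_opt(x0, obj, lb, ub), starts)
best = results[argmin(first.(results))]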

The unstable_check argument lets you give a function that defines how divergence is checked. Normally it looks for Inf or NaN, in which case that warning is given and those parameters are rejected. But if you know your system shouldn't have concentrations higher than 20,000, for example, you can make a check function that looks for that and shave off some time steps.

So something like this?

function checkconc(u,integrator)
   if integrator.u[1] > 20000
       return true
   end
end

And then call this function with

unstable_check=checkconc

Inside the ODEProblem definition, or when we are actually solving the problem?

Thanks,

A

You have to use the documented form: http://docs.juliadiffeq.org/latest/basics/common_solver_opts.html#Miscellaneous-1
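That is, something like this sketch (the documented signature is unstable_check(dt, u, p, t), returning true when the state should be rejected, and it is passed as a keyword to solve, not to the problem constructor; model_ode and cbset are from your code above):

# Documented form: a function of (dt, u, p, t) returning a Bool
checkconc(dt, u, p, t) = any(isnan, u) || u[1] > 20000

sol = DifferentialEquations.solve(model_ode(p), Rodas5(),
                                  unstable_check = checkconc,
                                  callback = cbset, tstops = [21,42,63,84])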


Ok
Thanks for pointing me to the right place 🙂

A

Hi Chris,
I have this working, but I have noticed one last issue that I am not sure how to incorporate into my loss function.
I am using L2, which I think is pretty standard, to find the parameters that minimize the differences between my data and my DAE solution.
However, while u1 is on a scale of units and u16 on a scale of 1e-3, u19 is on a scale of 1e5. I assume this makes the optimizer preferentially fit the data for u19 to minimize the L2 loss, neglecting the fit of u1 or u16.
Do you know of a way to incorporate a scaling factor into the L2 that scales each u by its maximum, or a loss function that might be a better fit for these clearly different scales?

Thanks

ale

You can add weights to the L2, or define a more in-depth cost function which weights the terms by the maximum of each u.

So something like this?

# Weight each state by its maximum so the largest-scale states don't dominate the L2 loss
w = fill(0.0, 21)
for i in 1:21
    w[i] = maximum(mock_data[i,:])
end
w = sqrt.(1 ./ w)

loss_objective(mp_, dat) = build_loss_objective(model_ode(mp_), Rodas5(), L2Loss(t, dat; data_weight=w), maxiters=100000)
juobj(args...) = loss_objective(args, mock_data)(args)

Thanks,

A

Yes

One last question, I think.
My variables are not necessarily measured at the same times;
I may have u1 measured at days 3, 6, and 9 and u4 at days 5, 10, and 15, for example.
Is there a way to incorporate these heterogeneous measurement times into the L2Loss function?

Thanks,

A