DiffEqParamEstim.jl and multiple_shooting_objective

I am reading through the tutorial of DiffEqParamEstim.jl (Optimization-Based ODE Parameter Estimation · DiffEqParamEstim.jl), in the section on the multiple shooting method.
The code, put together, reads as follows:

using DifferentialEquations
using DiffEqParamEstim
using BlackBoxOptim

function ms_f(du,u,p,t)
    du[1] = dx = p[1]*u[1] - p[2]*u[1]*u[2]
    du[2] = dy = -3*u[2] + u[1]*u[2]
end
ms_u0 = [1.0;1.0]
tspan = (0.0,10.0)
ms_p = [1.5,1.0]
ms_prob = ODEProblem(ms_f,ms_u0,tspan,ms_p)
t = collect(range(0,stop=10,length=200))
sol = solve(ms_prob,Tsit5(),saveat=t,abstol=1e-12,reltol=1e-12)
data = Array(sol)
bound = Tuple{Float64, Float64}[(0, 10),(0, 10),(0, 10),(0, 10),
                                (0, 10),(0, 10),(0, 10),(0, 10),
                                (0, 10),(0, 10),(0, 10),(0, 10),
                                (0, 10),(0, 10),(0, 10),(0, 10),(0, 10),(0, 10)]
  
  
ms_obj = multiple_shooting_objective(ms_prob,Tsit5(),L2Loss(t,data);discontinuity_weight=1.0,abstol=1e-12,reltol=1e-12)

result = bboptimize(ms_obj;SearchRange = bound, MaxSteps = 3*1e4)
ms_p_estimated = result.archive_output.best_candidate[end-1:end]    # estimated parameters are the last two entries
ms_ic_estimated = result.archive_output.best_candidate[1:end-2]     # initial conditions of the shooting segments

And that works like a charm. However, I haven’t understood the meaning of bound, especially what it represents.
Why is it 18 tuples long?
When looking at the results, what are the values other than the last 2 (which are the parameters we are looking for)? The docs say they are the initial conditions at different steps, but why are there 16 of them? The data to interpolate has length 200, so where (in time) are those initial conditions used?

I found another topic on this [DifferentailEquations.jl error on multiple_shooting_objective], but, as often happens, the two links suggested in its solution are either broken or point to pages that are no longer useful (perhaps different from what they were at the time).

Thanks in advance

Those are the parameter bounds. Two of the entries are the real model parameters (the last two, as your code extracts them); the rest are there because of the shooting points, i.e. the initial conditions of each shooting segment. You then have to ignore those in the output.
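To make the count concrete: 18 = 8 shooting segments × 2 states + 2 model parameters. Here is a hedged sketch of how the candidate vector can be unpacked; the 8-segment split and the parameters-last layout are inferred from the code above, not stated in the docs:

```julia
# Hypothetical unpacking of the 18-element candidate vector returned
# by bboptimize above: 8 segment initial conditions × 2 states first,
# then the 2 Lotka-Volterra parameters at the end.
candidate = result.archive_output.best_candidate   # length 18

nstates, nsegments = 2, 8                          # (18 - 2) ÷ 2 = 8 segments
ics = reshape(candidate[1:nstates*nsegments], nstates, nsegments)
# ics[:, k] is the optimized initial condition of shooting segment k;
# the segments partition tspan = (0.0, 10.0), so segment k starts
# roughly at t = (k - 1) * 10 / nsegments.
p_est = candidate[end-1:end]                       # the two model parameters
```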

I think it’s just easier to upgrade to the new multiple shooting implementation of DiffEqFlux.jl:

https://diffeqflux.sciml.ai/dev/examples/multiple_shooting/

Thank you for the answer. It’s a pity that some packages are just left behind by others that supersede them. But ok.

I mean, you can still use DiffEqParamEstim, but as you say the interface here is a bit less intuitive and that’s not necessarily solvable given its interface.

I tried the example that you linked. However, I had to rely on the stable-release one, since the one on dev (in the link) doesn’t seem to work for me.

Anyhow, after taking a look at the examples for MultipleShooting in DiffEqFlux, I have the following three options in front of me:

  • Learn how to use DiffEqParamEstim and its interface
  • Learn how to use DiffEqFlux
  • Write the loss function that I want (basically multiple shooting) and optimize via something like GalacticOptim.jl

The last one seems the easiest. However, I wanted to ask for the Julia community’s opinion on the matter.

  • DiffEqParamEstim seems okay. However, I fear it will be abandoned at some point in favor of DiffEqFlux. Moreover, I wonder whether DiffEqFlux is far superior in performance/capabilities or not.
  • DiffEqFlux seems very modern; however, it adds so many layers of complexity it makes me dizzy. Do I really need a neural ODE to do a simple parameter estimation? What are those? How do I choose the other parameters, like that FastDense(2, 16, tanh),FastDense(16, 2)) gibberish? Do they affect the result? (I am not actually looking for answers to these questions now, just trying to make you understand what is on my mind.)
  • Just doing it by hand. Seems very easy: easy to write the loss function, easy to manipulate. Just write it, shoot at it with GalacticOptim.jl, and let it run. I could perhaps write this in a couple of hours. BUT what am I missing? People more expert than me have already written libraries for the task, so what I am doing must be stupid.

Anyhow, at present I am leaning more toward options 1 or 3. The DiffEqFlux.jl tutorial on MultipleShooting didn’t give me much hope.
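For what it’s worth, option 3 for a plain (single-shooting) fit really is only a few lines with GalacticOptim.jl. This is a sketch under my own assumptions (the names, the NelderMead choice, the noiseless data), not code from the thread:

```julia
using GalacticOptim, Optim, DifferentialEquations

# Illustrative hand-rolled parameter estimation (single shooting).
function lv!(du, u, p, t)
    du[1] = p[1]*u[1] - p[2]*u[1]*u[2]
    du[2] = -p[3]*u[2] + p[4]*u[1]*u[2]
end

ss_prob = ODEProblem(lv!, [1.0, 1.0], (0.0, 10.0), [1.5, 1.0, 3.0, 1.0])
ss_t    = 1.0:1.0:10.0
ss_data = Array(solve(ss_prob, Tsit5(), saveat = ss_t))

# Loss: solve the ODE with the candidate parameters p and compare
# against the data at the saved times.
ss_loss(p, _) = sum(abs2,
    Array(solve(remake(ss_prob, p = p), Tsit5(), saveat = ss_t)) .- ss_data)

optfun  = OptimizationFunction(ss_loss)
optprob = OptimizationProblem(optfun, [1.2, 0.8, 2.5, 0.8])  # initial guess
res     = solve(optprob, NelderMead())  # res.u holds the estimated parameters
```

Multiple shooting would then mean extending ss_loss with per-segment initial conditions and a continuity penalty, which is exactly what the packages automate.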

You can just estimate without a neural network in there…

Not much really. These days most of the issues are handled automatically through GalacticOptim. DiffEqFlux has become essentially documentation for parameter estimation with GalacticOptim, with a bunch of pre-built architectures for the SciML space.

That’s good advice!
I post my solution/result here in case someone needs it. It is just the Multiple Shooting tutorial of DiffEqFlux, but without the neural stuff.

using DifferentialEquations, Flux, Optim, DiffEqFlux, DiffEqSensitivity, Plots
using LinearAlgebra

function lotka_volterra!(du, u, p, t)
  x, y = u
  α, β, δ, γ = p
  du[1] = dx = α*x - β*x*y
  du[2] = dy = -δ*y + γ*x*y
end
# Initial condition
u0 = [1.0, 1.0]

# Simulation interval and intermediary points
tspan = (0.0, 10.0)
t_data = 1.0:1:10.0
datasize = length(t_data)
loss_multiplier_param = 100
grp_size_param = 1

# LV equation parameter. p = [α, β, δ, γ]
true_p = [1.5, 1.0, 3.0, 1.0]

# Setup the ODE problem, then solve
prob = ODEProblem(lotka_volterra!, u0, tspan, true_p)
sol = solve(prob, Tsit5(),saveat=t_data)
data = Array(sol) .+ 0.1 .* rand(2, datasize)  # per-point noise (a bare rand() would add the same constant everywhere)


function loss_subintervals(sub_data, sub_guess)
    # fit only the first column; the second column is the next time point,
    # which enters the continuity term
    return sum(abs2, sub_data[:,1] .- sub_guess)
end

function loss(p,null_p)
	return multiple_shoot(p, data, t_data, prob, loss_subintervals, Tsit5(), grp_size_param, loss_multiplier_param)
end

callback = function (p, l, pred, predictions; doplot = true)
    display(l)

    if doplot
        # plot the original data
        plt = scatter(t_data, data[1,1:size(pred,2)], label = "data 1",markercolor=1)
        scatter!(plt,t_data, data[2,1:size(pred,2)], label = "data 2",markercolor=2)
        # plot the approximant
        plot!(plt,t_data,pred[1,:],label="fit 1",linecolor=1)
        plot!(plt,t_data,pred[2,:],label="fit 2",linecolor=2)

        display(plot(plt))
    end
    return false
end

result_ode = DiffEqFlux.sciml_train(loss, true_p,ADAM(0.1),cb=callback,maxiters = 100)
# prob_opt = OptimizationProblem(loss,p_start,lb = [0.0,0.0,0.0,0.0],ub = [4.0,4.0,4.0,4.0])
# sol_opt = solve(prob_opt,NelderMead())

p_estimated = result_ode.u
prob_estimated = ODEProblem(lotka_volterra!, u0, tspan, p_estimated)
plt2 = plot(solve(prob, Tsit5()),label="true")
plot!(plt2,solve(prob_estimated, Tsit5()),label="estimated")
scatter!(plt2,t_data, data[1,:], label = "data 1",markercolor=1)
scatter!(plt2,t_data, data[2,:], label = "data 2",markercolor=2)

sol_estimated = solve(prob_estimated, Tsit5(),saveat=t_data)
err = (Array(sol_estimated) - data)
sum(norm(err[:, i])^2 for i in 1:length(t_data))  # total squared error over the data points

(Notice how the loss for the subintervals is defined as sum(abs2, sub_data[:,1] .- sub_guess): each chunk of data passed in also carries the next time point, which is used for the continuity term.)
