Creating a loss function for sciml_train with unequal time point distribution of the ODE components

Hello,

I am wondering how I can build an appropriate loss function for parameter estimation with DiffEqFlux.sciml_train if I have data points from the ODE at different time points for each component. Can someone maybe help me out? Also, I get an ArgumentError whenever I call sciml_train; my current version of DiffEqFlux is 1.39.0.

saveat the union of all measurement time points and then parse it out after the solve.

Thank you for your response! How exactly would you parse out the solution vector so as not to run into any conflict with AD?

sol[:,[1,4,5]] etc.
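
To make that concrete, here is a minimal sketch of the saveat-union idea on a toy two-component ODE (the model, the measurement times, and the variable names are made up purely for illustration):

using DifferentialEquations

# Toy two-component ODE, only to illustrate the indexing pattern
f!(du, u, p, t) = (du[1] = -p[1]*u[1]; du[2] = -p[2]*u[2])
prob_toy = ODEProblem(f!, [1.0, 2.0], (0.0, 3.0), [0.5, 0.3])

# Hypothetical measurement times, different for each component
t_comp1 = [0.0, 0.5, 1.0, 2.0]
t_comp2 = [0.0, 1.0, 3.0]
t_union = sort(union(t_comp1, t_comp2))   # save at every measurement time

sol = solve(prob_toy, Tsit5(), saveat = t_union)

# Positions of each component's measurement times inside the union grid
idx1 = findall(t -> t in t_comp1, t_union)
idx2 = findall(t -> t in t_comp2, t_union)

# Non-mutating indexing of the solution, one slice per component
pred1 = sol[1, idx1]
pred2 = sol[2, idx2]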

So what I am doing at the moment is more or less the same. I iterate through all components of sol, pick the indices in each component where I have measurements (tmp_sol = sol[i,[1,2,3,5,6,7,8,9]]), and stack all of the resulting tmp_sol vectors together with vcat into one long 1-D vector. This is then subtracted from the stacked vector of real measurements and the MSE is calculated. However, this didn’t really work in a previous version of DiffEqFlux. Now, with the upgraded version (1.39.0), I can’t even run the copy-pasted code from the documentation, because sciml_train produces the error ArgumentError: tuple must be non-empty, so I can’t really test it with sciml_train. The “old method” that does run for me uses Flux.train! (this is shown in the video in the Optimization of ODEs article in the DiffEqFlux documentation).

Can you share code?

using DifferentialEquations, DiffEqFlux, Plots

function lotka_volterra!(du, u, p, t)
  x, y = u
  α, β, δ, γ = p
  du[1] = dx = α*x - β*x*y
  du[2] = dy = -δ*y + γ*x*y
end

# Initial condition
u0 = [1.0, 1.0]

# Simulation interval and intermediary points
tspan = (0.0, 10.0)
tsteps = 0.0:0.1:10.0

# LV equation parameter. p = [α, β, δ, γ]
p = [1.5, 1.0, 3.0, 1.0]

# Setup the ODE problem, then solve
prob = ODEProblem(lotka_volterra!, u0, tspan, p)
sol = solve(prob, Tsit5())

println(sol[:,[1,4,5]])

# Plot the solution
using Plots
plot(sol)
savefig("LV_ode.png")

function loss(p)
  sol = solve(prob, Tsit5(), p=p, saveat = tsteps)
  loss = sum(abs2, sol.-1)
  return loss, sol
end

callback = function (p, l, pred)
  display(l)
  plt = plot(pred, ylim = (0, 6))
  display(plt)
  # Tell sciml_train to not halt the optimization. If return true, then
  # optimization stops.
  return false
end

result_ode = DiffEqFlux.sciml_train(loss, p)

That loss function looks fine but that’s not doing what you say in the title. What are you actually trying to use/do?

This is not the actual loss function from my problem, just a copy of the code in the documentation, which is not running for me. My actual loss function looks like this:

using Distributions  # provides Normal for the noise sampling below

function data_creator(sol, noise_data)
	combined_sol_vector = []
	GLIs_noise = Array{Float64}[]
	for i in 1:15
		if i<=3
			tmp_sol = sol[i,[1,2,3,5,6,7,8,9]]
			for j in 1:size(tmp_sol)[1]
				tmp_sol[j] = rand(Normal(tmp_sol[j], abs(tmp_sol[j]*noise_data)), 1)[1]
			end
			combined_sol_vector = vcat(combined_sol_vector, tmp_sol)
		elseif i >=7
			tmp_sol = sol[i,[1,2,3,5,6,7,8,9]]
			for j in 1:size(tmp_sol)[1]
				tmp_sol[j] = rand(Normal(tmp_sol[j], abs(tmp_sol[j]*noise_data)), 1)[1]
			end
			combined_sol_vector = vcat(combined_sol_vector, tmp_sol)
		else
			if i == 4
				tmp_sol = sol[i,[4,5,6]]
				for j in 1:size(tmp_sol)[1]
					tmp_sol[j] = rand(Normal(tmp_sol[j], abs(tmp_sol[j]*noise_data)), 1)[1]
				end
				push!(GLIs_noise, tmp_sol)
			elseif i == 5
				tmp_sol = sol[i,[4,5,6]]
				for j in 1:size(tmp_sol)[1]
					tmp_sol[j] = rand(Normal(tmp_sol[j], abs(tmp_sol[j]*noise_data)), 1)[1]
				end
				push!(GLIs_noise, tmp_sol)
			elseif i == 6
				tmp_sol = sol[i,[4,5,6]]
				for j in 1:size(tmp_sol)[1]
					tmp_sol[j] = rand(Normal(tmp_sol[j], abs(tmp_sol[j]*noise_data)), 1)[1]
				end
				push!(GLIs_noise, tmp_sol)
				combined_sol_vector = vcat(combined_sol_vector, (GLIs_noise[1]+GLIs_noise[2])./GLIs_noise[3])
			end
		end
	end


	for i in 1:15
		if i<=3
			tmp_sol = sol[i,[4,6]]
			for j in 1:size(tmp_sol)[1]
				tmp_sol[j] = rand(Normal(tmp_sol[j], abs(tmp_sol[j]*noise_data)), 1)[1]
			end
			combined_sol_vector = vcat(combined_sol_vector, tmp_sol)
		elseif i >=7
			tmp_sol = sol[i,[4,6]]
			for j in 1:size(tmp_sol)[1]
				tmp_sol[j] = rand(Normal(tmp_sol[j], abs(tmp_sol[j]*noise_data)), 1)[1]
			end
			combined_sol_vector = vcat(combined_sol_vector, tmp_sol)
		end
	end

	for i in 16:30
		if i <=18
			tmp_sol = sol[i,[1,2,3,5,6,7,8,9]]
			for j in 1:size(tmp_sol)[1]
				tmp_sol[j] = rand(Normal(tmp_sol[j], abs(tmp_sol[j]*noise_data)), 1)[1]
			end
			combined_sol_vector = vcat(combined_sol_vector, tmp_sol)
		elseif i >=22
			tmp_sol = sol[i,[1,2,3,5,6,7,8,9]]
			for j in 1:size(tmp_sol)[1]
				tmp_sol[j] = rand(Normal(tmp_sol[j], abs(tmp_sol[j]*noise_data)), 1)[1]
			end
			combined_sol_vector = vcat(combined_sol_vector, tmp_sol)
		end
	end

	for i in 31:45
		if i <=33
			tmp_sol = sol[i,[4,6]]
			for j in 1:size(tmp_sol)[1]
				tmp_sol[j] = rand(Normal(tmp_sol[j], abs(tmp_sol[j]*noise_data)), 1)[1]
			end
			combined_sol_vector = vcat(combined_sol_vector, tmp_sol)
		elseif i >=37
			tmp_sol = sol[i,[4,6]]
			for j in 1:size(tmp_sol)[1]
				tmp_sol[j] = rand(Normal(tmp_sol[j], abs(tmp_sol[j]*noise_data)), 1)[1]
			end
			combined_sol_vector = vcat(combined_sol_vector, tmp_sol)
		end
	end
	return combined_sol_vector
end


function loss(p)
  sol = solve(prob, Tsit5(), p = p, saveat = tsteps)
  sol_flatten = data_creator(sol, noise_data)
  loss = sum(abs2, sol_flatten .- real_measurements)
  return loss, sol
end

I’m confused. That will obviously hit a mutation error. What is the code you are actually running that reproduces your error? Please point to the line that errors and include the full stack trace.

I am sorry if I have not been clear in my explanations. In the end I have two problems.

The first problem is that the example code from the documentation is not running; sciml_train gives me the error ArgumentError: tuple must be non-empty
first(::Tuple{}) at tuple.jl:95
_unapply(::Nothing, ::Tuple{}) at lib.jl:163
_unapply(::Tuple{Nothing}, ::Tuple{}) at lib.jl:167
_unapply(::Tuple{Tuple{Nothing}}, ::Tuple{}) at lib.jl:167
_unapply(::Tuple{NTuple{6,Nothing},Tuple{Nothing}}, ::Tuple{Nothing,Nothing,Nothing,Array{Float64,1},Array{Float64,1},Nothing}) at lib.jl:168
unapply(::Tuple{NTuple{6,Nothing},Tuple{Nothing}}, ::Tuple{Nothing,Nothing,Nothing,Array{Float64,1},Array{Float64,1},Nothing}) at lib.jl:177
(::Zygote.var"#188#189"{Zygote.var"#kw_zpullback#40"{DiffEqSensitivity.var"#adjoint_sensitivity_backpass#177"{Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}},Tsit5,InterpolatingAdjoint{0,true,Val{:central},Bool,Bool},Array{Float64,1},Array{Float64,1},Tuple{},NamedTuple{(),Tuple{}},Colon}},Tuple{NTuple{6,Nothing},Tuple{Nothing}}})(::VectorOfArray{Float64,2,Array{Array{Float64,1},1}}) at lib.jl:195
(::Zygote.var"#1709#back#190"{Zygote.var"#188#189"{Zygote.var"#kw_zpullback#40"{DiffEqSensitivity.var"#adjoint_sensitivity_backpass#177"{Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}},Tsit5,InterpolatingAdjoint{0,true,Val{:central},Bool,Bool},Array{Float64,1},Array{Float64,1},Tuple{},NamedTuple{(),Tuple{}},Colon}},Tuple{NTuple{6,Nothing},Tuple{Nothing}}}})(::VectorOfArray{Float64,2,Array{Array{Float64,1},1}}) at adjoint.jl:67
#solve#57 at solve.jl:70 [inlined]
(::typeof(∂(#solve#57)))(::VectorOfArray{Float64,2,Array{Array{Float64,1},1}}) at interface2.jl:0
(::Zygote.var"#188#189"{typeof(∂(#solve#57)),Tuple{NTuple{6,Nothing},Tuple{Nothing}}})(::VectorOfArray{Float64,2,Array{Array{Float64,1},1}}) at lib.jl:194
(::Zygote.var"#1709#back#190"{Zygote.var"#188#189"{typeof(∂(#solve#57)),Tuple{NTuple{6,Nothing},Tuple{Nothing}}}})(::VectorOfArray{Float64,2,Array{Array{Float64,1},1}}) at adjoint.jl:67
solve at solve.jl:68 [inlined]
(::typeof(∂(solve##kw)))(::VectorOfArray{Float64,2,Array{Array{Float64,1},1}}) at interface2.jl:0
loss at Test_AD.jl:32 [inlined]
(::typeof(∂(loss)))(::Tuple{Float64,Nothing}) at interface2.jl:0
#73 at train.jl:86 [inlined]
#188 at lib.jl:194 [inlined]
#1709#back at adjoint.jl:67 [inlined]
OptimizationFunction at basic_problems.jl:107 [inlined]
#188 at lib.jl:194 [inlined]
(::Zygote.var"#1709#back#190"{Zygote.var"#188#189"{typeof(∂(λ)),Tuple{Tuple{Nothing,Nothing},Int64}}})(::Tuple{Float64,Nothing}) at adjoint.jl:67
#8 at solve.jl:91 [inlined]
(::typeof(∂(λ)))(::Float64) at interface2.jl:0
(::Zygote.var"#69#70"{Params,Zygote.Context,typeof(∂(λ))})(::Float64) at interface.jl:255
gradient(::Function, ::Params) at interface.jl:59
__solve(::OptimizationProblem{true,OptimizationFunction{true,GalacticOptim.AutoForwardDiff,DiffEqFlux.var"#73#78"{typeof(loss)},GalacticOptim.var"#118#134"{GalacticOptim.var"#115#131"{OptimizationFunction{true,GalacticOptim.AutoForwardDiff{nothing},DiffEqFlux.var"#73#78"{typeof(loss)},Nothing,Nothing,Nothing,Nothing,Nothing,Nothing},Nothing},GalacticOptim.var"#116#132"{Array{Float64,1},Int64,GalacticOptim.var"#115#131"{OptimizationFunction{true,GalacticOptim.AutoForwardDiff{nothing},DiffEqFlux.var"#73#78"{typeof(loss)},Nothing,Nothing,Nothing,Nothing,Nothing,Nothing},Nothing}}},GalacticOptim.var"#122#138"{GalacticOptim.var"#115#131"{OptimizationFunction{true,GalacticOptim.AutoForwardDiff{nothing},DiffEqFlux.var"#73#78"{typeof(loss)},Nothing,Nothing,Nothing,Nothing,Nothing,Nothing},Nothing},GalacticOptim.var"#120#136"{Array{Float64,1},Int64,GalacticOptim.var"#115#131"{OptimizationFunction{true,GalacticOptim.AutoForwardDiff{nothing},DiffEqFlux.var"#73#78"{typeof(loss)},Nothing,Nothing,Nothing,Nothing,Nothing,Nothing},Nothing}}},GalacticOptim.var"#124#140",Nothing,Nothing,Nothing},Array{Float64,1},SciMLBase.NullParameters,Nothing,Nothing,Nothing,Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}}, ::ADAM, ::Base.Iterators.Cycle{Tuple{GalacticOptim.NullData}}; maxiters::Int64, cb::Function, progress::Bool, save_best::Bool, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at solve.jl:90
__solve at solve.jl:66 [inlined]
__solve at solve.jl:66 [inlined]
#solve#468 at solve.jl:3 [inlined]
(::CommonSolve.var"#solve##kw")(::NamedTuple{(:maxiters,),Tuple{Int64}}, ::typeof(solve), ::OptimizationProblem{true,OptimizationFunction{true,GalacticOptim.AutoForwardDiff,DiffEqFlux.var"#73#78"{typeof(loss)},GalacticOptim.var"#118#134"{GalacticOptim.var"#115#131"{OptimizationFunction{true,GalacticOptim.AutoForwardDiff{nothing},DiffEqFlux.var"#73#78"{typeof(loss)},Nothing,Nothing,Nothing,Nothing,Nothing,Nothing},Nothing},GalacticOptim.var"#116#132"{Array{Float64,1},Int64,GalacticOptim.var"#115#131"{OptimizationFunction{true,GalacticOptim.AutoForwardDiff{nothing},DiffEqFlux.var"#73#
(The first code I shared produces this error for me.)

The second problem is: once I get this code running, how can I define an appropriate loss function for my parameter estimation problem if I have measurements of the different ODE components at different time points? At the moment my function data_creator tries to flatten my ODE solution in order to compare it to the flattened measurements. You mentioned a mutation error; how do I need to rewrite the loss function in order to avoid it?

I am sorry for asking these probably basic questions, but I am relatively new to all of this and currently learning a lot.
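
For reference, one mutation-free pattern is to build the prediction vector purely by indexing and vcat, and to add noise to the measurements once, outside the loss, rather than writing into arrays inside it. This is only a sketch: the component/time indices and the name real_measurements are placeholders, and prob and tsteps are assumed from the earlier snippets.

function loss(p)
  sol = solve(prob, Tsit5(), p = p, saveat = tsteps)
  # Build the prediction by indexing + vcat instead of mutating temporary arrays
  pred = vcat(sol[1, [1, 2, 3, 5]],   # component 1 at its measurement indices
              sol[2, [4, 6]],         # component 2 at its measurement indices
              sol[3, [2, 7]])         # component 3 at its measurement indices
  l = sum(abs2, pred .- real_measurements)
  return l, sol
end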

What code are you running that is erroring?

I managed to find the solution; it indeed worked as you described initially! So I have an additional question regarding sciml_train, just to be sure that I understand the function correctly: in the end this is an optimizer that computes the gradient with respect to the parameters or initial conditions of the ODE using the sensitivity method of choice. This gradient is then used in a user-specified gradient-based optimizer to find the optimal parameters/initial conditions of a loss function which compares the ODE output to some data. Is this correct?
Is there a way to not only return the minimizer of the loss function, but also the current gradient/sensitivity?

Not right now. Easiest thing to do is just re-evaluate, but this is something that could be added.
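
For example, a minimal sketch of that re-evaluation (assuming the loss and result_ode from the earlier snippets, and Zygote as the AD backend):

using Zygote

# Recompute the gradient of the scalar loss at the returned minimizer.
# loss returns (value, solution), so take only the first element.
p_opt = result_ode.minimizer
grad = Zygote.gradient(p -> first(loss(p)), p_opt)[1]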