Creating a loss function for sciml_train with unequal time point distribution of the ODE components

Hello,

I am wondering how I can build an appropriate loss function for parameter estimation with DiffEqFlux.sciml_train if I have data points from the ODE at different time points for each component. Can someone maybe help me out? Also, I get an ArgumentError whenever I call sciml_train; my current version of DiffEqFlux is 1.39.0.

saveat the union of all measurement time points and then parse it out after the solve.

Thank you for your response! How exactly would you parse out the solution vector so as not to run into any conflict with AD?

sol[:,[1,4,5]] etc.
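
To make that concrete, here is a minimal sketch of the saveat-union idea on a toy two-component ODE (the model, the measurement times, and the variable names are made up purely for illustration):

using DifferentialEquations

# Toy two-component ODE, only to illustrate the indexing pattern
f!(du, u, p, t) = (du[1] = -p[1]*u[1]; du[2] = -p[2]*u[2])
prob_toy = ODEProblem(f!, [1.0, 2.0], (0.0, 3.0), [0.5, 0.3])

# Hypothetical measurement times, different for each component
t_comp1 = [0.0, 0.5, 1.0, 2.0]
t_comp2 = [0.0, 1.0, 3.0]
t_union = sort(union(t_comp1, t_comp2))   # save at every measurement time

sol = solve(prob_toy, Tsit5(), saveat = t_union)

# Positions of each component's measurement times inside the union grid
idx1 = findall(t -> t in t_comp1, t_union)
idx2 = findall(t -> t in t_comp2, t_union)

# Non-mutating indexing of the solution, one slice per component
pred1 = sol[1, idx1]
pred2 = sol[2, idx2]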

So what I am doing at the moment is more or less the same. I iterate through all components of sol, pick the indices in each component where I have measurements (tmp_sol = sol[i,[1,2,3,5,6,7,8,9]]), and stack all of the resulting tmp_sol vectors together with vcat into one long 1-D vector. This is then subtracted from the stacked vector of real measurements and the MSE is calculated. However, this didn’t really work in a previous version of DiffEqFlux. Now, with the upgraded version (1.39.0), I can’t even run the copy-pasted code from the documentation, because sciml_train produces the error ArgumentError: tuple must be non-empty, so I can’t really test it with sciml_train. The “old method” that does run for me uses Flux.train! (this is shown in the video in the Optimization of ODEs article in the DiffEqFlux documentation).

Can you share code?

using DifferentialEquations, DiffEqFlux, Plots

function lotka_volterra!(du, u, p, t)
  x, y = u
  α, β, δ, γ = p
  du[1] = dx = α*x - β*x*y
  du[2] = dy = -δ*y + γ*x*y
end

# Initial condition
u0 = [1.0, 1.0]

# Simulation interval and intermediary points
tspan = (0.0, 10.0)
tsteps = 0.0:0.1:10.0

# LV equation parameter. p = [α, β, δ, γ]
p = [1.5, 1.0, 3.0, 1.0]

# Setup the ODE problem, then solve
prob = ODEProblem(lotka_volterra!, u0, tspan, p)
sol = solve(prob, Tsit5())

println(sol[:,[1,4,5]])

# Plot the solution
using Plots
plot(sol)
savefig("LV_ode.png")

function loss(p)
  sol = solve(prob, Tsit5(), p=p, saveat = tsteps)
  loss = sum(abs2, sol.-1)
  return loss, sol
end

callback = function (p, l, pred)
  display(l)
  plt = plot(pred, ylim = (0, 6))
  display(plt)
  # Tell sciml_train to not halt the optimization. If return true, then
  # optimization stops.
  return false
end

result_ode = DiffEqFlux.sciml_train(loss, p)

That loss function looks fine but that’s not doing what you say in the title. What are you actually trying to use/do?

This is not the actual loss function from my problem, just a copy of the code in the documentation, which is not running for me. My actual loss function looks like this:

using Distributions  # provides Normal for the noise sampling below

function data_creator(sol, noise_data)
	combined_sol_vector = []
	GLIs_noise = Array{Float64}[]
	for i in 1:15
		if i<=3
			tmp_sol = sol[i,[1,2,3,5,6,7,8,9]]
			for j in 1:size(tmp_sol)[1]
				tmp_sol[j] = rand(Normal(tmp_sol[j], abs(tmp_sol[j]*noise_data)), 1)[1]
			end
			combined_sol_vector = vcat(combined_sol_vector, tmp_sol)
		elseif i >=7
			tmp_sol = sol[i,[1,2,3,5,6,7,8,9]]
			for j in 1:size(tmp_sol)[1]
				tmp_sol[j] = rand(Normal(tmp_sol[j], abs(tmp_sol[j]*noise_data)), 1)[1]
			end
			combined_sol_vector = vcat(combined_sol_vector, tmp_sol)
		else
			if i == 4
				tmp_sol = sol[i,[4,5,6]]
				for j in 1:size(tmp_sol)[1]
					tmp_sol[j] = rand(Normal(tmp_sol[j], abs(tmp_sol[j]*noise_data)), 1)[1]
				end
				push!(GLIs_noise, tmp_sol)
			elseif i == 5
				tmp_sol = sol[i,[4,5,6]]
				for j in 1:size(tmp_sol)[1]
					tmp_sol[j] = rand(Normal(tmp_sol[j], abs(tmp_sol[j]*noise_data)), 1)[1]
				end
				push!(GLIs_noise, tmp_sol)
			elseif i == 6
				tmp_sol = sol[i,[4,5,6]]
				for j in 1:size(tmp_sol)[1]
					tmp_sol[j] = rand(Normal(tmp_sol[j], abs(tmp_sol[j]*noise_data)), 1)[1]
				end
				push!(GLIs_noise, tmp_sol)
				combined_sol_vector = vcat(combined_sol_vector, (GLIs_noise[1]+GLIs_noise[2])./GLIs_noise[3])
			end
		end
	end


	for i in 1:15
		if i<=3
			tmp_sol = sol[i,[4,6]]
			for j in 1:size(tmp_sol)[1]
				tmp_sol[j] = rand(Normal(tmp_sol[j], abs(tmp_sol[j]*noise_data)), 1)[1]
			end
			combined_sol_vector = vcat(combined_sol_vector, tmp_sol)
		elseif i >=7
			tmp_sol = sol[i,[4,6]]
			for j in 1:size(tmp_sol)[1]
				tmp_sol[j] = rand(Normal(tmp_sol[j], abs(tmp_sol[j]*noise_data)), 1)[1]
			end
			combined_sol_vector = vcat(combined_sol_vector, tmp_sol)
		end
	end

	for i in 16:30
		if i <=18
			tmp_sol = sol[i,[1,2,3,5,6,7,8,9]]
			for j in 1:size(tmp_sol)[1]
				tmp_sol[j] = rand(Normal(tmp_sol[j], abs(tmp_sol[j]*noise_data)), 1)[1]
			end
			combined_sol_vector = vcat(combined_sol_vector, tmp_sol)
		elseif i >=22
			tmp_sol = sol[i,[1,2,3,5,6,7,8,9]]
			for j in 1:size(tmp_sol)[1]
				tmp_sol[j] = rand(Normal(tmp_sol[j], abs(tmp_sol[j]*noise_data)), 1)[1]
			end
			combined_sol_vector = vcat(combined_sol_vector, tmp_sol)
		end
	end

	for i in 31:45
		if i <=33
			tmp_sol = sol[i,[4,6]]
			for j in 1:size(tmp_sol)[1]
				tmp_sol[j] = rand(Normal(tmp_sol[j], abs(tmp_sol[j]*noise_data)), 1)[1]
			end
			combined_sol_vector = vcat(combined_sol_vector, tmp_sol)
		elseif i >=37
			tmp_sol = sol[i,[4,6]]
			for j in 1:size(tmp_sol)[1]
				tmp_sol[j] = rand(Normal(tmp_sol[j], abs(tmp_sol[j]*noise_data)), 1)[1]
			end
			combined_sol_vector = vcat(combined_sol_vector, tmp_sol)
		end
	end
	return combined_sol_vector
end


function loss(p)
  sol = solve(prob, Tsit5(), p = p, saveat = tsteps)
  sol_flatten = data_creator(sol, noise_data)
  loss = sum(abs2, sol_flatten .- real_measurements)
  return loss, sol
end

I’m confused. That will obviously hit a mutation error. What is the code you are actually running that reproduces your error? Please point to the line that errors and include the full stack trace.

I am sorry if I have not been clear in my explanations. In the end I have two problems.

The first problem is that the example code from the documentation is not running; sciml_train gives me the error ArgumentError: tuple must be non-empty
first(::Tuple{}) at tuple.jl:95
_unapply(::Nothing, ::Tuple{}) at lib.jl:163
_unapply(::Tuple{Nothing}, ::Tuple{}) at lib.jl:167
_unapply(::Tuple{Tuple{Nothing}}, ::Tuple{}) at lib.jl:167
_unapply(::Tuple{NTuple{6,Nothing},Tuple{Nothing}}, ::Tuple{Nothing,Nothing,Nothing,Array{Float64,1},Array{Float64,1},Nothing}) at lib.jl:168
unapply(::Tuple{NTuple{6,Nothing},Tuple{Nothing}}, ::Tuple{Nothing,Nothing,Nothing,Array{Float64,1},Array{Float64,1},Nothing}) at lib.jl:177
(::Zygote.var"#188#189"{Zygote.var"#kw_zpullback#40"{DiffEqSensitivity.var"#adjoint_sensitivity_backpass#177"{Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}},Tsit5,InterpolatingAdjoint{0,true,Val{:central},Bool,Bool},Array{Float64,1},Array{Float64,1},Tuple{},NamedTuple{(),Tuple{}},Colon}},Tuple{NTuple{6,Nothing},Tuple{Nothing}}})(::VectorOfArray{Float64,2,Array{Array{Float64,1},1}}) at lib.jl:195
(::Zygote.var"#1709#back#190"{Zygote.var"#188#189"{Zygote.var"#kw_zpullback#40"{DiffEqSensitivity.var"#adjoint_sensitivity_backpass#177"{Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}},Tsit5,InterpolatingAdjoint{0,true,Val{:central},Bool,Bool},Array{Float64,1},Array{Float64,1},Tuple{},NamedTuple{(),Tuple{}},Colon}},Tuple{NTuple{6,Nothing},Tuple{Nothing}}}})(::VectorOfArray{Float64,2,Array{Array{Float64,1},1}}) at adjoint.jl:67
#solve#57 at solve.jl:70 [inlined]
(::typeof(∂(#solve#57)))(::VectorOfArray{Float64,2,Array{Array{Float64,1},1}}) at interface2.jl:0
(::Zygote.var"#188#189"{typeof(∂(#solve#57)),Tuple{NTuple{6,Nothing},Tuple{Nothing}}})(::VectorOfArray{Float64,2,Array{Array{Float64,1},1}}) at lib.jl:194
(::Zygote.var"#1709#back#190"{Zygote.var"#188#189"{typeof(∂(#solve#57)),Tuple{NTuple{6,Nothing},Tuple{Nothing}}}})(::VectorOfArray{Float64,2,Array{Array{Float64,1},1}}) at adjoint.jl:67
solve at solve.jl:68 [inlined]
(::typeof(∂(solve##kw)))(::VectorOfArray{Float64,2,Array{Array{Float64,1},1}}) at interface2.jl:0
loss at Test_AD.jl:32 [inlined]
(::typeof(∂(loss)))(::Tuple{Float64,Nothing}) at interface2.jl:0
#73 at train.jl:86 [inlined]
#188 at lib.jl:194 [inlined]
#1709#back at adjoint.jl:67 [inlined]
OptimizationFunction at basic_problems.jl:107 [inlined]
#188 at lib.jl:194 [inlined]
(::Zygote.var"#1709#back#190"{Zygote.var"#188#189"{typeof(∂(λ)),Tuple{Tuple{Nothing,Nothing},Int64}}})(::Tuple{Float64,Nothing}) at adjoint.jl:67
#8 at solve.jl:91 [inlined]
(::typeof(∂(λ)))(::Float64) at interface2.jl:0
(::Zygote.var"#69#70"{Params,Zygote.Context,typeof(∂(λ))})(::Float64) at interface.jl:255
gradient(::Function, ::Params) at interface.jl:59
__solve(::OptimizationProblem{true,OptimizationFunction{true,GalacticOptim.AutoForwardDiff,DiffEqFlux.var"#73#78"{typeof(loss)},GalacticOptim.var"#118#134"{GalacticOptim.var"#115#131"{OptimizationFunction{true,GalacticOptim.AutoForwardDiff{nothing},DiffEqFlux.var"#73#78"{typeof(loss)},Nothing,Nothing,Nothing,Nothing,Nothing,Nothing},Nothing},GalacticOptim.var"#116#132"{Array{Float64,1},Int64,GalacticOptim.var"#115#131"{OptimizationFunction{true,GalacticOptim.AutoForwardDiff{nothing},DiffEqFlux.var"#73#78"{typeof(loss)},Nothing,Nothing,Nothing,Nothing,Nothing,Nothing},Nothing}}},GalacticOptim.var"#122#138"{GalacticOptim.var"#115#131"{OptimizationFunction{true,GalacticOptim.AutoForwardDiff{nothing},DiffEqFlux.var"#73#78"{typeof(loss)},Nothing,Nothing,Nothing,Nothing,Nothing,Nothing},Nothing},GalacticOptim.var"#120#136"{Array{Float64,1},Int64,GalacticOptim.var"#115#131"{OptimizationFunction{true,GalacticOptim.AutoForwardDiff{nothing},DiffEqFlux.var"#73#78"{typeof(loss)},Nothing,Nothing,Nothing,Nothing,Nothing,Nothing},Nothing}}},GalacticOptim.var"#124#140",Nothing,Nothing,Nothing},Array{Float64,1},SciMLBase.NullParameters,Nothing,Nothing,Nothing,Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}}, ::ADAM, ::Base.Iterators.Cycle{Tuple{GalacticOptim.NullData}}; maxiters::Int64, cb::Function, progress::Bool, save_best::Bool, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at solve.jl:90
__solve at solve.jl:66 [inlined]
__solve at solve.jl:66 [inlined]
#solve#468 at solve.jl:3 [inlined]
(::CommonSolve.var"#solve##kw")(::NamedTuple{(:maxiters,),Tuple{Int64}}, ::typeof(solve), ::OptimizationProblem{true,OptimizationFunction{true,GalacticOptim.AutoForwardDiff,DiffEqFlux.var"#73#78"{typeof(loss)},GalacticOptim.var"#118#134"{GalacticOptim.var"#115#131"{OptimizationFunction{true,GalacticOptim.AutoForwardDiff{nothing},DiffEqFlux.var"#73#78"{typeof(loss)},Nothing,Nothing,Nothing,Nothing,Nothing,Nothing},Nothing},GalacticOptim.var"#116#132"{Array{Float64,1},Int64,GalacticOptim.var"#115#131"{OptimizationFunction{true,GalacticOptim.AutoForwardDiff{nothing},DiffEqFlux.var"#73#
(The first code I shared produces this error for me.)

The second problem is: once I get this code running, how can I define an appropriate loss function for my parameter estimation problem if I have measurements of the different ODE components at different time points? At the moment my function data_creator tries to flatten my ODE solution in order to compare it to the flattened measurements. You mentioned a mutation error; how do I need to rewrite the loss function in order to avoid it?

I am sorry for asking these probably basic questions, but I am relatively new to all of this and currently learning a lot.
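
For reference, one mutation-free pattern is to build the prediction vector purely by indexing and vcat, and to add noise to the measurements once, outside the loss, rather than writing into arrays inside it. This is only a sketch: the component/time indices and the name real_measurements are placeholders, and prob and tsteps are assumed from the earlier snippets.

function loss(p)
  sol = solve(prob, Tsit5(), p = p, saveat = tsteps)
  # Build the prediction by indexing + vcat instead of mutating temporary arrays
  pred = vcat(sol[1, [1, 2, 3, 5]],   # component 1 at its measurement indices
              sol[2, [4, 6]],         # component 2 at its measurement indices
              sol[3, [2, 7]])         # component 3 at its measurement indices
  l = sum(abs2, pred .- real_measurements)
  return l, sol
end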

What code are you running that is erroring?

I managed to find the solution; it indeed worked as you described initially! So I have an additional question regarding sciml_train, just to be sure that I understand the function correctly: in the end this is an optimizer that computes the gradient with respect to the parameters or initial conditions of the ODE using the sensitivity method of choice. This gradient is then used in a user-specified gradient-based optimizer to find the optimal parameters/initial conditions of a loss function which compares the ODE output to some data. Is this correct?
Is there a way to not only return the minimizer of the loss function, but also the current gradient/sensitivity?

Not right now. Easiest thing to do is just re-evaluate, but this is something that could be added.
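
For example, a minimal sketch of that re-evaluation (assuming the loss and result_ode from the earlier snippets, and Zygote as the AD backend):

using Zygote

# Recompute the gradient of the scalar loss at the returned minimizer.
# loss returns (value, solution), so take only the first element.
p_opt = result_ode.minimizer
grad = Zygote.gradient(p -> first(loss(p)), p_opt)[1]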