Fitting a dynamic system with an exogenous input (nonhomogenous neural ode) via DiffEqFlux

Hello,

I´m trying to fit a dynamic system (a hammerstein system, implemented below) with an exogenous input via DiffEqFlux, but actually I don´t know how to put the input excitation into the n-ode-problem. I first tried to follow the example:
https://diffeqflux.sciml.ai/dev/examples/neural_ode_sciml/

Example 3: Solving Nonhomogeneous Equations using Parameterized Functions,

but I wasn´t able to transfer it to my application. I posted below my attempt. Maybe someone could tell me what I´m doing wrong?

# intialization
using DifferentialEquations, Flux, Optim, DiffEqFlux, DiffEqSensitivity, Plots

# excitation of the system
no_samples = 100
sample_period = 0.1
tsteps = collect(sample_period:sample_period:(no_samples-1)*sample_period)
tspan = (Float32(tsteps[1]), Float32(tsteps[end]))
ex = t->sin(t)
u = ex.(tsteps)

# system
f(x) = (atan(8.0 * x - 4.0) + atan(4.0)) / (2.0 * atan(4.0))
function hammerstein_system(u::Array{Float64})
    y= zeros(size(u))
    for k in 2:length(u)
        y[k] = 0.2 * f(u[k-1]) + 0.8 * y[k-1]
    end
    return y
end

# system output
y = hammerstein_system(ex.(tsteps))

# model setup
nn_model = FastChain(FastDense(1,50, sigmoid), FastDense(50, 1))

#intial parameter of the model
p_model = initial_params(nn_model)
 
function predict_neuralode(p)
  Array(prob_neuralode(u0, p))
end

function dudt(du, u, p, t)
    du[1] = nn_model(ex(t))[1]
end

prob_neuralode = NeuralODE(dudt, tspan, Tsit5(), saveat = tsteps)

prob_neuralode(0.0, p_model)

Optimal control is essentially learning an input excitation, so https://diffeqflux.sciml.ai/dev/examples/optimal_control/ is an example you might want to look at. Basically, you just pass the function, not much else to it.

Thanks. I will try it, but in my point of view the main difference is, that I would like to excitate the system with an excitation signal and in the example the signal is the optimization target. So it is generated by the optimization in contrast to that I would like to specify the signal externally and excitate the system and n-ode with the prepared signal. The reason is that exciatation signals like chirp, aprbs or multisine are useful to gather as much as possible information of the system.

Okay, I thought, that I would evaluate ex = t->sin(t) at the specific time point and so pass the value of it to the nn. I haven’t quite got behind DiffEqFlux’s user interface yet

I think, I have misunderstood your reply, sorry.

Did you just simply mean this

prob_neuralode = NeuralODE(dudt, tspan, Tsit5(), saveat = tsteps, ex)

with

it still not works, but I think, it evaluates a little bit further. Now I´m getting the following error message.

ERROR: MethodError: Cannot convert an object of type var"#17#18" to an object of type Array{Array{Float32,1},1}
Closest candidates are:

Do you maybe know the reason of this error message?

Could I get what the code looks like? NeuralODE(dudt, tspan, Tsit5(), saveat = tsteps, ex) doesn’t make sense because keyword arguments have to go before the keyword arguments.

using DifferentialEquations, Flux, Optim, DiffEqFlux, DiffEqSensitivity, Plots

# excitation of the system
no_samples = 100

datasize = 100
tspan = (0.0f0, 9.9f0)
tsteps = range(tspan[1], tspan[2], length = datasize)

ex = t->sin(t)
u = ex.(tsteps)

f(x) = (atan(8.0 * x - 4.0) + atan(4.0)) / (2.0 * atan(4.0))
function hammerstein_system(u)
    y= zeros(size(u))
    for k in 2:length(u)
        y[k] = 0.2 * f(u[k-1]) + 0.8 * y[k-1]
    end
    return y
end

y = Float32.(hammerstein_system(ex.(tsteps)))

nn_model = FastChain(FastDense(1,50, sigmoid), FastDense(50, 1))

p_model = initial_params(nn_model)

u0 = Float32.([0.0])

function predict_neuralode(p)
  Array(prob_neuralode(u0, p))
end

function dudt(du, u, p, t)
    du[1] = nn_model(ex(t))[1]
end

prob_neuralode = NeuralODE(dudt, tspan, Tsit5(), ex, saveat = tsteps)

prob_neuralode(u0, p_model)

The following defines the UDE that you want:

using DifferentialEquations, Flux, Optim, DiffEqFlux, DiffEqSensitivity, Plots

# excitation of the system
no_samples = 100

datasize = 100
tspan = (0.0f0, 9.9f0)
tsteps = range(tspan[1], tspan[2], length = datasize)

ex = t->sin(t)
u = ex.(tsteps)

f(x) = (atan(8.0 * x - 4.0) + atan(4.0)) / (2.0 * atan(4.0))
function hammerstein_system(u)
    y= zeros(size(u))
    for k in 2:length(u)
        y[k] = 0.2 * f(u[k-1]) + 0.8 * y[k-1]
    end
    return y
end

y = Float32.(hammerstein_system(ex.(tsteps)))

nn_model = FastChain(FastDense(1,50, sigmoid), FastDense(50, 1))

p_model = initial_params(nn_model)

u0 = Float32.([0.0])

function dudt(du, u, p, t)
    du[1] = nn_model(ex(t),p)[1]
end

prob = ODEProblem(dudt,u0,tspan,nothing)

function predict_neuralode(p)
  _prob = remake(prob,p=p)
  solve(_prob,Tsit5())
end

predict_neuralode(p_model)
2 Likes

Thank you very much. You helped me a lot

I have another question. I created the loss function

#loss(p) = sum(abs2.(y .- [predict_neuralode(p_model).u[i][1] for i in 1:no_samples]))
function loss(p)
    sol = predict_neuralode(p)
    @show (length(sol))
    return sum(abs2.(y[2:end] .- [u[1] for u in sol.u]))
end

res0 = DiffEqFlux.sciml_train(loss,p_model,ADAM(0.005),maxiters=300)

and somehow when I evaluate the loss in the main scope sol contains 100 elements and inside

res0 = DiffEqFlux.sciml_train(loss,p_model,ADAM(0.005),maxiters=300)

just 99 elements.
Because of this I changed y to y[2:end] in the loss function, so that I can train, but I don´t know why sol has different size.

Additional info:
I increased the datasize to 1000 and the tspan to (0.0, 99.9) and evaluated everything again the problem disappeared, but when I reused 100, it appeared again.

Are you using saveat? What is it like? It might be a single floating point error.

yes

function predict_neuralode(p)
  _prob = remake(prob,p=p)
  solve(_prob,Tsit5(), saveat=0.1)
end

and somehow the training isn´t converging. Even the parameters are the same after training.
I checked it like this
grafik

Do you have a tip for me?

Make sure the derivative is non-zero and that you’re using the changed p. It’s hard to diagnose without code.

As for the saveat issue, I’d like to see exactly how you’re hitting it. My guess is that it’s one of these 0.1+0.1+0.1 != 0.3 issues that needs specific floating point handling somewhere.

I thought, this is done by the loss(p) and using

DiffEqFlux.sciml_train(loss,p_model,ADAM(0.005),maxiters=300)

for the training

You mean the derivative, which I would like to fit?
ude_example

I think, it is ensured by the excitation of the system with sin(t). If you see the shape of y in the above diagramm, then I would expect \dot{y} would look like similiar then this. (diff(y, dims=1))

ude_example_diff

Or do you mean something else?

Complete Code:

using DifferentialEquations, Flux, Optim, DiffEqFlux, DiffEqSensitivity, Plots
plotly()
# excitation of the system
datasize = 100
tspan = (0.0f0, 9.9f0)
tsteps = range(tspan[1], tspan[2], length = datasize)

ex = t->sin(t)
u = ex.(tsteps)

f(x) = (atan(8.0 * x - 4.0) + atan(4.0)) / (2.0 * atan(4.0))
function hammerstein_system(u)
    y= zeros(size(u))
    for k in 2:length(u)
        y[k] = 0.2 * f(u[k-1]) + 0.8 * y[k-1]
    end
    return y
end

y = Float32.(hammerstein_system(ex.(tsteps)))

nn_model = FastChain(FastDense(1,50, sigmoid), FastDense(50,50, sigmoid), FastDense(50, 1))

p_model = initial_params(nn_model)

u0 = Float32.([0.0])

function dudt(du, u, p, t)
    du[1] = nn_model(ex(t),p)[1]
end

prob = ODEProblem(dudt,u0,tspan,nothing)

function predict_neuralode(p)
  _prob = remake(prob,p=p)
  solve(_prob,Tsit5(), saveat=0.1)
end

function loss(p)
    sol = predict_neuralode(p)
    return sum(abs2.(y .- [u[1] for u in sol.u]))/length(sol)
end

p_model_ini = copy(p_model)
res0 = DiffEqFlux.sciml_train(loss,p_model,ADAM(),maxiters=1000)
loss(res0.minimizer)

p_model_ini = copy(p_model)
res0 = DiffEqFlux.sciml_train(loss,p_model,ADAM(),maxiters=1000)
res0.minimizer == p_model_ini
res0.minimizer === p_model_ini

y_model = [u[1] for u in predict_neuralode(res0.minimizer).u]

plot([y y_model], label = ["y" "y_model"], ticks=:native,xlabel="time in s", ylabel="y")

I changed the parameter manually just to see if when I calculate the loss, that the loss will be different. And it is different. So the UDE is reacting on the paramter change. I also looked to your examples to compare the implementations. The example optimal control works fine, but I think the above posted implementation of you is analogously. If I try to debug the code to see whats going inside DiffEqFlux.sciml_train, the debugger will stop at line @withprogress progress name=“Training” begin with an error.

I might get time to take a look later tonight or this weekend.

1 Like

Thank you very much for your time.

I know that I´m maybe a little bit annyoing on this topic, but I´m convinced that UDEs could help me in my task of modeling a stiff system with a data-based approach (this posted problem is just an example and a toy problem to get use to the package). I´m also very pleased for your help and time so far and I can imagine that you have a lot of other things to do. I would like to solve the problem on my own and dig deeper into your implementation, but I found it hard without debugging the code. Is there a general problem with debugging code with macros or is their something wrong with my julia setup,etc. or do I have to do it in a special way (I just use Juno.@enter DiffEqFlux.sciml_train(loss,p_model,ADAM(0.005), maxiters=100) in Atom/Juno) ?
Julia Setup:
Julia v1.5.2
DiffEqFlux v1.23.0
DiffEqSensitivity v6.31.6
DifferentialEquations v6.15.0
Flux v0.11.1
Atom v0.12.23
Juno v0.8.4
Debugger v0.6.6

I can evaluated the code, so that the training is successful, but during debugging it stops in the line 117 of train.jl of DiffEqFlux @withprogress progress name=“Training” begin with the error: ERROR: UndefVarError: ProgressLogging not defined. Additionaly I also tried to train via Flux, but also with Flux the parameters aren´t changing, even though they are assigned to be trainable.

What’s the code you’re running right now?

using DifferentialEquations, Flux, Optim, DiffEqFlux, DiffEqSensitivity, Plots
using LaTeXStrings
plotly()

# excitation of the system
no_samples = 1000

datasize = 1000
tspan = (0.0f0, 99.9f0)
tsteps = range(tspan[1], tspan[2], length = datasize)

ex = t->sin(t)
u = ex.(tsteps)

f(x) = (atan(8.0 * x - 4.0) + atan(4.0)) / (2.0 * atan(4.0))
function hammerstein_system(u)
    y= zeros(size(u))
    for k in 2:length(u)
        y[k] = 0.2 * f(u[k-1]) + 0.8 * y[k-1]
    end
    return y
end

y = Float32.(hammerstein_system(ex.(tsteps)))

nn_model = FastChain(FastDense(1,50, sigmoid), FastDense(50,50, sigmoid), FastDense(50, 1))

p_model = initial_params(nn_model)

u0 = Float32.([0.0])

function dudt(du, u, p, t)
    du[1] = nn_model(ex(t),p)[1]
end

prob = ODEProblem(dudt,u0,tspan,nothing)

# function predict_neuralode(p)
#   _prob = remake(prob,p=p)
#   solve(_prob,Tsit5(), saveat=0.1)
# end

function predict_neuralode(p)
  #_prob = remake(prob,p=p)
  solve(prob,Tsit5(), p=p, saveat=0.1)
end

loss(p) = sum(abs2.(y .- Array(predict_neuralode(p))))/length(y)

p_model_ini = copy(p_model)
#p_model_ini = rand(Float32, (size(p_model, 1), 1))

res0 = DiffEqFlux.sciml_train(loss,p_model,ADAM(0.005), maxiters=100)

loss(res0.minimizer)
loss(p_model_ini)
res0.minimizer == p_model_ini
res0.minimizer === p_model_ini

Oh wow… Okay.

function predict_neuralode(p)
    _prob = remake(prob,p=p)
    Array(solve(_prob,Tsit5(), saveat=0.1f0))
end

That fixes your problem. It’s a bug because

julia> 99.9 < 99.9f0
true

I need to make it robust so that saveat as a Number in different precision it will align the endpoint of the adjoint method. Oh man, library writing…

1 Like