Defining and input signal for a differential equation

I’m trying to use the DiffEqFlux package to define a machine learning model for a differential equation. The model I want needs to work similar to a Nueral ODE, but with latent variables. So I have the following:

\frac{dh}{dt} = F(h_{t-1}, x_t)

Thus when writing my model, I need to make the differential depend on the previous state of the system, but also on the input signal x. It’s not clear to me how to do this with the DifferentialEquations package’s interface. For example:

node = h -> neural_ode(dhdt, h, tspan, Tsit5(), saveat=t,reltol=1e-7,abstol=1e-9)

I need the input be dependant on both h and x, but as far as I can tell, the interface only supports one. The only solution that comes to mind is to define the state as a pair (h, x) and defining the gradient of x to always be 0 and use t to access the correct values of the signal. This seems like a cumbersome way to do this though.

3 Likes

See this previous discussion:

Short answer: the default approach is to use a closure for something like this

neural_ode((h,t) -> dhdt(h,x,t), h, tspan, Tsit5(), saveat=t,reltol=1e-7,abstol=1e-9)

but the DifferentialEquations package also supports passing additional parameters explicitly to ease sensitivity analysis.

By a latent variable, do you mean you want to do a neural delay differential equation?

http://docs.juliadiffeq.org/latest/tutorials/dde_example.html

I don’t think so. As far as I understand it (my understanding of DDE’s is superficial tbh), in DDE’s there is a parameter \tau that specifies the delay with which the dynamics of a variable respond. This does not mean that this variable is latent. When I say latent I mean like an autoencoder (see the end of the Neural ODE paper although they use a probabilistic i.e. variational autoencoder approach). In other words I want something like:

\frac{dh}{dt} = f(h, I(t))

where I(t) is an input signal to the system which depends on time. This would correspond to the input to the neural network (in the toy example I’m working on it would be an image with a time-varying noise mask). However, the current DifferentialEquation package’s interface takes as its input the current state of the variables whose dynamics you are simulating. Hence as @stevengj mentioned, you have to use closures. At the moment I have this:

model = Chain(Dense(Di + Dh, Dh, relu), Dense(Dh, Dh))
function dhdt(X, dh, h, p, t)
    input = [h; X(t)]
    dh .= model(input)
end

node(I, h) = neural_ode(
    (dh, h, p, t) -> dhdt(I, dh, h, p, t),
    h, tspan, Tsit5(), saveat=t,reltol=1e-7,abstol=1e-9
)

It seems a little cumbersome though. Also, after reading the other discussion, I don’t know if this will have an effect on adjoint sensitivity analysis and optimization.

Oh I see what you’re trying to do. If that’s the case then your I(t)'s parameters need to come from p if you want it to be compatible with the adjoint sensitivity analysis. See the section on mixed neural designs for how to do this kind of thing:

Thanks, I saw that part but skipped it.

I’m still a bit confused though. You mention that I(t)'s parameters have to come from p, but what if I(t) does not have any parameters? For example, I compute the signal ahead of time (using another ODE analogously to the first example in DiffEqFlux) since its a stimuli that does not depend on the model. As I understand it, my current approach would suffice.

Oh yes, if I(t) doesn’t have any parameters then you can just enclose it. That’s fine. I thought I(t) was something that was also being updated.

Okay great, thanks for the help. One final question.

The way I seed it, for the NeuralODE models using Flux, it might be a common occurrence that we pass an input and simulate the dynamics of an internal state. This is very common in neuroscience as well. In this case, creating a closure each time we need to pass an input signal or using the approach in the mixed designs examples seems repetitive. Is there any plan to make this input signal part of the interface?

Since using a closure here doesn’t have any performance issues, I am not sure we will specialize too much on the interface for this case. If I(t) is just a normal Julia function then you don’t need to enclose it to use it BTW. If it encloses local data, you can use a call-overloaded type as needed. So I don’t think the syntax is much of an issue and there are other user-side designs that can handle this well.

1 Like

What would this look like? Sorry this is my inexperience with Julia talking. I guess my expectation was something like a recurrent network, which in traditional ML frameworks take an input X, which in this case is I(t) and the hidden state h0.

Also just to mention, the first example of Neural ODEs shown in the DiffEqFlux page (right under the homonymous section) does not work. From the page:

model = Chain(Dense(2,50,tanh),Dense(50,2))
# Define the ODE as the forward pass of the neural network with weights `p`
function dudt(du,u,p,t)
    du .= model(u)
end

tspan = (0.0f0,25.0f0)
x -> neural_ode(dudt,x,tspan,Tsit5(),saveat=0.1)

This fails since neural_ode expects a model taking one parameter (the current state u), not the ones expected by the dudt function (which it builds internally).

I(t) = t^2

function f(du,u,p,t)
  du[1] = I(t)
  du[2] = u[1]
end

Thanks! Fixed.

Thanks for the help! I forked the DiffEqFlux project to play around with it more. I think I might try to implement another way of interfacing with it for convenience after I learn more Julia.

Cheers

@mllera14, do you happen to have a full working example? I’m currently working on the same thing and an example could come in very handy.

Cheers

1 Like

@mllera14 @Beramos I would also appreciate a worked solution if either of you managed to get something like this to work!

You don’t do anything special. You just put the function in there. I’ve received this question enough times so I added a tutorial page on it:

https://diffeqflux.sciml.ai/dev/examples/exogenous_input/

It’s very terse because I don’t get why it wasn’t clear before, so please give me feedback on whether that sufficiently answers the question.

4 Likes

Great, thanks for the link. I am new to Julia and DiffEqFlux and so the tutorial was v useful, especially in the neural ODE case which is where I was struggling. :slight_smile:

Hi! Thank you very much for providing this tutorial, it helped a lot! While I have one question about this piece of code in that tutorial:

function dudt(u, p, t)
    #input_val = u_vals[Int(round(t*10)+1)]
    nn_model(vcat(u[1], ex[Int(round(10*0.1))]), p)
end

My understanding is that nn_model is defined as FastChain layers (x,p), ‘x’ here includes two part: 1) u[1]: the time series that we want to simulate its dynamics and 2) ex[Int(round(10*0.1))]: the input signal of the neural ode.

Shouldn’t the index inside the bracket ex[Int(round(10*0.1))] to be a function of t? As the index at the moment is just a constant number of 1, which means ode solver always takes the initial value of the input signal.

So I think this part coule change to

 function dudt(u, p, t)
    #input_val = u_vals[Int(round(t*10)+1)]
    nn_model(vcat(u[1], ex[Int(round(10*t))]), p)
end

I have tested the same example with this bit of codes, it works fine. However,…

But I am not sure if I understand it correctly, cause I am currently working on the similar task like a normal ML model. I have input features and want to predict output labels. I am considering the input features have the same idea of your exogeneous signal but with higher dimensions. Thus I modified the code to be:

function dudt(u, p, t)
    nn_model(vcat(u[1], input_features[:,Int(round(10*t))]), p)
end

But this does’t get me resaonable predictions. Am I using this correctly? Or maybe I got the idea wrong completely…Appreciate it!!

Yeah you’re using it in a way that looks fine. Whether that’s a good model is to be seen, but it should be numerically fine.

Sure, I got this. Then its more likely to be the other part of my model went wrong or my feature set is not suitable. Thanks for comfirming this for me :laughing:!!