DifferentialEquations.jl + Flux.jl or Knet.jl

diffeq

#1

I am trying to estimate various dynamical systems using neural network models. I am very interested in using the integrators provided by DifferentialEquations.jl to do this, and the docs for that package even have a nice example on parameter estimation. If possible, I would like to use DifferentialEquations.jl with Flux.jl, but so far I cannot get any example working. Here is the code I am trying:

using DifferentialEquations
using Flux
using Flux.Tracker


b = param(-1.0)
f(t, x) = b.*x

u0 = param(1.0)
prob = ODEProblem(f, u0, (0, 1.0))
sol = solve(prob)
# output:
# ERROR: MethodError: Cannot `convert` an object of type Array{Float64,0} to an object of type TrackedArray{…,Array{Float64,0}}
# This may have arisen from a call to the constructor TrackedArray{…,Array{Float64,0}}(...),
# since type constructors fall back to convert methods.

Ultimately, I want to compute l = loss(sol.u, u_truth) and call back!(l) to get the gradients with respect to the parameters using Flux. Is this possible in principle?
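In other words, I would like to be able to write something like the following (hypothetical; u_truth stands in for a known trajectory, and none of this currently runs given the error above):

sol = solve(prob, saveat=0.1)
l = sum(abs2, sol.u .- u_truth)  # loss against a known trajectory
back!(l)                         # Flux backprop through the solve
Tracker.grad(b)                  # gradient of the loss w.r.t. b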


#2

In principle, yes. In practice, it depends on how much support AutoGrad.jl (for Knet.jl) or Flux.jl have in their backpropagation machinery. I think it would be easier to just use ForwardDiff.jl for the gradients (which we already know works) for now, and use that with SGD or ADAM.
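For example, here is a minimal sketch of the ForwardDiff route, keeping the f(t, u) problem style from the original post; the target trajectory u_truth, the saveat grid, and the loss are made up for illustration:

using DifferentialEquations, ForwardDiff

u_truth = exp.(-1.0 .* (0:0.1:1))             # made-up target data

function loss(p)
    b = p[1]
    f(t, u) = b * u
    prob = ODEProblem(f, one(b), (0.0, 1.0))  # one(b) promotes u0 so duals propagate
    sol = solve(prob, Tsit5(), saveat=0.1)
    sum(abs2, sol.u .- u_truth)
end

g = ForwardDiff.gradient(loss, [-1.0])        # gradient w.r.t. b, ready for SGD/ADAM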

But if you want to poke around on this, I suggest getting onto the Slack so we can work through this. It’ll likely take some new method definitions for the backprop stuff, but shouldn’t be too bad?


#3

Thanks for replying Chris.

FWIW, I just tried the same thing in Knet and got a similar error:

ERROR: MethodError: Cannot `convert` an object of type AutoGrad.Rec{Float64} to an object of type Float64
This may have arisen from a call to the constructor Float64(...),
since type constructors fall back to convert methods.
Stacktrace:

I think it would be easier to just use ForwardDiff.jl for the gradients

Unfortunately, I think reverse-mode (backward) derivatives are needed for computational reasons: forward-mode cost scales with the number of inputs, and neural networks have many free parameters, while the loss is a single scalar output.

I would be interested in trying to get this to work, but I am pretty new to Julia.


#4

You’re free to mix and match forward and backward, and different AD techniques generally, as you please. Assuming you have a neural network generating b which you then feed into a solver, you can do something like:

  • Run the forward pass of the network to generate b as a tracked vector
  • Unwrap b and create a dual number b + ϵ
  • Run the solver and calculate the loss to get loss + dloss/db*ϵ
  • Call back!(b, dloss/db) to have Flux calculate dloss/dparam for each net parameter (sketched below)

There’s going to be a bit of plumbing, but we can help with that. And if it’s working well we can find ways to make Flux/DiffEq integration smoother.
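Here's a rough sketch of that plumbing, following the four steps above; it assumes the old Flux Tracker API, and the network, input, and target data are placeholders:

using Flux, Flux.Tracker, ForwardDiff, DifferentialEquations

u_truth = exp.(-1.0 .* (0:0.1:1))                # placeholder target trajectory
net = Chain(Dense(2, 16, relu), Dense(16, 1))    # hypothetical net producing b
x = rand(2)

b_tracked = net(x)                   # 1. forward pass: b as a tracked vector
b = Tracker.data(b_tracked)[1]       # 2. unwrap to a plain Float64

function loss_of_b(b)                # 3. solve with a dual b + ϵ via ForwardDiff
    f(t, u) = b * u
    prob = ODEProblem(f, one(b), (0.0, 1.0))
    sol = solve(prob, Tsit5(), saveat=0.1)
    sum(abs2, sol.u .- u_truth)
end
dloss_db = ForwardDiff.derivative(loss_of_b, b)

Tracker.back!(b_tracked, [dloss_db]) # 4. push dloss/db back through the net's parameters

The forward-mode derivative through the solver is cheap here because b is a single scalar; Flux's reverse mode then handles the many network parameters.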


#5

What is your use-case?


#6

I am asking because I saw that you are also working with dynamic noise/stochastic differential equations, e.g. Ornstein-Uhlenbeck processes, which I am working on.


#7

Hi Chris,

Thanks for your great work on DifferentialEquations.jl! The possibility to get a gradient of the solution of an ODE is really mind-blowing 🙂

You mention here that it takes 5 lines of code to use DifferentialEquations.jl with Flux.jl. Would you mind sharing an example?