Neural ODE in DiffEqFlux that is not a time series

I am trying to use DiffEqFlux.jl to make a model that takes in a vector of length 200 and passes it through a neural ODE layer to output a vector of length 200.

I tried to do so by tweaking the time series example in the documentation so that the loss only depends on the output of the ODE at the end of the time span. Here is my attempt:

using Flux, DiffEqFlux, DifferentialEquations
# Create a neural ode that takes in an input of length 200 and gives an output
# of length 200.
dudt = Chain(Dense(200,50,tanh),
             Dense(50,200))
tspan = (0f0,1f0)
n_ode = x->neural_ode(dudt,x,tspan,Tsit5(),saveat=tspan[end],save_start=false,
                      reltol=1e-7,abstol=1e-9)
# The loss function will be the squared error of the output.
loss_n_ode(x,y) = sum(abs2,y .- n_ode(x))
loss_n_ode(a::Tuple) = loss_n_ode(a...)
# Create random training data for the model to try fit.
nbatches = 100
batch_size = 32
data = [(randn(Float32,200,batch_size),randn(Float32,200,batch_size)) for _ in 1:nbatches]
# Train the model for one epoch.
opt = ADAM(0.1)
ps = Flux.params(dudt)
Flux.train!(loss_n_ode,ps,data,opt)

The last line of the code gives a very long error that begins with:

ERROR: LoadError: DimensionMismatch("array could not be broadcast to match destination")

Interestingly, if I set batch_size = 1, then I do not get the error; however, the train! step does not mutate my ps as it should.
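
To verify that the parameters are untouched, I snapshot them before the train! call and compare afterwards (a minimal sketch, assuming the Tracker-based Flux v0.8 API where Flux.data unwraps a tracked array):

# Check (sketch) whether train! actually mutated the parameters:
# copy the raw arrays out of the tracked params, train, then compare.
before = [copy(Flux.data(p)) for p in ps]
Flux.train!(loss_n_ode,ps,data,opt)
changed = any(Flux.data(p) != b for (p,b) in zip(ps,before))
@show changed  # false with batch_size = 1, even though train! ran without error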

julia> versioninfo()
Julia Version 1.1.0
Commit 80516ca202 (2019-01-21 21:24 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)

(v1.1) pkg> st
    Status `~/.julia/environments/v1.1/Project.toml`
  [c52e3926] Atom v0.8.2
  [336ed68f] CSV v0.4.3
  [3a865a2d] CuArrays v1.0.1
  [a93c6f00] DataFrames v0.17.1
  [aae7a2af] DiffEqFlux v0.2.0+ #master (https://github.com/JuliaDiffEq/DiffEqFlux.jl.git)
  [0c46a032] DifferentialEquations v6.3.0
  [31c24e10] Distributions v0.17.0
  [587475ba] Flux v0.8.1
  [28b8d3ca] GR v0.38.1
  [7073ff75] IJulia v1.18.0
  [e5e0dc1b] Juno v0.7.0
  [91a5bcdd] Plots v0.23.2
  [d330b81b] PyPlot v2.8.0
  [a759f4b9] TimerOutputs v0.5.0

The problem also occurs when I am not on master of DiffEqFlux.jl.

EDIT One of the comments pointed out that I said I was training the data. I meant the model, whoops.

n_ode = x->neural_ode(dudt,x,tspan,Tsit5(),save_everystep=false,save_start=false,
                      reltol=1e-7,abstol=1e-9)
# The loss function will be the squared error of the output.
loss_n_ode(x,y) = sum(abs2,y .- n_ode(x)[end])

I think that’s what you’re looking for? I can’t run it right now.

Hi Chris,

Thanks for the quick response! Using your code, I still get the same error. I slightly tweaked your suggestion so that the loss would still compare the full output vector to y. This is what I have for the loss now:

function loss_n_ode(x,y)
    if ndims(x) == 1
        return sum(abs2,y .- n_ode(x)[:,end])
    elseif ndims(x) == 2
        return sum(abs2,y .- n_ode(x)[:,:,end])
    else
        error("Dimension $(ndims(x)) is bad")
    end
end
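
For what it’s worth, the two branches can be collapsed by building the index tuple from ndims (just a sketch of the same idea, not part of the original suggestion):

# Dimension-agnostic version (sketch): slice the last entry along the
# trailing (time) dimension, whatever ndims(sol) turns out to be.
function loss_n_ode(x,y)
    sol = n_ode(x)
    inds = ntuple(_ -> Colon(), ndims(sol)-1)
    return sum(abs2, y .- sol[inds..., size(sol, ndims(sol))])
end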

I should be clearer about my problem. All of the loss functions mentioned so far run fine on their own. For example, using either of the loss functions I suggested, we have:

julia> loss_n_ode(rand(200),rand(200))
57.96308190934719 (tracked)

julia> loss_n_ode(rand(200,32),rand(200,32))
1932.8472895975965 (tracked)

We also have that:

julia> n_ode(rand(200))
Tracked 200×1 Array{Float64,2}:
...  # Omitting the actual array for brevity
julia> n_ode(rand(200,32))
Tracked 200×32×1 Array{Float64,3}:
...  # Omitting the actual array for brevity

The problem only occurs when I make the call to train!.
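
To isolate it, the forward and backward passes can be run by hand (a sketch, again assuming the Tracker-based Flux v0.8 API):

# Reproduce the failure without train! (sketch): the forward pass succeeds,
# so the DimensionMismatch should surface in the manual backward pass.
x, y = randn(Float32,200,32), randn(Float32,200,32)
l = loss_n_ode(x,y)  # runs fine, as shown above
Flux.back!(l)        # this is where train! blows up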

sol[end] is the whole end vector. My suggestion works for arbitrary n dimensions.

What’s the problem in train!?

sol[end] is the whole end vector. My suggestion works for arbitrary n dimensions.

But when I try it, I get

julia> n_ode(rand(200))[end]
0.2948478592581806 (tracked)

What’s the problem in train!?

If I train the model with batch_size=1, then it will run without error, but ps (the parameters of my model) will not be changed. If I run train! with batch_size>1, then I get an error beginning with

ERROR: LoadError: DimensionMismatch("array could not be broadcast to match destination")

A gist of the error is here

I think that I have made some progress:

using Flux, DiffEqFlux, DifferentialEquations
# Create a neural ode that takes in an input of length 200 and gives an output
# of length 200.
dudt = Chain(Dense(200,50,tanh),
             Dense(50,200))
tspan = (0f0,1f0)
n_ode = x->neural_ode_rd(dudt,x,tspan,Tsit5(),save_everystep=false,save_start=false,
                         reltol=1e-7,abstol=1e-9)
# The loss function will be the squared error of the output.
loss_n_ode(x,y) = sum(abs2,y .- n_ode(x))
loss_n_ode(a::Tuple) = loss_n_ode(a...)
# Create random training data for the model to try fit.
nbatches = 10
batch_size = 32
data = [(randn(Float32,200,batch_size),randn(Float32,200,batch_size)) for _ in 1:nbatches]
# Train the model for one epoch.
opt = ADAM(0.1)
ps = Flux.params(dudt)

display(sum(loss_n_ode.(data)))
Flux.train!(loss_n_ode,ps,data,opt)
display(sum(loss_n_ode.(data)))


Running the code prints the following losses before and after train!:

140091.98f0 (tracked)
105187.28f0 (tracked)

So it seems that we are training! I need to do more testing though.
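
For the extra testing I will probably just run a few more epochs and watch the loss keep dropping (a sketch):

# Sketch of further testing: a few more epochs, printing the summed loss.
for epoch in 1:5
    Flux.train!(loss_n_ode,ps,data,opt)
    @show epoch sum(loss_n_ode.(data))
end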

EDIT 2 The difference here is that I am now using neural_ode_rd rather than neural_ode. I am still not sure why neural_ode does not work.

Oh yes, my bad. neural_ode returns the array and not the DESolution because of the constraints on its AD.
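
So the final state has to be sliced off the trailing time dimension of the returned array rather than taken with sol[end] (a sketch, using the earlier neural_ode-based n_ode and the shapes shown above):

# neural_ode returns a (tracked) array, not a DESolution, so index the
# trailing (time) dimension for the final state; out[end] would be a scalar.
out = n_ode(rand(Float32,200,32))  # Tracked 200×32×1 array
final_state = out[:,:,end]         # 200×32 final state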