Forecasting time series data with neural ordinary differential equations in Julia

Dear all,

I am a new user of Julia and am following a tutorial on neural ODEs (please see the tutorial: Experiments with Neural ODEs in Julia).

How can I forecast ENSO data (time series data in a CSV file) using the above tutorial? In fact, I would like to know how to use real data for NODEs in Julia.

Note: I could not reproduce another tutorial, Forecasting the weather with neural ODEs (Forecasting the weather with neural ODEs | Sebastian Callh personal blog).

Regards,
Jalal

What did you try?

I tried to follow this tutorial: Experiments with Neural ODEs in Julia.

Data download link (monthly ENSO): monthly_enso.csv (Dropbox)

using Flux, DiffEqFlux, DifferentialEquations, Plots
using DataFrames, CSV

# load the monthly ENSO series
data = CSV.read("monthly_enso.csv", DataFrame)
first(data, 5)
println(size(data))

tbegin = 0.0
tend = 456.0
n_points = 456 # one saved point per month
trange = range(tbegin, tend, length=n_points)
u0 = [2.5; 0.5]
tspan = (tbegin, tend)
# the loss below needs an array, not a DataFrame; this assumes the CSV
# holds two numeric columns matching the 2-dimensional state u0
dataset_ts = Matrix(data)'

math_law(u) = sin.(2.0 .* u) .+ cos.(2.0 .* u)
dudt = Chain(u -> math_law(u), Dense(2, 50, tanh), Dense(50, 2))

reltol = 1e-7 # relative solver tolerance
abstol = 1e-9 # absolute solver tolerance
n_ode = NeuralODE(dudt, tspan, Tsit5(), saveat=trange, reltol=reltol, abstol=abstol)
ps = Flux.params(n_ode.p)

function loss_n_ode()
    pred = n_ode(u0)
    loss = sum(abs2, dataset_ts .- pred)
end

n_epochs = 400
learning_rate = 0.01
# use a fresh name so the DataFrame `data` is not shadowed
train_iter = Iterators.repeated((), n_epochs)
opt = ADAM(learning_rate)

cb = function () # callback function to observe training
    loss = loss_n_ode()
    println("Loss: ", loss)
end

println()

cb() # display the loss with the initial parameter values

Flux.train!(loss_n_ode, ps, train_iter, opt, cb=cb)

pl = plot(
    trange,
    dataset_ts[1,:],
    linewidth=2, ls=:dash,
    title="Neural ODE for forecasting",
    xaxis="t",
    label="original timeseries x(t)",
    legend=:right)
display(pl)

pl = plot!(
    trange,
    dataset_ts[2,:],
    linewidth=2, ls=:dash,
    label="original timeseries y(t)")
display(pl)

pred = n_ode(u0) # prediction with the trained parameters

pl = plot!(
    trange,
    pred[1,:],
    linewidth=1,
    label="predicted timeseries x(t)")
display(pl)

pl = plot!(
    trange,
    pred[2,:],
    linewidth=1,
    label="predicted timeseries y(t)")
display(pl)

Hi Jalal!

I’m the author of the blog you linked to and wonder if you could tell me more about how you could not reproduce it. I’m happy to assist with it and update the blog if it is misleading somehow.


Hi Sebastian Callh,

I used the same data you used for the tutorial. However, the following code errors because year and month are undefined. Could you check, please?

delhi[:,:year] = Float64.(year.(delhi[:,:date]))
delhi[:,:month] = Float64.(month.(delhi[:,:date]))
df_mean = by(delhi, [:year, :month],
    :meantemp => mean,
    :humidity => mean,
    :wind_speed => mean,
    :meanpressure => mean)
rename!(df_mean, [:year, :month, :meantemp,
    :humidity, :wind_speed, :meanpressure])

df_mean[!,:date] .= df_mean[:,:year] .+ df_mean[:,:month] ./ 12;

I would be very happy if I can use your tutorial for my research. I look forward to your reply.

Regards,
Jalal

That just looks like a case of outdated DataFrames syntax - the post is two and a half years old, so it probably uses a pre-1.0 version of DataFrames.

by has been deprecated, so you want to replace that line with

df_mean = combine(groupby(delhi, [:year, :month]),
    [:meantemp, :humidity, :wind_speed, :meanpressure] .=> mean) 
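
Note that combine names the new columns with a _mean suffix (meantemp_mean and so on), so the rename! call from your snippet still applies afterwards. Putting it together (with using Statistics for mean):

using DataFrames, Statistics

df_mean = combine(groupby(delhi, [:year, :month]),
    [:meantemp, :humidity, :wind_speed, :meanpressure] .=> mean)
rename!(df_mean, [:year, :month, :meantemp,
    :humidity, :wind_speed, :meanpressure])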

Ohh yeah there’s a lot of old Julia in there. I’ll try to make time to update it over the weekend.
In the meantime, the code on Github (which is linked from the blogpost) was updated a few months ago to use Lux.jl and Optimization.jl and should play nicely with modern Julia code.


Can we make this into an example in the DiffEqFlux docs? The docs are tested with every code change so that will be more robust. We can link back to your blog and papers from there, but this would help make sure it’s kept up to date.


I’m happy to if it is helpful to people. Do you have any instructions on how? I see there are .md files in DiffEqFlux.jl/src/examples. Is it enough to PR an .md document? Where should images be hosted?


Yup

The images get generated during the doc build (it runs the code), so they don't need to be hosted separately.


Thank you so much for your kind response. I am still getting the same problem.

And that’s why it’s helpful to provide stack traces! This is not about a variable called year but the function year, which comes from the Dates standard library, so you are missing using Dates.
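
So the fix is to add using Dates before the snippet above:

using Dates

delhi[:, :year] = Float64.(year.(delhi[:, :date]))
delhi[:, :month] = Float64.(month.(delhi[:, :date]))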


Thank you so much. It works now. But I got stuck on the final step.

[Warning: FastChain is being deprecated in favor of Lux.jl. Lux.jl uses functions with explicit parameters f(u,p) like FastChain, but is fully featured and documented machine learning library. See the Lux.jl documentation for more details].

And [Warning: sciml_train is being deprecated in favor of direct usage of Optimization.jl. Please consult the Optimization.jl documentation for more details. Optimization.jl’s PolyOpt solver is the polyalgorithm of sciml_train].


Those are just warnings about upcoming deprecations (well, at this point the deprecations are coming soon, since the update process started about 2 years ago :sweat_smile:). We will work with @SebastianCallh to get an updated example into the docs.
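
For reference, the shape of the migration those warnings point at looks roughly like this (just a sketch, reusing tspan, trange, u0, and dataset_ts from the code earlier in the thread; consult the Lux.jl and Optimization.jl docs for the real details):

using Lux, DiffEqFlux, DifferentialEquations
using Optimization, OptimizationOptimisers, ComponentArrays, Random, Zygote

rng = Random.default_rng()

# explicit-parameter model instead of FastChain
dudt = Lux.Chain(Lux.Dense(2 => 50, tanh), Lux.Dense(50 => 2))
p, st = Lux.setup(rng, dudt) # parameters and state live outside the model
n_ode = NeuralODE(dudt, tspan, Tsit5(), saveat=trange)

function loss(p)
    pred = Array(first(n_ode(u0, p, st))) # Lux layers return (output, state)
    sum(abs2, dataset_ts .- pred)
end

# Optimization.jl instead of sciml_train
optf = Optimization.OptimizationFunction((p, _) -> loss(p), Optimization.AutoZygote())
optprob = Optimization.OptimizationProblem(optf, ComponentArray(p))
res = Optimization.solve(optprob, OptimizationOptimisers.Adam(0.01), maxiters=400)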


Quick question @ChrisRackauckas. If you’re replacing Flux with Lux, where does that leave SimpleChains? Wouldn’t SC be more appropriate for the SciML community in general? My reasoning is that I rarely see really big neural networks in SciML contexts, so the CPU speedups in SC start looking really attractive. This assumes I’m correct in thinking that the primary SciML use case is small neural networks on a CPU architecture.


Yes, for most SciML use cases people should probably be using SimpleChains. But it’s tricky. SimpleChains is a performance optimization, so it’s naturally a less user-friendly library than Flux/Lux. We support all three in most places now (note that SciMLSensitivity supports it, but DiffEqFlux’s pre-built layers only support Flux or Lux right now).

We should probably highlight SimpleChains in more tutorials than we do right now (and get the DE layers to all allow SimpleChains; that would be worth an issue), but I’d be wary of making it the thing people grab as a first choice, because you do need to be careful when using it. It supports fewer types of layers, and you need to be careful with multithreading and memory caching in a way you don’t with Flux/Lux.
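
For anyone curious, the same two-layer network from the code above looks roughly like this in SimpleChains (a sketch; the sizes are fixed statically up front, which is part of where the speed comes from):

using SimpleChains

# static input size 2; layer sizes mirror Dense(2, 50, tanh), Dense(50, 2)
sc = SimpleChain(static(2),
    TurboDense{true}(tanh, static(50)),
    TurboDense{true}(identity, static(2)))

p = SimpleChains.init_params(sc)
y = sc(rand(Float32, 2), p) # explicit-parameter call, like Lux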


Have been pretty busy, but I just sat down to get cracking on this and realized we don’t want to put data from Kaggle in the example docs, and I honestly do not know how to write down an ODE that produces similar data to train on in the example. I could produce noisy samples from the model fit in the blog post and hard-code those into the docs example. Would that work, or does that open the "who owns the output of a model trained on licensed data?" can of worms? If a ground-truth ODE is preferred, I’d appreciate some help creating it.
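
For concreteness, the noisy-samples option would just be something like this sketch, where fitted_model stands in for the model trained in the blog post and the 0.1 noise scale is arbitrary:

pred = Array(fitted_model(u0))                # trajectory from the fitted model
noisy_data = pred .+ 0.1 .* randn(size(pred)) # add Gaussian noise before hard-coding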

I think that would be fine? We use DataDeps.jl in another spot for the MNIST data already.
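
Registering a file with DataDeps.jl is pretty light; a sketch with a hypothetical URL (which is where the login requirement would bite):

using DataDeps

register(DataDep(
    "DelhiClimate",                          # dataset name
    "Daily Delhi climate time series",       # message shown before download
    "https://example.com/delhi_climate.csv")) # hypothetical public URL

csv_path = joinpath(datadep"DelhiClimate", "delhi_climate.csv") # downloads on first use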

Seems like DataDeps.jl would work great if the data were publicly available, but Kaggle requires a user login to access datasets.

What’s the licensing on Kaggle datasets? I’ve never had to deal with that before.