Hello everyone! I’m currently looking to develop my own customised Julia code for training Neural ODEs using a Physics-Informed approach. I want to create an interactive PDE-based model; however, there are two constraints, namely:
- Using a PDE system to train a Physics-Informed DeepONet is heavy on computational resources. Due to financial constraints, I'm working on my personal laptop, which has a CPU but no GPU to speak of. I plan to get a GPU-based machine in the next few months, but I need to save money right now.
- Neural ODEs are powerful SciML tools for modelling dynamic systems. However, because they are initial value problems in time, they do not explicitly incorporate spatial heterogeneity, which is a major drawback when modelling spatiotemporal systems.
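(Concretely, a Neural ODE solves du/dt = f_θ(u, t) with u(t₀) = u₀; space never appears in the formulation, so any spatial structure has to be baked into the state vector u.)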
To that end, I thought of a strategy: first, I solve my PDE system using the Method of Lines; then, the resulting solution serves as a dataset to train my Neural ODEs, the motivation being that Neural ODEs allow interactive exploration of the model. Due to the way the Method of Lines is structured, if there are m data points in x, n in y, and p in z, I end up with m·n·p spatial coordinate points. Likewise, if there are q data points in time, then each of those m·n·p spatial points carries q values of t and q values of each state variable.
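To make that layout concrete, here is a minimal sketch of the bookkeeping I have in mind (the array below is a random stand-in for an actual solution, and for the 2-D Brusselator example that follows the grid is m × n rather than m × n × p):

m, n, q = 32, 32, 116    # grid points in x and y, and saved time points
solu = rand(m, n, q)     # stand-in for sol[u(x, y, t)]
series = solu[3, 5, :]   # length-q time series at the spatial point (i, j) = (3, 5)

Each such series, one per spatial point, is what I intend to hand to a Neural ODE.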
Accordingly, I decided to try this approach on some tutorial code, starting with the Brusselator PDE example given in the MethodOfLines.jl documentation.
Next, my idea was to modify the code from the DiffEqFlux.jl documentation, specifically the example where weather data is fed to a model that learns the Neural ODE governing it.
I modified the latter code in a Jupyter notebook; here it is in full:
using ModelingToolkit, MethodOfLines, OrdinaryDiffEq, DomainSets
@parameters x y t
@variables u(..) v(..)
Dt = Differential(t)
Dx = Differential(x)
Dy = Differential(y)
Dxx = Differential(x)^2
Dyy = Differential(y)^2
∇²(u) = Dxx(u) + Dyy(u)
brusselator_f(x, y, t) = (((x-0.3)^2 + (y-0.6)^2) <= 0.1^2) * (t >= 1.1) * 5.
x_min = y_min = t_min = 0.0
x_max = y_max = 1.0
t_max = 11.5
α = 10.
u0(x,y,t) = 22(y*(1-y))^(3/2)
v0(x,y,t) = 27(x*(1-x))^(3/2)
eq = [Dt(u(x,y,t)) ~ 1. + v(x,y,t)*u(x,y,t)^2 - 4.4*u(x,y,t) + α*∇²(u(x,y,t)) + brusselator_f(x, y, t),
Dt(v(x,y,t)) ~ 3.4*u(x,y,t) - v(x,y,t)*u(x,y,t)^2 + α*∇²(v(x,y,t))]
domains = [x ∈ Interval(x_min, x_max),
y ∈ Interval(y_min, y_max),
t ∈ Interval(t_min, t_max)]
# Initial conditions and periodic BCs
bcs = [u(x,y,0) ~ u0(x,y,0),
u(0,y,t) ~ u(1,y,t),
u(x,0,t) ~ u(x,1,t),
v(x,y,0) ~ v0(x,y,0),
v(0,y,t) ~ v(1,y,t),
v(x,0,t) ~ v(x,1,t)]
@named pdesys = PDESystem(eq,bcs,domains,[x,y,t],[u(x,y,t),v(x,y,t)])
N = 32
order = 2 # This may be increased to improve accuracy of some schemes
# Integers for x and y are interpreted as numbers of points. Use a Float to directly specify the stepsizes dx and dy.
discretization = MOLFiniteDifference([x=>N, y=>N], t, approx_order=order)
# Convert the PDE problem into an ODE problem
println("Discretization:")
@time prob = discretize(pdesys,discretization)
println("Solve:")
@time sol = solve(prob, TRBDF2(), saveat=0.1)
discrete_x = sol[x]
discrete_y = sol[y]
discrete_t = sol[t]
solu = sol[u(x, y, t)]
solv = sol[v(x, y, t)]
# Check for missing values in u and v
println("Number of missing values in u: ", sum(ismissing.(sol[u(x,y,t)])))
println("Number of missing values in v: ", sum(ismissing.(sol[v(x,y,t)])))
println("Number of missing values in t: ", sum(ismissing.(sol[t])))
println("Number of missing values in x: ", sum(ismissing.(sol[x])))
println("Number of missing values in y: ", sum(ismissing.(sol[y])))
# Check for NaN values (common if solver failed)
println("Number of NaN in u: ", sum(isnan.(sol[u(x,y,t)])))
println("Number of NaN in v: ", sum(isnan.(sol[v(x,y,t)])))
println("Number of NaN in t: ", sum(isnan.(sol[t])))
println("Number of NaN in x: ", sum(isnan.(sol[x])))
println("Number of NaN in y: ", sum(isnan.(sol[y])))
using Random, Dates, Optimization, ComponentArrays, Lux, OptimizationOptimisers, DiffEqFlux, OrdinaryDiffEq, CSV, DataFrames, Statistics
function train_custom_neural_ode(data, i, j)
a = data
# discrete_x = a[x]
# discrete_y = a[y]
# discrete_t = a[t]
solu = a[u(x, y, t)][i, j, :] # (Nt,)
solv = a[v(x, y, t)][i, j, :] # (Nt,)
discrete_t = a[t] # (Nt,)
# raw_df = DataFrame(time = discrete_t, u = solu, v = solv)
FEATURES = [:solu, :solv]
raw_df = [:discrete_t]
function standardize(x)
μ = mean(x; dims = 2)
σ = std(x; dims = 2)
z = (x .- μ) ./ σ
return z, μ, σ
end
function featurize(raw_df, FEATURES; num_train = 100)
t = raw_df[:]
y = Matrix(select(raw_df, FEATURES))'
n_total = length(t)
n_train = min(num_train, n_total)
if n_total < num_train
@warn "Only $n_total time points available, less than requested $num_train. Using $n_total for training."
end
t_train = t[1:n_train]
y_train = y[:, 1:n_train]
t_test = t[(n_train+1):end]
y_test = y[:, (n_train+1):end]
# Standardize as before...
t_train, t_mean, t_scale = standardize(t_train)
y_train, y_mean, y_scale = standardize(y_train)
t_test = (t_test .- t_mean) ./ t_scale
y_test = (y_test .- y_mean) ./ y_scale
return (vec(t_train), y_train, vec(t_test), y_test, (t_mean, t_scale), (y_mean, y_scale))
end
""" function featurize(raw_df, FEATURES; num_train = 100)
# Extract time and features directly
t = raw_df
y = Matrix(select(raw_df, FEATURES))'
# Split into train and test
t_train = t[1:num_train]
y_train = y[:, 1:num_train]
t_test = t[(num_train+1):end]
y_test = y[:, (num_train+1):end]
# Standardize
t_train, t_mean, t_scale = standardize(t_train)
y_train, y_mean, y_scale = standardize(y_train)
t_test = (t_test .- t_mean) ./ t_scale
y_test = (y_test .- y_mean) ./ y_scale
return (
vec(t_train), y_train, vec(t_test), y_test, (t_mean, t_scale), (y_mean, y_scale)
)
end """
t_train, y_train, t_test, y_test, (t_mean, t_scale), (y_mean, y_scale) = featurize(raw_df, FEATURES; num_train=100)
# t_train, y_train, t_test, y_test, (t_mean, t_scale), (y_mean, y_scale) = featurize(raw_df, FEATURES; num_train=16)
function train_one_round(node, p, state, y, opt, maxiters, rng, y0 = y[:, 1]; kwargs...)
predict(p) = Array(node(y0, p, state)[1])
loss(p) = sum(abs2, predict(p) .- y)
adtype = Optimization.AutoZygote()
optf = OptimizationFunction((p, _) -> loss(p), adtype)
optprob = OptimizationProblem(optf, p)
res = solve(optprob, opt; maxiters = maxiters, kwargs...)
res.minimizer, state
end
function train(t, y, obs_grid, maxiters, lr, rng, p = nothing, state = nothing; kwargs...)
log_results(ps, losses) = (state, loss) -> begin
push!(ps, copy(state.u))
push!(losses, loss)
false
end
ps, losses = ComponentArray[], Float32[]
for k in obs_grid
node, p_new, state_new = neural_ode(t, size(y, 1))
p === nothing && (p = p_new)
state === nothing && (state = state_new)
p, state = train_one_round(node, p, state, y, OptimizationOptimisers.AdamW(lr),
maxiters, rng; callback = log_results(ps, losses), kwargs...)
end
ps, state, losses
end
rng = MersenneTwister(123)
obs_grid = 4:4:length(t_train) # we train on an increasing amount of the first k obs
maxiters = 150
lr = 5e-3
ps, state, losses = train(t_train, y_train, obs_grid, maxiters, lr, rng; progress = true)
return ps, state, losses
end
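# For completeness: train above calls neural_ode(t, data_dim), which I haven't
# shown. My definition is modelled on the one in the DiffEqFlux weather
# tutorial; treat this as a sketch of what is in my notebook (the architecture
# and tolerances here are the tutorial's, not choices of mine):
function neural_ode(t, data_dim)
    f = Chain(Dense(data_dim => 64, swish),
              Dense(64 => 32, swish),
              Dense(32 => data_dim))
    node = NeuralODE(f, extrema(t), Tsit5(); saveat = t,
                     abstol = 1e-9, reltol = 1e-9)
    p, state = Lux.setup(MersenneTwister(123), f)
    return node, ComponentArray(p), state
end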
struct NODETrainResult
ix::Int
iy::Int
x::Float64
y::Float64
ps
state
losses
end
Up to this point, the code ran successfully, both syntactically and in terms of meaningful output. However, when I ran the next block of code:
results = NODETrainResult[]
x_grid = sol[x]
y_grid = sol[y]
for ix in 1:length(x_grid)
for iy in 1:length(y_grid)
try
ps, state, losses = train_custom_neural_ode(sol, ix, iy)
push!(results, NODETrainResult(ix, iy, x_grid[ix], y_grid[iy], ps, state, losses))
catch e
@warn "NODE training failed at (ix=$ix, iy=$iy): $e"
end
end
end
I got:
┌ Warning: NODE training failed at (ix=1, iy=1): MethodError(DataFrames.select, ([:discrete_t], [:solu, :solv]), 0x0000000000007d38)
└ @ Main In[12]:11
The above warning was repeated for every one of the 1024 (= 32 × 32) spatial grid points.
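Reading the MethodError, my best guess is that select is being handed the Vector{Symbol} I assigned via raw_df = [:discrete_t], rather than an actual DataFrame. An untested sketch of what I suspect the construction should be (column names chosen to match FEATURES, with featurize then reading time via raw_df.time instead of raw_df[:]):

raw_df = DataFrame(time = discrete_t, solu = solu, solv = solv)

Is that the right direction?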
Could anyone help me figure out what's wrong with the code? It runs, but it never produces usable results. I'd really appreciate any help, along with suggestions for editing the code.