Training Universal PINNs using NeuralPDE?

I am interested Universal PINNs (UPINNs) to describe some data. UPINNs are similar to a PINN, but we assume that the underlying DE model has known and unknown dynamics:

\dfrac{d\vec{u}}{dt} = F_{known}(\vec{u}) + F_{unknown}(\vec{u}).

To “discover” the unknown dynamics, we replace them with a neural network, F^{NN}[\vec{u}]. In total, we have two neural networks being trained jointly, \vec{u}^{NN}(t), representing the solution to the DE system, and F^{NN}[\vec{u}], representing the unknown dynamics. In particular, I am trying to apply this to data from the Lotka-Volterra Equation:

\dfrac{du_1}{dt} = \alpha u_1 - \beta u_1 u_2 \\ \dfrac{du_2}{dt} = -\delta u_2 + \gamma u_1 u_2 ,

while assuming the interaction terms are unknown. Meaning I’d train the PINN to satisfy

\dfrac{du_1}{dt} = \alpha u_1 - F^{NN}[\vec{u}]_1 \\ \dfrac{du_2}{dt} = -\delta u_2 + F^{NN}[\vec{u}]_2,

and also assume the constants \alpha and \delta are unknown.

I have tried to do this by updating the code from this notebook as follows:

using NeuralPDE, OrdinaryDiffEq, Lux, Random, OptimizationOptimJL, LineSearches, Plots

###
### Generate LV data
###

tspan = (0.0, 5.0)

function lotka!(du, u, p, t)
    α, β, γ, δ = p
    du[1] = α * u[1] - β * u[2] * u[1]
    du[2] = γ * u[1] * u[2] - δ * u[2]
end

u0 = 5.0f0 * rand(rng, 2)
p_ = [1.3, 0.9, 0.8, 1.8]
prob = ODEProblem(lotka!, u0, tspan, p_)
LV_data = solve(prob, Vern7(), abstol = 1e-12, reltol = 1e-12, saveat = 0.25);

t_ = LV_data.t
u_ = reduce(hcat, LV_data.u)

###
### Create neural networks
###


rng = Random.default_rng()
Random.seed!(rng, 0)
n = 15

#NN representing u
chain1 = Chain(Dense(1, n, σ), Dense(n, n, σ), Dense(n, n, σ), Dense(n, 2))
ps1, st1 = Lux.setup(rng, chain1) |> f64
#NN representing unknown dynamics
chain2 = Chain(Dense(2, n, σ), Dense(n, n, σ), Dense(n, n, σ), Dense(n, 2))
ps2, st2 = Lux.setup(rng, chain2) |> f64

additional_loss(phi, θ) = sum(abs2, phi(t_, θ) .- u_) / size(u_, 2);

function lotka_upinns_rhs(u, p, t)
    α, δ = p
    û = chain2(u, ps2, st2)[1] # Network prediction
    return [α * u[1] - û[1],
            û[2] - δ * u[2]]
end

###
### Train the neural networks
###

opt = LBFGS(linesearch = BackTracking())
alg = NNODE(chain1, opt, [ps1,ps2]; strategy = WeightedIntervalTraining([0.7, 0.2, 0.1], 500),
    param_estim = true, additional_loss)

prob = ODEProblem(lotka_upinns_rhs, u0, tspan,[1.0,1.0])
sol = solve(prob, alg, verbose = false, abstol = 1e-8, maxiters = 5000, saveat = t_)

Unfortunately, this gives the following error message

Only homogeneous arrays are allowed.

Stacktrace:
[1] error(s::String)
@ Base ./error.jl:35
[2] make_idx(data::Vector{Float64}, x::Vector{NamedTuple{(:layer_1, :layer_2, :layer_3, :layer_4), NTuple{4, NamedTuple{(:weight, :bias), Tuple{Matrix{Float64}, Matrix{Float64}}}}}}, last_val::Int64)
@ ComponentArrays ~/.julia/packages/ComponentArrays/fwmmC/src/componentarray.jl:209
[3] #28
@ ~/.julia/packages/ComponentArrays/fwmmC/src/componentarray.jl:161 [inlined]
[4] (::ComponentArrays.var"#28#29"{Vector{Float64}, Base.RefValue{Int64}})(::Pair{Symbol, Vector})
@ ComponentArrays ./none:0
[5] iterate
@ ./generator.jl:47 [inlined]
[6] merge(a::NamedTuple{(), Tuple{}}, itr::Base.Generator{Base.Pairs{Symbol, Vector, Tuple{Symbol, Symbol}, NamedTuple{(:depvar, :p), Tuple{Vector{NamedTuple{(:layer_1, :layer_2, :layer_3, :layer_4), NTuple{4, NamedTuple{(:weight, :bias), Tuple{Matrix{Float64}, Matrix{Float64}}}}}}, Vector{Float64}}}}, ComponentArrays.var"#28#29"{Vector{Float64}, Base.RefValue{Int64}}})
@ Base ./namedtuple.jl:303
[7] make_idx(data::Vector{Float64}, nt::NamedTuple{(:depvar, :p), Tuple{Vector{NamedTuple{(:layer_1, :layer_2, :layer_3, :layer_4), NTuple{4, NamedTuple{(:weight, :bias), Tuple{Matrix{Float64}, Matrix{Float64}}}}}}, Vector{Float64}}}, last_val::Int64)
@ ComponentArrays ~/.julia/packages/ComponentArrays/fwmmC/src/componentarray.jl:159
[8] make_carray_args(A::Type{Vector}, nt::NamedTuple{(:depvar, :p), Tuple{Vector{NamedTuple{(:layer_1, :layer_2, :layer_3, :layer_4), NTuple{4, NamedTuple{(:weight, :bias), Tuple{Matrix{Float64}, Matrix{Float64}}}}}}, Vector{Float64}}})
@ ComponentArrays ~/.julia/packages/ComponentArrays/fwmmC/src/componentarray.jl:151
[9] make_carray_args(nt::NamedTuple{(:depvar, :p), Tuple{Vector{NamedTuple{(:layer_1, :layer_2, :layer_3, :layer_4), NTuple{4, NamedTuple{(:weight, :bias), Tuple{Matrix{Float64}, Matrix{Float64}}}}}}, Vector{Float64}}})
@ ComponentArrays ~/.julia/packages/ComponentArrays/fwmmC/src/componentarray.jl:144
[10] ComponentArrays.ComponentArray(nt::NamedTuple{(:depvar, :p), Tuple{Vector{NamedTuple{(:layer_1, :layer_2, :layer_3, :layer_4), NTuple{4, NamedTuple{(:weight, :bias), Tuple{Matrix{Float64}, Matrix{Float64}}}}}}, Vector{Float64}}})
@ ComponentArrays ~/.julia/packages/ComponentArrays/fwmmC/src/componentarray.jl:64
[11] #ComponentArray#21
@ ~/.julia/packages/ComponentArrays/fwmmC/src/componentarray.jl:67 [inlined]
[12] ComponentArray
@ ~/.julia/packages/ComponentArrays/fwmmC/src/componentarray.jl:67 [inlined]
[13] __solve(::ODEProblem{Vector{Float64}, Tuple{Float64, Float64}, false, Vector{Float64}, ODEFunction{false, SciMLBase.AutoSpecialize, typeof(lotka_upinns_rhs), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, SymbolicIndexingInterface.SymbolCache{Nothing, Nothing, Nothing}}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, ::NNODE{Chain{NamedTuple{(:layer_1, :layer_2, :layer_3, :layer_4), Tuple{Dense{true, typeof(sigmoid_fast), typeof(glorot_uniform), typeof(zeros32)}, Dense{true, typeof(sigmoid_fast), typeof(glorot_uniform), typeof(zeros32)}, Dense{true, typeof(sigmoid_fast), typeof(glorot_uniform), typeof(zeros32)}, Dense{true, typeof(identity), typeof(glorot_uniform), typeof(zeros32)}}}, Nothing}, LBFGS{Nothing, InitialStatic{Float64}, BackTracking{Float64, Int64}, Optim.var"#20#22"}, Vector{NamedTuple{(:layer_1, :layer_2, :layer_3, :layer_4), NTuple{4, NamedTuple{(:weight, :bias), Tuple{Matrix{Float64}, Matrix{Float64}}}}}}, Nothing, Bool, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, typeof(additional_loss), WeightedIntervalTraining{Float64}}; dt::Nothing, timeseries_errors::Bool, save_everystep::Bool, adaptive::Bool, abstol::Float64, reltol::Float32, verbose::Bool, saveat::Vector{Float64}, maxiters::Int64, tstops::Nothing)
@ NeuralPDE ~/.julia/packages/NeuralPDE/5b5Gb/src/ode_solve.jl:374
[14] __solve
@ ~/.julia/packages/NeuralPDE/5b5Gb/src/ode_solve.jl:342 [inlined]
[15] #solve_call#34
@ ~/.julia/packages/DiffEqBase/eTCPy/src/solve.jl:608 [inlined]
[16] solve_call
@ ~/.julia/packages/DiffEqBase/eTCPy/src/solve.jl:566 [inlined]
[17] #solve_up#42
@ ~/.julia/packages/DiffEqBase/eTCPy/src/solve.jl:1057 [inlined]
[18] solve_up
@ ~/.julia/packages/DiffEqBase/eTCPy/src/solve.jl:1043 [inlined]
[19] #solve#40
@ ~/.julia/packages/DiffEqBase/eTCPy/src/solve.jl:980 [inlined]
[20] top-level scope
@ In[10]:27

And I have no idea how to fix this, or if it is even possible to do this using NeuralPDE

It looks like something goes wrong when trying to generate a component array of the NN parameters (it looks like somewhere the assumption was made that all layers have the same amounts of weights and biases, which is not the case). It could be a bug in how NeuralPDE.jl uses ComponentArrays.jl or in ComponentArrays.jl itself. Are your packages up to date?

Yes, my julia version is 1.11.2, and my NeuralPDE version is 5.17.0, so I don’t think versions should be the issue.

One thing I wasn’t sure how to handle is sending the parameters from both NNs (ps1 and ps2) to NNODE, because I only saw examples online using a single NN. So I entered [ps1,ps2] but am not sure that’s the right thing to do.

Why do you need two separate networks? Why couldn’t a single F^{NN}([\vec{u}]) with two outputs do the trick? Then there’d be no issues with trying to pass initial parameters for multiple networks to NNODE which seems to be built for training a single chain, rather than multiple (see types in ODE-Specialized Physics-Informed Neural Network (PINN) Solver · NeuralPDE.jl)?

EDIT: I think the issue might be that based on the architecture of chain1 NNODE expects initial parameters of certain dimensionality whereas it gets something different.

Yes, for what I’m trying to do, I need two Neural networks.

The first, \vec{u}^{NN}(t), (represented by chain1) models the solution, \vec{u}(t) to the ODE system. It should have a 1-dimensional input (time) and a two-dimensional output (\vec{u}=(u_1,u_2)^T). The second, F^{NN}[\vec{u}], (represented by chain2) models the unknown ODE system dynamics given by (\beta u_1 u_2, \gamma u_1 u_2)^T. Its input should be 2-dimensional (\vec{u}), and its output should be 2-dimensional ((\beta u_1 u_2, \gamma u_1 u_2)^T).

I agree that it may simply not be possible to do this with NNODE, I just wanted to check here before attempting to build it from scratch.