Universal differential equations with MTK

The SciML documentation has this great tutorial on Automatically Discover Missing Physics by Embedding Machine Learning into Differential Equations

In the tutorial the universal differential equation is defined as follows:

## Universal Differential Equation
rbf(x) = exp.(-(x .^ 2))

# Multilayer FeedForward
U = Lux.Chain(
    Lux.Dense(2, 5, rbf),
    Lux.Dense(5, 5, rbf),
    Lux.Dense(5, 5, rbf),
    Lux.Dense(5, 2)
# Get the initial parameters and state variables of the model
p, st = Lux.setup(rng, U)

# Define the hybrid model
function ude_dynamics!(du, u, p, t, p_true)
    û = U(u, p, st)[1] # Network prediction
    du[1] = p_true[1] * u[1] + û[1]
    du[2] = -p_true[4] * u[2] + û[2]

I was wondering on how to implement this, using ModelingToolkit.jl to define the ODE. Specially since, I’m not sure how I could then pass the NN parameters with remake (here as an example the step in the tutorial).

function predict(θ, X=Xₙ[:, 1], T=t)
    _prob = remake(prob_nn, u0=X, tspan=(T[1], T[end]), p=θ)
    Array(solve(_prob, Vern7(), saveat=T,
        abstol=1e-6, reltol=1e-6))

I’m not an authority on this, but my understanding is that MTK is not yet ready for UDEs. I suspect MTK for UDEs is in the works, though.

It’s not ready yet.


That would be a killer combination (and I guess the long term goal for SciML). Any plans on when this would be available?

And how big of a challenge would be to implement that? Would it be enough to annotate some parameters as not Num (instead a NamedTuple or Any?) and also somehow tell MTK not to symbolicaly dive into the NN? Am I missing something here?

Yes, many many plans on this. It’s actually in development: this is JuliaSimCompiler.


It ends up being a very difficult problem (I would say one of my main concerns since about 2019?) so it’s being developed as its own compiler that is applied to ModelingToolkit models. So the input format is exactly the same, but it’s a completely separate codegen that is then able to compile large models (PDEs, large component systems) more than an order of magnitude faster (in the near future, many examples will go from O(n^2) code to O(1) code size!) and generate faster code. It is free for academic and non-commercial use and there’s free trials for JuliaSim for commercial use.

This is a crucial aspect because the ModelingToolkit → Symbolics codegen is naive in that it directly performs a scalarization of the code and generates a code in terms of its component parts. The reason of course is that symbolic analysis is generally much simpler on the scalars.This means that even simple expressions blow up:

using Symbolics
@variables A[1:3,1:3] x[1:3]
build_function(collect(A*x .+ x), A, x)
:(function (ˍ₋out, A, x)
      #= C:\Users\accou\.julia\packages\SymbolicUtils\Oyu8Z\src\code.jl:373 =#
      #= C:\Users\accou\.julia\packages\SymbolicUtils\Oyu8Z\src\code.jl:374 =#
      #= C:\Users\accou\.julia\packages\SymbolicUtils\Oyu8Z\src\code.jl:375 =#
          #= C:\Users\accou\.julia\packages\Symbolics\BQlmn\src\build_function.jl:519 =#
          #= C:\Users\accou\.julia\packages\SymbolicUtils\Oyu8Z\src\code.jl:422 =# @inbounds begin
                  #= C:\Users\accou\.julia\packages\SymbolicUtils\Oyu8Z\src\code.jl:418 =#
                  ˍ₋out[1] = (+)((+)((+)((*)((getindex)(x, 1), (getindex)(A, 1, 1)), (*)((getindex)(x, 2), (getindex)(A, 1, 2))), (*)((getindex)(x, 3), (getindex)(A, 1, 3))), (getindex)(x, 1))
                  ˍ₋out[2] = (+)((+)((+)((*)((getindex)(x, 1), (getindex)(A, 2, 1)), (*)((getindex)(x, 2), (getindex)(A, 2, 2))), (*)((getindex)(x, 3), (getindex)(A, 2, 3))), (getindex)(x, 2))
                  ˍ₋out[3] = (+)((+)((+)((*)((getindex)(x, 1), (getindex)(A, 3, 1)), (*)((getindex)(x, 2), (getindex)(A, 3, 2))), (*)((getindex)(x, 3), (getindex)(A, 3, 3))), (getindex)(x, 3))
                  #= C:\Users\accou\.julia\packages\SymbolicUtils\Oyu8Z\src\code.jl:420 =#

and is the same behavior seen in other acausal modeling systems like Modelica compilers. Yes, Symbolics.jl does have ways of retaining structure, but the issue is that the structural simplification algorithms have a way of recovering structure post Pantelides and tearing all of that. You can even prove that in general, you need to prove properties about the incidence matrix in some “scalar” sense in order for this to be correct, and therefore you need some way of handling mixed representations and recover the structure.

This is something we are spending a ton of time trying to solve, and JuliaSimCompiler is a new ground-up compiler framework for ModelingToolkit models that can retain loops and structures to generate the non-scalarized code. With that in mind, the array function functionality of Symbolics can then be retained all of the way through, treating NN(u) as just some array function, and generating optimized for for this without having to scalarize all of its operations.

Without this its not actually practical, “medium” sized neural networks have 100x100 matrices of weights which then multiplied by vectors grows expressions by O(100^2) and so you just have huge equations that LLVM doesn’t know what to do with. So solving this problem with JuliaSimCompiler both enables PDEs in acausal modeling and (with very similar/related structure pieces) UDEs.

So… stay tuned as there should be some major announcements before the end of the year on doing exactly this kind of integration. ModelingToolkit and UDEs have been two major disconnected parts of the SciML universe for awhile now, and I will be very happy to see the connection finally completed.