Detecting Function Composition in Symbolics.jl

nicrummel · June 19, 2024, 8:40pm

Motivation

I have a system of differential equations
\dot{U} = F(U,w)
where U\in \mathbb{R}^D are dependent variables and w \in\mathbb{R}^J are parameters. And I can quickly build (in-place) Julia functions from Symbolics.jl and ModelingToolkit.jl of the following functions of U and w:
F(U,w), \; \nabla_uF(U,w), \;\nabla_wF(U,W), and \nabla_w[\nabla_uF](U,w)
I then have data \{u^{(m)}\}_{m=1}^M, where M\in \mathbb{N} is large. I don’t want assume anything about F a priori, but I need to evaluate all four of these functions at every u^{(m)} for different w during an optimization routine. This is quickly becoming expensive…

Goal

I would like to further speed up evaluating these functions by auto detecting
\nabla_uF[u_m,w] = h(g(u), w)

Concrete Example (Hindmarsh Rose)

This example currently is causing too much of slow down, but larger systems are very slow. Hopefully it can help ground the discussion:
\dot{U} = F(U,w) = \left[\begin{array}{l} {w}_1 U_2+w_2 U_1^3+w_3 U_1^2+w_4 U_3 \\ w_5+w_6 U_1^2+w_7 U_2 \\ w_8 U_1+w_9+w_{10} U_3 \end{array}\right]
\nabla_U F(U,w) = \left[\begin{array}{ccc} 3 U_1^2 w_2 + 2U_1w_3 & {w}_1 & w_4 \\ 2U_1 w_6 & w_7 & 0 \\ w_8 & 0 & w_{10} \end{array}\right]
I want to programmatically recognize that
g(U) = \left[\begin{array}{c} 3U_1^2\\ 2U_1 \end{array}\right]
Then we can precompute \{y^{(m)}\}_{m=1}^M := \{g(u^{(m)})\}_{m=1}^M and then quickly compute
\begin{align*}\nabla_UF(u^{(m)},w) &= h(g(u^{(m)},w) = h(y^{(m)},w) \\ &= \left[\begin{array}{ccc} y^{(m)}_1 w_2 + y_2^{(m)}w_3 & {w}_1 & w_4 \\ y_2^{(m)} w_6 & w_7 & 0 \\ w_8 & 0 & w_{10} \end{array}\right] \end{align*}

Ideas on how to start

There should be a way to isolate each expressions that contain just U and scalars. We would need to leverage when operators are/aren’t associative, commutative, etc… In the example this would result in [3U_1^2, 2U_1,2U_1].
Then we define a vector of the unique expressions. In the example: [3U_1^2, 2U_1,] .
Substitute the y_i in the original equation for each expression to define h(y,w).

I am having a hard time getting started in the Symbolics API with what I can see from the documentation. Perhaps there is way to do this in sympy through SymPy.jl? Any help or starting points would be a great help!

ChrisRackauckas · June 20, 2024, 10:44am

This is just common subexpression elimination?

nicrummel · June 22, 2024, 12:33am

@ChrisRackauckas I think so! This is readily available in pysym (link goes to relevant documentation). It looks like others have used SymbolicUtils.jl and the (unmaintained) SymbolicCodegen.jl to do CSE. It can be found in this show and tell.

My code already has symbolic expression from Symbolics.jl, Is there a way to use the high level module to:

Find the common subexpressions
Build Julia functions of each subexpression
Substitute the expressions back into the original
Build a function out of the final expression

Or should I bring in some SymPy.jl

ChrisRackauckas · June 22, 2024, 3:08pm

If you use build_function, then you just add cse=true. We don’t do it by default though because there’s an unknown bug according to @shashi IIRC, but I don’t know what it is or how to trigger it so if you can help figure out what that is we can just make it standard (since it’s a standard codegen optimization, even Julia will run CSE on the generated code).

nicrummel · July 3, 2024, 4:49pm

Hey @ChrisRackauckas,

Just following up on this CSE discussion. (I know that most of this explanation is not necessary for you, but I am putting it here so if someone else is trying to follow the discussion they can). Here is what I have done:

I looked into turning on the key word argument cse, for build_function. The outputs are two functions, (f,f!) , where the latter is an in place function. The allocating function f appears to have the proper common subexpression elimination, while f! does not. This may be the “bug”, that was mentioned. I probably can go make the necessary changed in Symbolics.jl to have this function properly, but have not done so yet.
For my use case, even if f! worked properly this is not the full solution. Turning cse=true would produce (using the notation above):
```
function f(u,w)
    y = g(u,w)
    return h(y)
end
```
This not the desired result for my use case. I need something slightly different. Because I am trying to run optimization methods to find the “best” w^* I want to evaluate f at fixed u's but many different w. I need the subexpression that only contain u and scalars, g(u), so I can produce \{y^{(m)}\}_{m=1}^M := \{g(u^{(m)}\}_{m=1}^M. With these precomputed y^{(m)}, I can call h(y,w) which should have fewer allocations and runtime.

To accomplish this goal I have tried to use some prebuilt tools in Symbolics.jl and SymPy (through SymPy.jl). SymPy is a pain for my application because I am taking Symbolics.jl types turning them into SymPy types then back. So there is a lot of plumbing which adds overhead and decreases readability. Furthermore, neither Symbolilics.jl or SymPy gives me the result I am looking for. These tools are not good at isolating one variable from the other. I dev’d Symbolics.jl module to extend the coeff function to handle a broader class of expressions, but it honestly is patch work. I am looking for something more than just subexpressions that are coefficients.

Do you have any suggestions on how to proceed? Thanks again for your time.

ChrisRackauckas · July 5, 2024, 6:06pm

Yeah that would be a bug. Open an issue?

nicrummel:

This not the desired result for my use case. I need something slightly different. Because I am trying to run optimization methods to find the “best” w^w∗w^ I want to evaluate fff at fixed uuu’s but many different www. I need the subexpression that only contain uuu and scalars, g(u)g(u)g(u), so I can produce {y^{(m)}}{m=1}^M := {g(u^{(m)}}{m=1}^M{y(m)}Mm=1:={g(u(m)}Mm=1{y^{(m)}}{m=1}^M := {g(u^{(m)}}{m=1}^M. With these precomputed y^{(m)}y(m)y^{(m)}, I can call h(y,w)h(y,w)h(y,w) which should have fewer allocations and runtime.

To accomplish this goal I have tried to use some prebuilt tools in Symbolics.jl and SymPy (through SymPy.jl). SymPy is a pain for my application because I am taking Symbolics.jl types turning them into SymPy types then back. So there is a lot of plumbing which adds overhead and decreases readability. Furthermore, neither Symbolilics.jl or SymPy gives me the result I am looking for. These tools are not good at isolating one variable from the other. I dev’d Symbolics.jl module to extend the coeff function to handle a broader class of expressions, but it honestly is patch work. I am looking for something more than just subexpressions that are coefficients.

Do you have any suggestions on how to proceed? Thanks again for your time.

That probably needs an issue. I don’t know if we cover that case right now.

Topic		Replies	Views
Symbolics.jl: How to preevaluate a computationally expensive expression in build_function() Modelling & Simulations question , package	6	316	December 2, 2022
Compiling large Symbolics.jl expressions using build_function Modelling & Simulations	4	530	April 17, 2023
How to convert Symbolics.jl expressions to functions while keeping speed? General Usage question , functions , symbolics	2	364	April 30, 2023
A simple tool for common subexpression elimination Community package , announcement	2	1246	January 25, 2017
Generate Julia expressions with Symbolics General Usage symbolics	2	509	February 2, 2022

Detecting Function Composition in Symbolics.jl

Motivation

Goal

Concrete Example (Hindmarsh Rose)

Ideas on how to start

Related topics