Why Julia compilation times are so long in case of functions created using Symbolics?

CaG21 · April 12, 2024, 7:18am

I am relatively new in Julia, so maybe I am missing something trivial.
I created a file containing a function, using the Symbolics package

_, f = build_function(F, x)
write("function.jl", string(f))
myf = include("function.jl")

where F is a symbolic polynomial vector-valued function, and x is a symbolic vector.
When I try to call myf for the first time, Julia takes a lot of time (up to 1h) to give a result. On the other hand, from the second call on, myf only takes some nanoseconds. The same gigantic runtime is needed if I use precompile. So, I think it is the compiler requiring a lot of time to compile myf.
Do you have any clue why this happens? And, in case, how can I mitigate it?

PS: Below, you can find a bit of function.jl, to give an idea of what is inside it

function (ˍ₋out, ˍ₋arg1)
begin
    begin
        @inbounds begin
                    ˍ₋out[1] = 1.0       
                    [...]
                    ˍ₋out[48] = ˍ₋arg1[47]
                    ˍ₋out[49] = (+)(-0.7071067811865475, (*)(0.7071067811865475, (^)(ˍ₋arg1[1], 2)))
                    ˍ₋out[50] = (*)(ˍ₋arg1[1], ˍ₋arg1[2])
                    ˍ₋out[51] = (*)(ˍ₋arg1[1], ˍ₋arg1[3])
                    [...]
                    ˍ₋out[74308] = (*)((*)((*)(ˍ₋arg1[36], ˍ₋arg1[4]), ˍ₋arg1[47]), ˍ₋arg1[6])
                    ˍ₋out[74309] = (*)((*)((+)(-0.7071067811865475, (*)(0.7071067811865475, (^)(ˍ₋arg1[37], 2))), ˍ₋arg1[4]), ˍ₋arg1[6])
                    ˍ₋out[74310] = (*)((*)((*)(ˍ₋arg1[37], ˍ₋arg1[38]), ˍ₋arg1[4]), ˍ₋arg1[6])
                    [...]
                    nothing        
        end
    end
end
end

gdalle · April 12, 2024, 7:49am

Not sure where this comes from, but presumably the expression swell that is sometimes inevitable with symbolic computing. You could try FastDifferentiation.jl, which seems designed to scale a little better

Benny · April 12, 2024, 8:03am

How many _-out[...] = ... lines are there in total? How large do _-out and _-arg1 have to be for this to work?

CaG21 · April 12, 2024, 8:08am

Unfortuntely, _-out can be quite large, up to 100k elements. _-arg1 is a relatively smaller vector, with less than 50 elements

Benny · April 12, 2024, 8:13am

Sounds large enough to be relevant to this then Compile time and memory usage grows quickly with function size · Issue #19158 · JuliaLang/julia · GitHub

gdalle · April 12, 2024, 8:32am

Can I ask why it makes sense to use a symbolic approach with such large vectors? What is the end goal?

CaG21 · April 12, 2024, 9:52am

Symbolics allows me to write automatically the function I need. Each element of out is a product of special polynomials, each with a different variable, and the symbolic approach seems the faster and easier approach to manipulate them.
Just to clairify what I am doing, this is an MWE on how I compute F:

using Polynomials, SpecialPolynomials, Symbolics
n = 4
N = 1000
pmax = 10

A = rand(0:pmax, N, n)

x = Symbolics.variables(:x, 1:n)

H = convert.(Polynomial, basis.(Chebyshev, 0:pmax)) # For clarity's sake, I am using only Chebyshev, but it can be any special polynomial
HH = H[A.+1]
F = prod([HH[i,j](x[j]) for (i,j) in Iterators.product(axes(HH)...)], dims=2)

Any suggestion to improve this approach will be very welcome

CaG21 · April 12, 2024, 9:56am

This could be the answer. Thank you.
So I am wondering, since I am not planning to change this function that often, is there any way to compile the function once for all, and cache it some way, so that Julia does not compile it any time I run it?

greatpet · April 12, 2024, 9:59am

I’ve seen people successfully abusing GCC to compile megabyte-sized math formulas in C. If it’s possible to translate your code to C, I’d try making a shared library with GCC and loading it in Julia.

Daneel · April 12, 2024, 10:20am

You could try putting it in a package and setting it to precompile.

__precompile__(true)

Module Initialization and Precompilation

Edit: Added another link

Benny · April 12, 2024, 11:23am

Assuming this is worth caching to reuse many times between edits, precompile and PrecompileTools.jl for packages, and PackageCompiler.jl are the ways to cache compiled code. A package is necessary to specify the environment and isolate the compiled code to a global scope; there’s no point in reusing cached code that fails to change with the environment or is invalid because of arbitrary code ran before or afterward in the same global scope.

Still, you should continue looking into improving the compile-time or find out if it is even feasible, I wouldn’t know.

ChrisRackauckas · April 12, 2024, 11:32am

Compilation of 100k lines of code functions currently isn’t optimized with type inference. One thing you could do is try the CTarget backend:

I don’t know if that supports ^ operations, but the rest of the polynomial should be handled. So a small contribution to the CTarget generator may be all that’s required to get your case to generate C here, and following that tutorial using GCC you should be able to take a noticable bite out of it.

Barring that, there’s a lot of ongoing work in JuliaSimCompiler around this kind of function building but this exact case is not handled at this point.

CaG21 · April 13, 2024, 1:39pm

Thank you for all the suggestion.
I tried the approach suggested by @ChrisRackauckas (i.e., build the function using CTarget, and compile the code using gcc). It works. If no optimization flag is set, compilation times are about 30 seconds and execution times are acceptable. As soon as I activate any optimization (even only with -O1), compilation time skyrockets again.
I’ll keep investigating.

ChrisRackauckas · April 13, 2024, 2:00pm

Higher than -O1 isn’t helpful. What it’s doing is running an SLP vectorization pass which (a) takes forever on codes with lots of scalarization and (b) does not help in these kinds of codes. In fact, we see it slow down the runtime of these codes in many instances. So keep the optimization level lower, use GCC, and you should be good.

Benny · April 13, 2024, 9:42pm

I wonder if a similar effect can be accomplished with the julia command flags --compile=min or --optimize=0. The docs isn’t detailed enough for me to understand what’s really happening there, and this doesn’t seem practical because it’ll affect the whole session, not a specified call signature. There is Base.Experimental.@optlevel for modules, which is still a bit broad for comfort.

ChrisRackauckas · April 13, 2024, 11:26pm

We are looking into it but it might take awhile to solve

sumiya11 · April 14, 2024, 8:17pm

You can try SLPs from Oscar Introduction · Oscar.jl (oscar-system.org). Though I have not used it myself, and do not know how efficient the compilation is.

Topic		Replies	Views
Compiling large Symbolics.jl expressions using build_function Modelling & Simulations	4	530	April 17, 2023
Symbolics build_function and usage Performance symbolics , runtimegeneratedfunc	1	707	August 5, 2023
Avoiding Repeated Function Compilation General Usage precompilation	11	4628	March 28, 2019
Are there tricks to reduce compile-time for this code? General Usage	4	805	December 7, 2017
How to convert Symbolics.jl expressions to functions while keeping speed? General Usage question , functions , symbolics	2	364	April 30, 2023

Why Julia compilation times are so long in case of functions created using Symbolics?

Related topics