Compiling specialized functions for a large set of user-passed options


I’m optimizing my package SymbolicRegression.jl, the backend for PySR, a GA-based gradient-free symbolic regression code.

The normal workflow for this package is to configure the options—such as choice of operators, mutation probabilities, choice of algorithm—and then run the search for a long period of time. For example:

using SymbolicRegression

options = SymbolicRegression.Options(
    binary_operators=(+, *, /, -),
    unary_operators=(cos, exp),
)

This options struct configures the search, and gets passed to nearly every function. Because of this, I think it will improve performance to have Julia compile specialized functions specific to every choice of parameter.

I am wondering if there is a way to force Julia to compile every user-defined parameter (defined here) into my functions?

As an example, the tips from @marius311 and @Henrique_Becker on this thread helped a lot with optimizing my equation evaluation. Putting the operator choices into the type,
Options{typeof(binary_operators), typeof(unary_operators)}(...)
where each set of operators is assumed to be a tuple, results in Julia compiling the operator choices into the equation evaluation. This improves performance considerably.
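A minimal sketch of that pattern, using a hypothetical `MiniOptions` struct and `apply_binary` helper rather than the package's actual `Options` definition. Because each operator tuple's type is a type parameter, Julia compiles a separate method instance per operator set:

```julia
# Hypothetical stand-in for the real Options struct (illustration only).
struct MiniOptions{B<:Tuple,U<:Tuple}
    binary_operators::B
    unary_operators::U
end

# Apply the i-th binary operator. The tuple type is part of the type of
# `options`, so this method is compiled once per distinct operator set.
apply_binary(options::MiniOptions, i::Int, x, y) = options.binary_operators[i](x, y)

options = MiniOptions((+, *, /, -), (cos, exp))
result = apply_binary(options, 2, 3.0, 4.0)  # second operator is *, so 12.0
```

The default constructor infers `B` and `U` from the arguments, so the explicit `typeof(...)` parameters are only needed when the constructor is written by hand.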

Basically, I would like to extend this technique to every single parameter in the options, since they will remain constant or only take on a few different values each run (say if the user launches multiple equation searches). My first idea is to repeat the above technique for every single parameter, like so:

function search(options::Options{T1, T2, T3, ...}) where {T1, T2, T3, ...}
    # Use T1, T2, T3, ... inside this function
end

but my guess is that there is a cleaner way to do this. Any idea how I could set this up?



Quick update:
It seems like I can pass arbitrary data into a type like so:

data = ((*, -, /), (cos, exp), [-1, -1, -1], [-1, -1], 10)
T = Val{Symbol(data)}

data2 = eval(Meta.parse(string(T.parameters[1])))
data == data2  # true

This syntax is probably a Julia sin, but since these are constants in the view of the compiler, maybe this would work? Then I can just have the entire options array in the type, and have the compiler unpack it.

Here’s a full function:

function g(::Val{T}) where {T}
    data = eval(Meta.parse(string(T)))
    return data
end

then I can call it like:

julia> g(Val((exp, cos)))
(exp, cos)

Edit: this seems to be very slow, so is probably not the way to go about this.

I think this will not have the effect desired, for two reasons:

  1. eval is very slow, as you have already discovered. Even if that were acceptable because you only do it once before a lot of computation, you would then run into problem (2).
  2. The result of eval (i.e., data2) is inherently type-unstable: depending on the value of the symbol, the return type will differ. Consequently, the rest of the code will be very slow, unless you immediately pass data2 to a function that does all the heavy work, so that this function can be specialized for the types obtained (see function barriers). In the end, there will be no difference between this and passing the data directly, except for an extra slow setup step.
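The function-barrier idea in point 2 can be sketched as follows. The names `settings`, `sum_squares`, and `process` are hypothetical and only stand in for "type-unstable setup" and "heavy work":

```julia
# Values pulled from an untyped container are inferred as `Any`,
# much like the result of `eval`.
settings = Dict{Symbol,Any}(:weights => [1.0, 2.0, 3.0])

# Kernel behind the barrier: compiled for the concrete type it receives.
function sum_squares(weights::AbstractVector)
    total = zero(eltype(weights))
    for w in weights
        total += w * w
    end
    return total
end

function process(settings::Dict{Symbol,Any})
    weights = settings[:weights]   # type is not inferable here (Any)
    return sum_squares(weights)    # one dynamic dispatch, then specialized code
end

total = process(settings)  # 1 + 4 + 9 = 14.0
```

Only the single call into `sum_squares` pays for the dynamic dispatch; the loop inside runs on a concretely typed vector.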

Thanks Henrique!

Do you know if there is a way to force the specialization of a function to a struct’s values like this without eval?

Julia does not specialize over values, only types, so you have to map them to type-space. If your values are isbits types you can use Val: call Val(value) on the objects, and write Val{T} in the function signatures to extract the value back. This also works best if you keep them inside Vals all the way down to the inner piece of code that actually works on their value. However, note some things:

  1. I am not sure how much gain you will obtain from this. I believe you will get some, but maybe not enough to justify it.
  2. Your first call with each different set of types as parameters will be very slow, because it will recompile everything.
  3. By default, Julia does not always specialize on Function arguments (like the operators you are passing), in particular when a function is only passed through to another call rather than called directly. I am also not sure whether Functions count as isbits; maybe just wrapping them in tuples is enough to get the specialization, but I would look into that.
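A minimal sketch of the Val pattern described above, with a hypothetical kernel `repeat_sum` (not part of any package). The last two lines check the isbits question directly: as far as I know, plain named functions such as `cos` are singletons and therefore isbits, which is consistent with `Val((exp, cos))` working earlier in the thread.

```julia
# Hypothetical kernel: N travels in the type, so each distinct N
# gets its own compiled method instance.
function repeat_sum(x::Float64, ::Val{N}) where {N}
    total = 0.0
    for _ in 1:N
        total += x
    end
    return total
end

n = Val(10)                      # lift the value into type space once
total = repeat_sum(0.5, n)       # 5.0

# Named functions are singleton values, so they can be Val parameters too.
cos_is_bits = isbits(cos)
ops = Val((cos, exp))
```

Constructing `Val(10)` once and passing it down keeps the lifting cost (and the dynamic dispatch it implies) out of the inner loops.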