Parsing expressions into functions

Hi, I’m relatively new to Julia but feel like I’m mostly getting the hang of it.

I have a project where I need to parse a large number of arbitrary expressions (e.g. ax^b, a+bx^2, etc.) into a function, but I’m running into significant performance issues with my current implementation. My implementation involves generating an array of anonymous functions which are then parsed into the main function and evaluated, but this evaluation is much slower than if the expressions are explicitly defined in the main function.

A minimal example is as follows (note that this example only involves one form of expression, to allow for the comparison with the explicit definition):

using BenchmarkTools

coefficients = rand(1000)
exponents = rand(1000)
y = zeros(1000)
functions = []

for i in 1:1000
  push!(functions, x -> coefficients[i]*x^exponents[i])
end

function callfunc(functions,x,y)
  for i in 1:1000
    y[i] = functions[i](x)
  end
end

function explicitfunc(coefficients,exponents,x,y)
  for i in 1:1000
    y[i] = coefficients[i]*x^exponents[i]
  end
end

@btime callfunc(functions,1.23,y)
@btime explicitfunc(coefficients,exponents,1.23,y)

The returned btimes are

193.799 μs (7467 allocations: 116.67 KiB)   # callfunc
61.686 μs (0 allocations: 0 bytes)          # explicitfunc

Furthermore, if I then run the for loop in parallel with @threads on a 4 core cpu, the times are 180.434 μs and 17.477 μs, respectively.

I’ve gone through the documentation, performance tips, and online forums as much as I could and I haven’t been able to figure out a way to improve the performance. Is there a way to parse expressions directly into the function and evaluate them in the local scope (e.g. like eval(Meta.parse(“expression”)) for a local scope), or else is there a way to eliminate whatever overhead is causing the slowdown and allocations in the evaluation of the anonymous functions?

Any help would be greatly appreciated.

1 Like

is this due to the fact that

push!(functions, x -> coefficients[i]*x^exponents[i])

are references to the coefficients? I noticed that if you make coefficients and exponents const, they become much closer:

48.650 μs (2489 allocations: 38.89 KiB)                        
16.771 μs (0 allocations: 0 bytes)

Edit:
It looks like if you do this:

const coefficients = rand(1000)
const exponents = rand(1000)
y = zeros(1000)
functions = [x -> coefficients[i]*x^exponents[i] for i in 1:1000]

they will become the same:

17.673 μs (0 allocations: 0 bytes)                             
16.853 μs (0 allocations: 0 bytes)

not sure if this fits your use case though.

1 Like

This is an interesting question, though I’m not sure your test code here accurately reflects your use case? If we go about optimizing it we might well be working toward the wrong thing. What’s the reason you need to do this in practice / what is the structure of the expressions? Do they have a fixed form (eg, only polynomials?) Do they depend on only the input variable x, or can they have side effects?

3 Likes

That’s interesting, and certainly helps, although as I say, it needs to work for arbitrary expressions, so a comprehension like that may not work.

I hadn’t realised that the reference would be retained if I used them in a function definition, I just assumed it would copy the fixed values when it created the function object, so thanks for that.

I am curious why calling from the comprehension is faster than the push! construction and doesn’t require allocations. Is it just because using the push! construction creates an Any type array? I tried constructing a typed array with different function elements, but unless I construct the functions in a single comprehension or in the same for loop they have different “types” e.g

functions = []

for i in 1:4
  if i < 3
    push!(functions, x -> i*x)
  else
    push!(functions, x -> i*x)
  end
end

display(typeof(functions))
display(typeof.(functions))

giving

Array{Any,1}
4-element Array{DataType,1}:
 getfield(Main, Symbol("##9#11"))
 getfield(Main, Symbol("##9#11"))
 getfield(Main, Symbol("##10#12"))
 getfield(Main, Symbol("##10#12"))

If typing is what causes the difference between the push! construction and the comprehension, how would I be able to make sure all the different functions are of the same type (whatever that means for functions)?

EDIT:

I tried a typed array by using convert(Array{typeof(functions[1]),1},functions), but I didn’t see any performance difference. I guess I need to look up how comprehensions work in detail.

The main difference between the test code and the use case is that the input functions would need to be pretty arbitrary in nature. Ideally I would like to evaluate local variables in the main function, then insert a list of expressions (read from a file) which can directly call those local variables in the main function environment.

e.g. in pseudocode

expressions = ["a*x1^b", "c + exp(d*x2)", "e*x^1 + f*sin(x2)"]
# where a,b,c,d,e,f are just numerical values that would be in the read file
y = zeros(length(expressions))

function mainfunction(expressions,y,t)
  x1 = somefunc(t)
  x2 = somefunc2(t)

  for i in 1:length(expressions)
    y[i] = local_eval(Meta.parse(expressions[i]))
  end
end

but I can ultimately achieve the same functionality as long as I can parse an array of functions into mainfunction with little or no overhead.

For context, I’m trying to solve a set of ODEs using DifferentialEquations.jl, with ~200 ODEs each composed of ~2000 terms, where each term would be an arbitrary expression read from a file(s). I’m trying to define the ode(du,u,p,t) function so that it can evaluate as efficiently as possible, because it’s at the base level and will be called thousands or millions of times.

Consider using FunctionWrappers or @generated.

If you are new to meta-programming then probably FunctionWrappers is the way to go. Create lambdas, then wrap them (see docs) and then pass an array of wrapped functions. That way you avoid any type instabilities.

@generated functions I finds are more performant but more difficult to debug. In some ways they are also more flexible.

1 Like

Ok, this is really useful to know. How are the arbitrary expressions generated? Are they truly arbitrary or is there some structure we can exploit? 2000 terms is an awful lot. Are you generating those with a CAS for some reason? Are you sure there’s not better ways to do what you want?

Is it 200 coupled ODEs to be solved together? In that case, I’d potentially make one giant Expr to define the ode right hand side function and get julia to eval that to generate code before the solve. This should be easy with Meta.parse and a bit of light Expr manipulation.

2 Likes

The system is a set of reactions in a plasma simulation which for several reasons is not quite compatible with the existing Chemical Reactions Models suite, in DiffEqBiological.jl, so I’m writing my own code for it. There are ~200 coupled species, and ~2000 reactions which will each involve ~2-6 different species as reactant/products. The terms which I need to evaluate at each time step are the reaction rates which can be arbitrary functions of the system parameters (most follow the Arrhenius equation form, but many do not due to the physics of the situation, and therefore it needs to work for truly arbitrary expressions).

I have indeed attempted to generate one giant global Expr to define the ODEs using string manipulation and Meta.parse with eval, but the compilation time ended up taking exceedingly long, and it grew as n(reactions)^2. Although that may possibly have just been a crappy implementation on my part, but it was effectively just a massive string version of what the code would look like if instead of writing a for loop, I had written each line by hand, so I don’t know how that could have gone wrong…

EDIT:
Using the above test example, if I just define the function in string form and evaluate that, then the performance is identical to the explicit for loop case, but the first call compilation time takes hundreds of times longer and uses reams of memory. e.g.

funcstring = ["function stringfunc(x,y)"]
for i in 1:1000
  a = coefficients[i]
  b = exponents[i]
  push!(funcstring,"y[$i] = $a*x^$b")
end
push!(funcstring,"end")
eval(Meta.parse(join(funcstring,"\n")))

for first calls this results in

0.026416 seconds (47.27 k allocations: 2.362 MiB)     #callfunc
0.023658 seconds (44.09 k allocations: 2.287 MiB)     #explicitfunc
3.652088 seconds (8.37 M allocations: 425.670 MiB)    #stringfunc

subsequent btime gives

117.718 μs (2490 allocations: 38.94 KiB)  #callfunc
62.715 μs (1 allocation: 64 bytes)        #explicitfunc
62.715 μs (1 allocation: 16 bytes)        #stringfunc

This wouldn’t be altogether impossible to work with, but any optimisation gains would likely be pointless due to the first call overhead.

1 Like

Thanks for the explanations. Would it be possible for you to share a small test case describing a few reactions involving a few species and if possible taking various mathematical forms? In particular, I’m curious to see how input data are formatted.

1 Like

Unfortunately I don’t really have much in the way of fully formatted test cases, because I’m trying to work out how the lowest level behaviour (e.g. parsing some form of expression/function into the odefunction) is going to work, before building the rest of the pipelines from the reaction/species databases, and how everything fits together.

Basically I would read the reaction and reaction rate expression from a database, use a parser which I’ve already coded to identify which species are involved in the reaction and in what way, and generate an appropriate string form for the expression. This would result in expressions of the form 1.54e-16*Te^3.12*exp(-7.56/Te)*u[1]*u[2] or 5.6e3*E_wall*u[1], where the electron temperature Te is solved for or provided, the local electric field at the wall E_wall is a system variable which may or may not be a function of time evaluated in the odefunction, and u[1],u[2] are the densities of the species being solved for which are involved in the reactions, for example.

I would then want to parse these expressions (either as strings, :(expressions), or functions) into the odefunction to be evaluated in the local environment, with the locally defined Te, E, etc.

I feel the overall behaviour that I want is fairly well represented in the pseudo code I provided in #5

Ultimately I either need to be able to parse some arbitrary array of functions which are as performant as the comprehension described in jerryling315’s response, or I need to be able to do the equivalent of eval(Meta.parse("expression")) in a local function scope.

Why can’t you hit global scope in between the eval and calling of the ODE solver?

str = "x[1]*x[2]^2" # read from file
@eval f(x) = $(Meta.parse(str)) # define the function

# pass `f` to some routine

Otherwise, you can do as you do, but need to use invokelatest.

Each function has a different type so this won’t help. You need to use FunctionWrappers.jl for that usecase.

2 Likes

This is definitely problematic; you can’t use eval in this way (as you’ve found out), and any alternative which you managed to write would be horribly slow. There’s a reason that Julia doesn’t support eval in local scope :wink:

I’ve seen this problem before (or maybe I encountered it myself). IIRC it was because type inference really doesn’t like huge expressions. In particular I recall that huge explicit sums were a problem so it might help to break your 2000 term expressions down into, say, 200 ten term expressions.

But overall I suspect you’re going to have to exploit the structure of the problem and write something more like explicitfunc where the equations are of a fixed set of forms, and driven by tables of coefficients.

I also wonder whether we are still missing something here in terms of why a compilation time of 1s (or so) for the ODE right hand side matters, vs the many times you’ll presumably need to call this function when actually integrating the equations. Do the equations themselves change during the ODE integration?

1 Like

Kristoffer, that’s pretty much exactly what I will do, but the issue isn’t going from string to function in the global scope, it’s that once all 2000 functions are defined, parsing them as an array into the main function and evaluating them for at a given value of time has some sort of overhead. As shown by jerryling315 in post #2, whatever this overhead is can be avoided if the array of functions is generated in a comprehension, but I need to investigate further to see if I could write my arbitrary functions in terms of a comprehension. The real question is why does calling a function from a push! generated array incur so much overhead compared to calling it from a comprehension generated array. I don’t really know much about comprehensions, so that’s something I’ll look into tomorrow.

Chris, I don’t really understand why there’s no local equivalent of eval(), I thought the idea of meta programming is that julia code starts out as effectively just a string, which then gets turned into expressions and evaluated, what would be the issue with the following

function explicit(a,b,x,y)
  y = a*x+b
end
explicit(1.23,4.56,7.89,y)

expression = :(1.23*x + 4.56)
function evaluated(expression,x,y)
  y = local_eval(expression)
end
evaluated(expression,7.89,y)

Where you’re effectively just filling in what the code would be in an explicit case. Anyway that’s not fundamentally important, if it can’t be done in julia, it can’t be done. Mostly just a curiosity at this point.

The main issue with the first call time is that the real full expressions are larger and more involved than the toy examples, so when I tried to string generate the full odefunction, there were lines with thousands of characters and even with a 20 species, 200 reaction set, it was taking tens of seconds to compile, and growing as ~n(reactions)^2. But I could certainly try some frankenstein combination of generated explicit code and some for loop sections to see if I can balance the different bottlenecks.

There are use cases where the equations themselves would need to change throughout the overarching solution procedure, requiring new first calls, so I would like to keep that option open.

I think if we can just identify why the comprehension generated array of functions produces no overhead, while my push! generated array of functions does have an overhead, then I think that would be perfect, given that local_eval isn’t an option.

EDIT:
My bad, I completely forgot that the equations themselves would need to be updated fairly frequently because for many of them, the form of the equations depends on something called the electron energy distribution which is calculated on the fly by calling a pre-existing (and industry standard) fortran package. So yeah, I definitely need to keep the compile time to a minimum.

Yes, add parenthesis. DiffEqBiological.jl adds parenthesis to make all of the additions binary operations because a + of 1000 terms will splat and cause both compile and runtime issues.

2 Likes

It’s because the compiler is able to infer the element type of the comprehension generated array as a concrete type. Try calling eltype() on those two different functions arrays. If you generate it using push! you’ll see that the element type is Any. For the comprehension version it’s the type of the anonymous function (which is probably an unfortunately confusing name such as getfield(Main, Symbol("##8#10"))). When the compiler can get concrete type information it generates much better code. However, this optimization is only relevant if all your expressions have the same form; that is, if they are generated from the same (anonymous) function expression in the source code.

In practice, you’ll have varying forms for the functions in your array so they’ll all have different types. The solution to that problem is to hide this unwanted type information behind a compiled function pointer, which I believe is what https://github.com/yuyichao/FunctionWrappers.jl does (already mentioned up thread by Kristoffer). If you do need to solve the problem with an array of functions, this is the way to go.

However, I’m still not convinced that an array of functions is what you really need. Just adding a lot of parentheses to your expressions might do.

1 Like

If there was a local version of eval, this would mean that the set of local variables within a function can’t be known at compile time. If the compiler can’t reason about local variables and their types, it can’t generate fast code.

2 Likes

Which Fortran library? Is it possible to just call this Fortran code to produce the EEDF into a lookup table as necessary, and pass that into your functions which compute reaction rate? That way you won’t actually need to recompile the Julia code.

I’m using the BOLSIG solver (which is an already compiled executable) for the EEDF, and you can certainly generate a lookup table when there’s only a couple of species and you just vary the ratio, but there’s too many species that will at varying times be significant fractions of the total, as well as other system parameters that can affect the EEDF, so it would be impractical to cover enough parameter space with a lookup table.

I think if FunctionWrappers.jl can get rid of some or all of the function call overhead, then that would work perfectly. Could someone direct me to somewhere with instruction on how to actually use FunctionWrappers.jl? There’s nothing in the readme on github and no documentation in the package…

EDIT:
Okay, I managed to find some other people’s code who had used functionwrappers and worked backwards from that. It reduced the overhead to just 20%, with 0 allocations, and correct scaling in parallel. So all in all I think I’m pretty happy with that solution.

it would be impractical to cover enough parameter space with a lookup table

Sure, what I mean is to call BOLSIG dynamically as your simulation progresses to update the current EEDF. By “lookup table” I really mean whatever interpolated form of the distribution function it is that you need in your rate equations. BOLSIG+ docs says it gives you rate coefficients k_i and if that’s all you need I’d just pass the array k[i] of the current values to your ode functions rather than compiling a new version each time.

BOLSIG takes a given mean electric field and reaction chemistry as input and then returns the rate coefficients evaluated for that specific mean electric field (among other things) for the 2-body electron reactions, In my system the mean electric field is a function of time, so each time I call BOLSIG, I generate a lookup table of mean electric fields and their corresponding rate coefficients, I then fit those rate coefficients as a function of mean electric field, and parse those expressions to the ode function. As the composition of the plasma evolves, the original EEDF is no longer accurate and BOLSIG gets called again with the new composition to update everything. Then in addition to that there are many reactions which aren’t parsed to BOLSIG (e.g. 3-body electron reactions, and heavy species reactions) as well as other generation and loss process (e.g. diffusion, inflow, outflow, wall flux, etc.) each with different forms and dependencies that all need to be updated each time step in the ode function.

While I certainly agree that it would be feasible to take the generated code approach, I just think that a) the initial compilation time would be relatively long and not worth the execution performance compared to the FunctionWrappers solution, and b) the code itself would be more of a mishmash, more opaque, less intuitive, and less user friendly, which will make it more difficult to code, debug, and add functionality to down the line, and more difficult for colleagues to modify to suit their needs if I share the code.