Executing a vector of expressions: some benchmarks

My code generates a vector of expressions.
What is the fastest way to execute them as a function?
Below are some ideas with their benchmarks.

@btime fun_func($a)            # 6.500 ns (0 allocations: 0 bytes)
@btime fun_blk($a)             # 278.788 ns (0 allocations: 0 bytes)
@btime fun_foreach($evec, $a)  # 541.900 μs (94 allocations: 4.88 KiB)
@btime fun_eval($evec, $a)     # 105.889 ns (1 allocation: 32 bytes)
@btime fun_file($a)            # 5.300 ns (0 allocations: 0 bytes)

So, looking only at runtimes, writing the function to a file and including it is fastest.
How can I avoid writing to a physical file, e.g. by using an IOBuffer?
Alternatives?

using BenchmarkTools

a = [1,1,0,0]
evec = [:(a[3] = a[1] + a[2]), :(a[4] = a[2] + a[3]), :a]

# as function
function fun_func(a)                 
    @inbounds a[1] + a[2]
    @inbounds a[4] = a[2] + a[3]
    a
end

# turn vector of expressions into a function
function genfun(expr, args::Tuple, gs=gensym())
    eval(Expr(:function,Expr(:call, gs, args...),expr))
    (args...) -> Base.invokelatest(eval(gs), args...)
end
eblk = Expr(:block,evec...)
fib = genfun(eblk, (:a,))
fun_blk = genfun(eblk, (:a,))

# run evec using foreach
function fun_foreach(evec, a)        
    @inbounds foreach(eval, evec)
end

# run evec directly using eval
function fun_eval(evec, a)           
    @inbounds eval(evec)
end

# evec -> String -> file -> include
svec = string.(collect(evec))
pushfirst!(svec, "function fun_file(a)")
push!(svec, "a")
push!(svec, "end")
filename = "fun_file.jl"
open(filename,"w") do f
    for s in svec
        println(f, s)
    end
end
include(filename)

@btime fun_func($a)            # 6.500 ns (0 allocations: 0 bytes)
@btime fun_blk($a)             # 278.788 ns (0 allocations: 0 bytes)
@btime fun_foreach($evec, $a)  # 541.900 μs (94 allocations: 4.88 KiB)
@btime fun_eval($evec, $a)     # 105.889 ns (1 allocation: 32 bytes)
@btime fun_file($a)            # 5.300 ns (0 allocations: 0 bytes)

Don’t do this if you care about performance. Find another way to accomplish what you want.

In the case of fun_file, you only timed the cost of evaluating fun_file after it was compiled (because @btime runs the code multiple times and prints the minimum time), whereas in all the other cases you timed the cost of eval (which includes code generation).

The analogous thing, without using a file, would be to first eval a function expression once that includes all your generated expressions, and then subsequently call that function in @btime. But that won’t work if you are generating new expressions at runtime over and over. (Unless you call the expressions many times for each time they are generated, in which case the cost of eval-ing a function once may be negligible.)
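
For instance, a minimal sketch of that idea (reusing the evec and a defined above; the name f is just for illustration): eval the function expression once up front, then benchmark only the call.

f = eval(Expr(:function, Expr(:tuple, :a), Expr(:block, evec...)))  # code generation paid once, outside the benchmark
@btime $f($a)   # only the call itself is timed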

Except in very narrow circumstances (e.g. you are writing a Julia interpreter GUI), doing runtime eval is a good sign that you need to re-think your approach.

Thanks @stevengj!

Don’t do this if you care about performance. Find another way to accomplish what you want.

My use case comes from mapping sparse-matrix code to Verilog, which only has vectors.
This code is being converted from Matlab to Julia; for debugging, it should also run in Julia.

Unless you call the expressions many times once they are generated

Exactly, generated only once at the start, like in some old circuit simulators.

doing runtime eval is a good sign that you need to re-think your approach

The generate / include-file approach runs just like the original hand-written function, without eval.
I was struggling with IOBuffer’s read and write; any hint on how to use it in place of the file?
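
Perhaps include_string could replace the file step? A rough, untested sketch, reusing the svec built above:

fun_str = join(svec, "\n")            # the same lines that were written to the file
include_string(@__MODULE__, fun_str)  # parse and evaluate them without touching the disk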

Don’t work with strings; almost never build expressions using strings. Work with expressions instead. Like this, for example:

genfun_expr(exprs, args...) =
    Expr(:function, Expr(:tuple, args...), Expr(:block, exprs...))

myfunc = eval(genfun_expr(evec, :a))
@btime $myfunc($a)   # 2.863 ns (0 allocations: 0 bytes)
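
For reference, genfun_expr(evec, :a) here builds an anonymous-function expression that prints roughly as:

genfun_expr(evec, :a)
# :(function (a,)
#       a[3] = a[1] + a[2]
#       a[4] = a[2] + a[3]
#       a
#   end)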

As in the example above, I do use Expr to assemble the serial operations (full loop unrolling and filtering of zero-ops).
Strings are only used for the first and last line of the function, and of course to print the function body. OK?

No you’re not, because you’re also unparsing the expressions into strings and concatenating them. Why output a string, then re-parse it, then eval it, when you can just generate and evaluate an expression directly, as in my genfun_expr example?

(Outputting and parsing strings is generally much less reliable than expressions for metaprogramming because strings lose a lot of information and may lead to unexpected results when composed.)
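
One small illustration of that information loss (a hypothetical example, not from the code in this thread): values spliced into an Expr survive as the objects themselves, but after a string round trip they become fresh literals.

v = [1, 2]
ex  = :(push!($v, 3))            # the actual array object v is embedded in the expression
ex2 = Meta.parse(string(ex))     # re-parsed from "push!([1, 2], 3)": a new array literal, not v
eval(ex);  v                     # v is now [1, 2, 3]
eval(ex2); v                     # v unchanged; this pushed onto a different array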

As said before, the file approach does not use eval() and is as fast as hand-written code (because it is the same code).

you can just generate and evaluate an expression

fun_file is 20x faster than fun_eval; any suggestions to add to the benchmarks?

The Problems

The point others are making here is that this is an apples-and-oranges comparison. Your fun_foreach and fun_eval functions are operating on raw expressions, while the others have the benefit of having already processed your expressions before they are called.

In your case, fun_func and fun_file are literally identical, and any benchmarking differences are coincidental. The only difference is that fun_file went on a roundabout trip through your hard drive as a file first (but that part was excluded from the benchmark). If you had actually timed the process of writing and including the fun_file version, it would be extremely slow by comparison.
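
A rough way to see this (a hypothetical one-shot timing, reusing the svec from the original post with a second file name):

@time begin
    open("fun_file2.jl", "w") do f   # hypothetical file name
        foreach(s -> println(f, s), svec)
    end
    include("fun_file2.jl")
end   # orders of magnitude slower than the nanosecond-scale call timings above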

You have a typo in fun_func where you don’t assign to a[3] in the first line.

You have a typo where you pass evec rather than eblk to fun_eval.

eval does not work the way you think. Any expression that is eval’d is evaluated in the GLOBAL scope of the current module. That means that your attempt to pass a to fun_foreach and fun_eval is fruitless. The fact that they work at all is merely a coincidence resulting from a being a global variable (and thus accessible in the global scope where the eval occurs).
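
A tiny demonstration of that scoping behavior (hypothetical names, separate from the benchmarks):

b = [0, 0]
function eval_scope_demo(x)
    eval(:(b[1] = 99))   # runs in global scope: it mutates the global b and ignores the argument x
    x
end
eval_scope_demo([5, 5])  # returns [5, 5] untouched; afterwards b == [99, 0]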

For the scoping reasons I just discussed, I am virtually certain the @inbounds annotations in fun_foreach and fun_eval are not doing anything. But a few missing inbounds annotations are not really the biggest issues here.

Solutions

As suggested by others: if you are writing functions to call them once, you should forego any hopes of “fast” run times in Julia. If you’ll be calling them many times, you can mostly get away with this. However, I would encourage you to figure out whether you can avoid building expressions (or worse, strings) to call in eval altogether.

Your genfun approach would be my preference (well, fun_func actually is, but you seem to be indicating that it isn’t viable since you’re generating expressions). Part of the poor performance of fun_blk is a benchmarking artifact of the fact that fun_blk is a variable (bound to an anonymous function) rather than a function like all the others. As such, you need to interpolate it into the benchmarking expression.

I’ve taken the liberty of updating your tests, although I dropped fun_foreach and fun_eval because of the not-actually-using-a issue.

using BenchmarkTools

a = [1,1,0,0]
evec = [:(@inbounds a[3] = a[1] + a[2]), :(@inbounds a[4] = a[2] + a[3]), :a]

# as function
function fun_func(a)                 
    @inbounds a[3] = a[1] + a[2]
    @inbounds a[4] = a[2] + a[3]
    a
end

function genfun(expr, args::Tuple, gs=gensym())
    eval(Expr(:function,Expr(:call, gs, args...),expr))
    (args...) -> Base.invokelatest(eval(gs), args...)
end

function genfun_alt(expr, args::Tuple, gs=gensym())
    fun = eval(Expr(:function,Expr(:call, gs, args...),expr))
    (args...) -> Base.invokelatest(fun, args...)
end

function genfun_noinvoke(expr, args::Tuple, gs=gensym())
    fun = eval(Expr(:function,Expr(:call, gs, args...),expr))
end

eblk = Expr(:block,evec...)
fun_blk = genfun(eblk, (:a,))
fun_blk_alt = genfun_alt(eblk, (:a,))
fun_blk_noinvoke = genfun_noinvoke(eblk, (:a,))

# fun_func is a function (global const) so interpolation is unnecessary
# fun_blk and friends are global variables so need to be interpolated
@btime fun_func($a)          # 6.000 ns (0 allocations: 0 bytes)
@btime $fun_blk($a)          # 149.095 ns (0 allocations: 0 bytes)
@btime fun_blk($a)           # 171.982 ns (0 allocations: 0 bytes) <- not interpolated
@btime $fun_blk_alt($a)      # 34.038 ns (0 allocations: 0 bytes)
@btime $fun_blk_noinvoke($a) # 6.400 ns (0 allocations: 0 bytes)

Notice that I added the @inbounds annotations directly to the relevant expressions. I defined an alternative genfun_alt that avoids an excess eval and a genfun_noinvoke that does not use invokelatest. There are technical reasons invokelatest can be necessary, but note that skipping it saves 150ns of overhead. I’m not really qualified to explain when invokelatest might and might not be required, but it has something to do with world age. Unfortunately, I vaguely suspect that your stated use case (rather than this demo) might be one of those cases (the same issue would also thwart your fun_file).

As I explained, fun_file is not actually calling eval (not invoking the compiler) during the benchmark. fun_eval is.

Thanks @stevengj and @mikmoore for bearing with me,
for debugging my code, and for calling out the benchmarking fallacies.

fun_blk_noinvoke() is the one.

I found the code with the now omitted line in https://discourse.julialang.org/t/how-do-i-create-a-function-from-an-expression. Is there a difference in version or usage?

There’s no need to give the function a name with gensym(). You might as well just make it anonymous as in my example above.

Here is an example where invokelatest is required. I imagine this is more like what you’ll be doing in practice.

genfun(expr, argnames) = eval(Expr(:function, Expr(:tuple, argnames...), expr))

function makeandcalllatest(expr, argnames, argvals)
    # make the function
    fun = genfun(expr, argnames)
    # invokelatest runs the call in the latest world age, where fun already exists
    Base.invokelatest(fun, argvals...)
end

function makeandcall(expr, argnames, argvals)
    # make the function
    fun = genfun(expr, argnames)
    # call it directly: this runs in the caller's (older) world age, where fun does not yet exist
    fun(argvals...)
end

makeandcalllatest(:(x+y+1),(:x,:y),(1,2))
# returns 4
makeandcall(:(x+y+1),(:x,:y),(1,2))
# ERROR: MethodError: no method matching (::var"#6#7")(::Int64, ::Int64)
# The applicable method may be too new: running in world age 31332, while current world is 31333.

In the earlier demonstrations, the fact that we were coming back to the REPL after every operation allowed the world age to advance. In the MWE I created here, there is no such opportunity – a world age error occurs because the function we’re trying to call didn’t exist when we made the call to makeandcall. There are important technical reasons for these world age issues (Julia doesn’t throw this error to be funny) but someone else would have to explain them because I don’t know.

Go ahead with the non-invokelatest solution for now. If you run into the world age issue I produced here, you’ll need to use invokelatest. The overhead of invokelatest appears to be relatively small, so (unless these functions are very simple and are called millions of times each) the runtime will likely be dominated by other parts (genfun is quite expensive). If you ultimately want to avoid invokelatest (assuming that’s even possible), you’ll need someone more knowledgeable than me to help you.
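
For a rough sense of that genfun cost, a one-off @time (rather than @btime, since every call eval’s a new method; numbers will vary):

@time genfun(:(x + y + 1), (:x, :y))   # the cost of eval'ing one new method, paid on every call to genfun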
