Question: how can I evaluate an expression into a method, compile it, and then directly manipulate the compiled version (at a level lower than the AST) to create another, slightly modified method faster than recompiling from scratch?
I need to squeeze out as much performance as possible, because the slightly modified methods will run as part of an evolutionary algorithm.
For the sake of simplicity, let’s say I have this function:
function calc(myinput::Float32)::Float32
    myplus = Float32(1.0)
    mymul = Float32(2.5)
    return (myinput + myplus) * mymul
end
I want to generate different versions of it, for different values of myplus and mymul.
Note: I want separate methods, not a single method that takes these values as parameters.
The common way is to build an Expr:
function get_calc(myplus::Float32, mymul::Float32)::Expr
    return :(
        function calc(myinput::Float32)::Float32
            return (myinput + $myplus) * $mymul
        end
    )
end
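For reference, here is a minimal sketch of how such an Expr can be turned into something callable using only Base. The helper name run_calc is hypothetical (it is not part of the post), and the sketch assumes the code is evaluated at global scope in the current module. Because eval defines the method after run_calc has started running, the call goes through Base.invokelatest to avoid a world-age error.

```julia
# Builds the method definition as an Expr, as in the post.
function get_calc(myplus::Float32, mymul::Float32)::Expr
    return :(
        function calc(myinput::Float32)::Float32
            return (myinput + $myplus) * $mymul
        end
    )
end

# Hypothetical helper (not from the post): evaluate the Expr and call the result.
# eval defines (or redefines) `calc` in the current module and returns the function.
# Base.invokelatest is used because the freshly defined method is newer than
# the world age in which run_calc itself is executing.
function run_calc(myplus::Float32, mymul::Float32, myinput::Float32)
    f = eval(get_calc(myplus, mymul))
    return Base.invokelatest(f, myinput)
end

println(run_calc(1.0f0, 2.5f0, 1.0f0))  # (1 + 1) * 2.5
println(run_calc(2.0f0, 0.5f0, 4.0f0))  # redefines calc, then (4 + 2) * 0.5
```

Every call to run_calc re-evaluates the definition and triggers a fresh compilation, and the dynamic dispatch through invokelatest adds its own overhead on top.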
It’s even possible to use RuntimeGeneratedFunctions.jl to make callable methods from an Expr at runtime without world-age problems.
However, when I profile this over 10M runs, the profiler reports that 99% of the time is spent on compilation, which makes sense.
Is there any way to reduce compilation and edit the compiled code directly?
Here is a working end-to-end example that I want to run faster:
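The compilation cost can be seen in isolation with a rough sketch like the following, which uses only Base. The helper make_calc is hypothetical, and the absolute numbers will depend on the machine and Julia version; the point is only that the first call of a freshly generated function pays for inference and code generation, while a repeated call with the same argument type does not.

```julia
# Hypothetical helper (not from the post): build a fresh anonymous function
# with the two constants spliced into its body.
make_calc(myplus::Float32, mymul::Float32) =
    eval(:(myinput -> (myinput + $myplus) * $mymul))

f = make_calc(1.0f0, 2.5f0)

# First call: includes type inference and native code generation for f.
t_first = @elapsed Base.invokelatest(f, 1.0f0)

# Second call with the same argument type: reuses the already compiled code.
t_second = @elapsed Base.invokelatest(f, 1.0f0)

println("first call:  ", t_first, " s")
println("second call: ", t_second, " s")
```

In a search loop that generates a new function for every candidate, each candidate only ever pays the "first call" price, which is consistent with the profile being dominated by compilation.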
using RuntimeGeneratedFunctions
RuntimeGeneratedFunctions.init(@__MODULE__)
function get_calc(myplus::Float32, plusop::Expr, mymul::Float32, mulop::Expr)::Expr
    return :(
        function calc(myinput::Float32)::Float32
            v1 = myinput
            v2 = $myplus
            v3 = $mymul
            $plusop # v4 = v1+v2
            $mulop # v5 = v4*v3
            res = v5
            return res
        end
    )
end
get_calc_func(myplus, plusop, mymul, mulop) =
    @RuntimeGeneratedFunction(@__MODULE__, get_calc(myplus, plusop, mymul, mulop), opaque_closures = false)
function find_calc()
    bestres = typemin(Float32)
    for myplusop in [:(v4 = v1 + v2), :(v4 = v1 - v2), :(v4 = v2 - v1)]
        for mymulop in [:(v5 = v4 * v3), :(v5 = v4 / v3), :(v5 = v3 / v4)]
            for myplus in range(-100f0, 100f0, length=100)
                for mymul in range(-5f0, 5f0, length=30)
                    mycalc_expr = get_calc_func(myplus, myplusop, mymul, mymulop)
                    for myinput in range(-10f0, 10f0, length=50)
                        res = mycalc_expr(myinput)
                        if res > bestres
                            bestres = res
                        end
                    end
                end
            end
        end
    end
    return bestres
end
@time find_calc()
# 71.370488 seconds (242.33 M allocations: 13.223 GiB, 5.51% gc time, 98.26% compilation time)
Note that plusop and mulop are generic Exprs that can operate on any local variable inside calc(). They are not limited to specific operations, so they can’t be represented by an operation tree or a similar fixed structure.
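To illustrate what "generic Expr" means here, the sketch below shows that such an operation is ordinary data that can be built or mutated programmatically, using only Base. The particular mutation (swapping the operator and reading a different local) is a hypothetical example, not something taken from my actual algorithm.

```julia
# Quoted form, as used in the example above.
plusop = :(v4 = v1 + v2)

# The same expression built from its parts:
# an assignment whose right-hand side is a call to +.
plusop_built = Expr(:(=), :v4, Expr(:call, :+, :v1, :v2))
@assert plusop == plusop_built

# Hypothetical mutation: change the operator and make the op read
# a different local variable (v3 instead of v2).
mutated = copy(plusop)             # copy also copies nested Exprs
mutated.args[2].args[1] = :-       # operator of the call
mutated.args[2].args[3] = :v3      # second operand of the call
println(mutated)
```

Any Expr produced this way can be spliced into get_calc in place of plusop or mulop, which is why a fixed operation tree is not enough to describe the search space.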