Question: how can I evaluate an expression into a method, compile it, and then directly manipulate the compiled version (something lower-level than the AST) to create another, slightly modified method faster than recompiling from scratch?
I need to squeeze out as much performance as possible, because these slightly modified methods will run as part of an evolutionary algorithm.
For the sake of simplicity, let’s say I have this function:
function calc(myinput::Float32)::Float32
    myplus = Float32(1.0)
    mymul = Float32(2.5)
    return (myinput + myplus) * mymul
end
I want to generate different versions of it, for different values of myplus and mymul.
Note: I want different methods, not one method that takes them as parameters.
The common way is to build an Expr:
function get_calc(myplus::Float32, mymul::Float32)::Expr
    return :(
        function calc(myinput::Float32)::Float32
            return (myinput + $myplus) * $mymul
        end
    )
end
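As a minimal sketch of how such an Expr can be turned into a callable method with Base alone: the version below evaluates it into a throwaway module and calls it through Base.invokelatest. The helper name make_calc and the module name CalcSandbox are hypothetical, introduced only for this illustration.

```julia
# Build the method definition as an Expr, with the constants spliced in.
function get_calc(myplus::Float32, mymul::Float32)::Expr
    return :(
        function calc(myinput::Float32)::Float32
            return (myinput + $myplus) * $mymul
        end
    )
end

# Hypothetical helper: evaluate the definition into a fresh module so that
# every variant gets its own independent `calc` function.
function make_calc(myplus::Float32, mymul::Float32)
    sandbox = Module(:CalcSandbox)
    return Core.eval(sandbox, get_calc(myplus, mymul))
end

f = make_calc(1.0f0, 2.5f0)

# invokelatest avoids world-age errors when the freshly defined method is
# called from code that was already running before the definition.
result = Base.invokelatest(f, 2.0f0)   # (2 + 1) * 2.5
```

Every call to make_calc here still pays the full compilation cost on first use, which is exactly the overhead the question is about.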
It’s even possible to use RuntimeGeneratedFunctions.jl to make callable functions from an Expr at runtime without world-age problems.
However, when profiling a run of 10M iterations, the profiler reports that 99% of the time is spent on compilation. Which makes sense, since every variant is a new method that has to be compiled before its first call.
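To see where the time goes for a single variant, a rough sketch like the following times the first call (which includes compilation) separately from a second call to the same function. It uses plain Core.eval and Base.invokelatest rather than RuntimeGeneratedFunctions.jl, and time_variant is a hypothetical helper name; the actual numbers will depend on the machine and Julia version.

```julia
# Hypothetical helper: build one variant, then time its first and second call.
function time_variant(myplus::Float32, mymul::Float32)
    # Anonymous-function variant of the generated code, constants spliced in.
    expr = :(myinput -> (myinput + $myplus) * $mymul)

    t0 = time_ns()
    f = Core.eval(@__MODULE__, expr)
    first_result = Base.invokelatest(f, 2.0f0)    # triggers compilation
    t_first = (time_ns() - t0) / 1e9

    t1 = time_ns()
    second_result = Base.invokelatest(f, 2.0f0)   # reuses the compiled code
    t_second = (time_ns() - t1) / 1e9

    return (; first_result, second_result, t_first, t_second)
end

timing = time_variant(1.0f0, 2.5f0)
println("first call:  ", timing.t_first, " s")
println("second call: ", timing.t_second, " s")
```

If the first call is far slower than the second, the cost is in lowering, inference, and code generation rather than in the arithmetic itself.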
Is there any way to reduce the compilation cost and edit the compiled code directly?
Here is a working end-to-end example that I want to run faster:
using RuntimeGeneratedFunctions
RuntimeGeneratedFunctions.init(@__MODULE__)

function get_calc(myplus::Float32, plusop::Expr, mymul::Float32, mulop::Expr)::Expr
    return :(
        function calc(myinput::Float32)::Float32
            v1 = myinput
            v2 = $myplus
            v3 = $mymul
            $plusop # v4 = v1+v2
            $mulop # v5 = v4*v3
            res = v5
            return res
        end
    )
end

get_calc_func(myplus, plusop, mymul, mulop) =
    @RuntimeGeneratedFunction(@__MODULE__, get_calc(myplus, plusop, mymul, mulop), opaque_closures = false)

function find_calc()
    bestres = typemin(Float32)
    for myplusop in [:(v4 = v1 + v2), :(v4 = v1 - v2), :(v4 = v2 - v1)]
        for mymulop in [:(v5 = v4 * v3), :(v5 = v4 / v3), :(v5 = v3 / v4)]
            for myplus in range(-100f0, 100f0, length=100)
                for mymul in range(-5f0, 5f0, length=30)
                    # Each combination is a brand-new function that must be compiled.
                    mycalc_expr = get_calc_func(myplus, myplusop, mymul, mymulop)
                    for myinput in range(-10, 10, step=50)
                        res = mycalc_expr(myinput)
                        if res > bestres
                            bestres = res
                        end
                    end
                end
            end
        end
    end
    return bestres
end

@time find_calc()
# 71.370488 seconds (242.33 M allocations: 13.223 GiB, 5.51% gc time, 98.26% compilation time)
Note that plusop and mulop are generic Expr objects that may operate on any local variable inside calc(). They are not limited to specific operations, so they can’t be represented by an operation tree etc.
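To illustrate what "generic" means here, the sketch below splices two arbitrary statements into the same template. The particular statements and the module name OpSandbox are made up for this example, and plain Core.eval stands in for RuntimeGeneratedFunctions.jl to keep it dependency-free.

```julia
# Same template as in the end-to-end example.
function get_calc(myplus::Float32, plusop::Expr, mymul::Float32, mulop::Expr)::Expr
    return :(
        function calc(myinput::Float32)::Float32
            v1 = myinput
            v2 = $myplus
            v3 = $mymul
            $plusop
            $mulop
            res = v5
            return res
        end
    )
end

# Hypothetical statements: each may read any local variable defined so far,
# call functions, or branch, as long as it assigns v4 (resp. v5).
plusop = :(v4 = abs(v1) + v2 * v3)
mulop = :(v5 = v4 > 0 ? v4 * v3 : v1)

f = Core.eval(Module(:OpSandbox), get_calc(1.0f0, plusop, 2.0f0, mulop))
res = Base.invokelatest(f, -3.0f0)
# v4 = abs(-3) + 1 * 2 = 5, v5 = 5 * 2 = 10
```

Because the spliced statements can contain calls and control flow, swapping one of them changes the body of the method rather than just a constant or an operator.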