I am currently profiling the search speed of SymbolicRegression.jl. In particular I am attempting to speed up this PR, which integrates the package with DynamicExpressions.jl.
I am using the following code for profiling:
```julia
using SymbolicRegression

X = randn(Float32, 5, 100)
y = 2 * cos.(X[4, :]) + X[1, :] .^ 2 .- 2

options = SymbolicRegression.Options(;
    binary_operators=(+, *, /, -),
    unary_operators=(cos, exp),
    npopulations=20,
    max_evals=30000,
)

# Warm-up call, so that initial compilation is not included in the profile:
hall_of_fame = EquationSearch(X, y; options=options, multithreading=false, numprocs=0);

@profview begin
    for i in 1:10
        EquationSearch(X, y; niterations=40, options=options, multithreading=false, numprocs=0)
    end
end
```
which should act as a basic measure of the search speed over 10 random initializations.
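(As a sanity check that the warm-up call really has removed first-call compilation, I believe `@time` on Julia ≥ 1.8 reports the percentage of runtime attributed to compilation; a second, already-warmed-up call should show whether the cost persists. This snippet is not part of my original script, just the check I have in mind:)

```julia
# Hypothetical sanity check: after the warm-up call above, @time on a fresh
# search reports how much of the runtime Julia attributes to compilation
# (printed as "% compilation time" on Julia >= 1.8).
@time EquationSearch(X, y; niterations=40, options=options, multithreading=false, numprocs=0);
```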
Now, when I look at the flamegraph and zoom in, I see this:
Many of these calls make sense - I expect most of the time to be spent on the actual evaluation of expressions. However, a significant fraction of the time (~30%) is spent in the type inference function `typeinf_ext_toplevel` and its children.
When I click on one of these frames and zoom in, I am not given any answers, as none of these functions indicate where in my library the type inference is being triggered:
So, my question is: how can I figure out where this type inference is actually occurring, so that I can patch it and get the (hopefully) ~30% speedup?
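In case it helps, here is a sketch of one approach I'm considering: using SnoopCompile's `@snoopi_deep` to attribute inference time to specific method instances. This assumes SnoopCompileCore/SnoopCompile are installed, and I haven't yet confirmed it captures the inference happening inside `EquationSearch`:

```julia
using SnoopCompileCore  # must be loaded before running the workload

# Record all type-inference activity during one search:
tinf = @snoopi_deep EquationSearch(X, y; niterations=40, options=options,
                                   multithreading=false, numprocs=0)

using SnoopCompile  # analysis tools for the recorded timing tree

# flatten(tinf) returns inference timings sorted ascending by cost,
# so the last entries are the most expensive method instances:
flat = flatten(tinf)
flat[end-9:end]
```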
Thanks!
Miles