Hey, I am trying to find out if I can help with compile-time issues by doing a bunch of profiling on DiffEq. This has been coming up quite a bit and @Datseris keeps bugging me about it, so I am trying to find out what I can do. It was mentioned to me in the Slack that I can profile the timings for compiling specific function signatures using SnoopCompile.jl, since its snoop data returns a column of times. Using that, I started doing some profiling on a simple ODE call:
using SnoopCompile
# Record compile timings for everything in this block into compiles.csv
SnoopCompile.@snoop "compiles.csv" begin
    using OrdinaryDiffEq
    # Lorenz system, in-place form
    function f(du,u,p,t)
        du[1] = p[1]*(u[2]-u[1])
        du[2] = u[1]*(p[2]-u[3]) - u[2]
        du[3] = u[1]*u[2] - p[3]*u[3]
    end
    u0 = [1.0,0.0,0.0]
    tspan = (0.0,1.0)
    p = (10.0,28.0,8/3)
    prob = ODEProblem(f,u0,tspan,p)
    Base.GC.gc()
    sol = solve(prob,Tsit5())
end
At first I timed the commit "decrease standard allocs" (SciML/OrdinaryDiffEq.jl@0867ad8 on GitHub). This produced compiles.csv, which can be found here: compiles.csv · GitHub (maybe there’s a way to make Gists handle CSVs better?). Then I started knocking out the top compilation hits inside of OrdinaryDiffEq one at a time, leading to the compilesX.csv files in the Gist. This was mostly to start highlighting how much of the time comes from different parts of OrdinaryDiffEq.jl, in particular the perform_step! function which is the core, and how much of it is due to things in Base.
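For anyone who wants to rank the hits in one of these files, here is a minimal sketch using only Base. It assumes each line of the snoop file has the form `<time>\t"<signature>"` (a time, a tab, then the quoted signature); that layout is an assumption and may differ between SnoopCompile versions. The helper name `top_compiles` and the sample timings are made up for illustration and are not taken from the Gist.

```julia
# Hypothetical helper: return the n slowest entries from snoop-style data.
# Assumed line format (not verified for every SnoopCompile version):
#   <time>\t"<signature>"
function top_compiles(io::IO; n::Int = 10)
    rows = Tuple{UInt64,String}[]
    for line in eachline(io)
        parts = split(line, '\t'; limit = 2)
        length(parts) == 2 || continue            # skip malformed lines
        t = tryparse(UInt64, strip(parts[1]))
        t === nothing && continue                 # skip lines without a numeric time
        push!(rows, (t, String(strip(parts[2], '"'))))
    end
    sort!(rows; by = first, rev = true)           # largest compile time first
    return rows[1:min(n, length(rows))]
end

# Made-up sample data, only to show the expected shape of the input
sample = """
1200\t"Tuple{typeof(Base.unsafe_copyto!), Array{Float64, 1}, Int64, Array{Float64, 1}, Int64, Int64}"
300\t"Tuple{typeof(Base.getindex), Array{Float64, 1}, Int64}"
"""

for (t, sig) in top_compiles(IOBuffer(sample); n = 2)
    println(t, "  ", sig)
end
```

For a real file, the same helper could be called as `open(io -> top_compiles(io; n = 20), "compiles.csv")`, again assuming the line format above.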
Actionable Results
Some things that did pop up as very high on the timings were
Tuple{typeof(Base.unsafe_copyto!), Array{Float64, 1}, Int64, Array{Float64, 1}, Int64, Int64}
and
Tuple{typeof(Base.throw_boundserror), Base.Broadcast.Broadcasted{Nothing, Tuple{Base.OneTo{Int64}}, typeof(Base.muladd), Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(DiffEqBase.ODE_DEFAULT_NORM)
The former only showed up near the top (when sorted by the time column) in one or two of the CSVs, so I’m not sure if it’s noise, but could that call potentially be added to the Base system image? The second one is the boundserror part of the broadcast machinery. @mbauman mentioned that the compile time for these could be greatly reduced by applying @nospecialize to them.
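As a rough illustration of the @nospecialize idea, here is a minimal sketch. The functions `throw_my_boundserror` and `checked_first` are hypothetical stand-ins and are not Base's actual definitions. The point is that an error path marked this way should not need a fresh specialization for every (deeply nested) argument type it is called with, while the hot path stays specialized.

```julia
# Hypothetical stand-in for an error path; this is NOT Base's throw_boundserror.
# @nospecialize asks the compiler not to specialize on the concrete types of A and I.
@noinline function throw_my_boundserror(@nospecialize(A), @nospecialize(I))
    throw(BoundsError(A, I))
end

# Hypothetical caller: the fast path is still specialized on typeof(A);
# only the rarely taken error path is kept generic.
function checked_first(A::AbstractVector)
    isempty(A) && throw_my_boundserror(A, firstindex(A))
    return @inbounds A[firstindex(A)]
end

checked_first([1.0, 2.0])   # takes the fast path and never reaches the error helper
```

Whether this helps in the broadcast case would have to be measured by re-running the snoop above and comparing the timings.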
Now, this is my first foray into this, so what I got is noisy and I still need some help refining it. Please guide me on how I can be a helpful source of data here.