I’m trying to speed up function tracing in Umlaut.jl. According to the profiler, about 30% of the time is spent in a generic function, mkcall(), which records the call, does some argument transformation, and then simply invokes the original function:
function mkcall(fn::Any, args::Vararg{Any}; kwargs...)
    fn_, args_ = ...
    return fn_(args_...)
end
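Since the body above is elided, here is a toy, self-contained stand-in for the same record/transform/invoke pattern, in case anyone wants something runnable to experiment with. It is only a sketch and not Umlaut's actual implementation: `CALL_LOG`, `Wrapped`, `unwrap` and `toy_mkcall` are hypothetical names made up for illustration.

```julia
# Toy stand-in for the pattern above; NOT Umlaut's real code.
# All names here (CALL_LOG, Wrapped, unwrap, toy_mkcall) are hypothetical.
const CALL_LOG = Any[]

# Hypothetical placeholder for a traced value.
struct Wrapped
    value::Any
end

unwrap(x) = x
unwrap(x::Wrapped) = x.value

function toy_mkcall(fn::Any, args::Vararg{Any}; kwargs...)
    push!(CALL_LOG, (fn, args))   # record the call
    fn_ = unwrap(fn)              # transform the function ...
    args_ = map(unwrap, args)     # ... and its arguments
    # kwargs are accepted but not forwarded, mirroring the snippet above
    return fn_(args_...)          # invoke the original function
end

toy_mkcall(+, Wrapped(1), 2)
toy_mkcall(sum, Wrapped([1, 2, 3]))
```

Each call here has a different combination of function and argument types, which is the situation where per-signature specialization would kick in.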
Within a mkcall() invocation, most of the time is spent in abstract interpretation and type inference. If I understand correctly, this is because the Julia compiler specializes mkcall() for each combination of function and argument types:
julia> using MethodAnalysis
julia> methodinstances(mkcall)
7-element Vector{Core.MethodInstance}:
MethodInstance for Umlaut.mkcall(::typeof(_getfield), ::Variable, ::Int64)
MethodInstance for Umlaut.mkcall(::typeof(map), ::typeof(unthunk), ::Variable)
MethodInstance for Umlaut.mkcall(::typeof(tuple), ::Variable, ::Variable)
MethodInstance for Umlaut.mkcall(::Function, ::Variable, ::Vararg{Variable})
MethodInstance for Umlaut.mkcall(::Function, ::Function, ::Vararg{Any})
MethodInstance for Umlaut.mkcall(::typeof(tuple), ::Vararg{Any})
MethodInstance for Umlaut.mkcall(::Function, ::Variable, ::Vararg{Any})
So I tried to avoid excessive compilation using @nospecialize, as well as turning off inlining and constant propagation, as suggested here:
@noinline Base.@constprop :none function mkcall(fn::Any, args::Vararg{Any}; kwargs...)
    @nospecialize
    fn_, args_ = ...
    return fn_(args_...)
end
If I then invoke mkcall a few times manually, everything works as expected and Julia generates only one specialization. But when I run it on a real case (specifically, tracing Metalhead.ResNet(18)), I still get a lot of method instances:
julia> methodinstances(mkcall)
6-element Vector{Core.MethodInstance}:
MethodInstance for Umlaut.mkcall(::typeof(_getfield), ::Variable, ::Int64)
MethodInstance for Umlaut.mkcall(::Any, ::Any)
MethodInstance for Umlaut.mkcall(::typeof(map), ::typeof(unthunk), ::Variable)
MethodInstance for Umlaut.mkcall(::typeof(tuple), ::Variable, ::Variable)
MethodInstance for Umlaut.mkcall(::typeof(tuple), ::Vararg{Any})
MethodInstance for Umlaut.mkcall(::Any, ::Any, ::Vararg{Any})
Why doesn’t @nospecialize work in this case? Is there a better way to speed up compilation?
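One thing that may be relevant: the Julia documentation notes that @nospecialize affects code generation but not type inference, which would be consistent with inference still creating method instances for the signatures it sees at call sites. Julia 1.10 and later provide Base.@nospecializeinfer to additionally suppress inference on such arguments. Below is a minimal sketch of how the two combine, assuming Julia 1.10+; `toy_call` is a hypothetical function, and whether this actually reduces compilation time in the Umlaut case is untested.

```julia
# Requires Julia 1.10 or later (Base.@nospecializeinfer does not exist before that).
# `toy_call` is a hypothetical stand-in, not part of Umlaut.
@noinline Base.@nospecializeinfer function toy_call(@nospecialize(fn), @nospecialize(args...))
    # The arguments are treated as their declared types (here: Any)
    # for both code generation and inference.
    return fn(args...)
end

toy_call(+, 1, 2)
toy_call(string, "a", :b, 3.0)
```

The trade-off is that the call to `fn` inside the body becomes a dynamic dispatch with no type information, which seems acceptable for a tracing wrapper that forwards to arbitrary functions anyway.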