Extremely slow `invoke` when inlined

Consider the following code:

```julia
function e()
    els = Any[[1], 10.0]
    return rand(els)
end

function f()
    a = e()
    return g(a)
end

function g(a)
    return invoke(*, Tuple{supertype(typeof(a)),Float64}, a, 10.0)
end
```

The idea is: `g` is a very generic fallback (in the original code it is generated by a macro for dozens of functions), `e` is a type-unstable computation whose result can only be inferred as `Any`, and `f` combines both.
Since `f` is type-unstable, I don’t expect great performance, but the measured time is far worse than I would expect:

```julia
julia> using BenchmarkTools

julia> @btime f()
  260.207 μs (351 allocations: 19.44 KiB)
1-element Vector{Float64}:
 10.0
```

It can be demonstrated that Julia decided to inline `g` into `f`, which was a very bad decision. After marking `g` as `@noinline`, the performance is reasonable:

```julia
julia> @btime f()
  72.831 ns (6 allocations: 244 bytes)
1-element Vector{Float64}:
 10.0
```
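For reference, here is a minimal sketch of the `@noinline` variant described above. It reconstructs the change from the description and is not copied from the original macro-generated code; `f` is written in short form only to keep the sketch compact.

```julia
function e()
    els = Any[[1], 10.0]
    return rand(els)   # result can only be inferred as Any
end

# Same generic fallback as before, but excluded from inlining at every call site.
@noinline function g(a)
    return invoke(*, Tuple{supertype(typeof(a)),Float64}, a, 10.0)
end

f() = g(e())
```

One way to check whether `g` was inlined is to inspect the output of `@code_typed f()` and see whether a call to `g` is still present.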

So the `@noinline` version is about 3600x faster (I’m using Julia 1.8.1).
Now my question is: is there any way to fix this performance issue other than marking `g` as `@noinline`? It is perfectly fine to inline `g` into type-stable functions; in some cases it can even improve performance a little.


Good find! Julia’s cost model really shouldn’t be trying to inline this.

Pull Request #47671 on JuliaLang/julia ("improve inlining cost analysis for invoke", by oscardssmith) fixes it.
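On Julia versions without that fix, one possible workaround is a call-site `@noinline` annotation, which Julia supports from version 1.8 onward. This is a sketch under that assumption and is not part of the PR; its effect on the benchmark above has not been measured here. The caller `h` is a hypothetical type-stable function added only for illustration.

```julia
function e()
    els = Any[[1], 10.0]
    return rand(els)
end

# g keeps the default inlining behavior, so type-stable callers may still inline it.
function g(a)
    return invoke(*, Tuple{supertype(typeof(a)),Float64}, a, 10.0)
end

function f()
    a = e()
    # Call-site annotation (Julia 1.8+): asks the compiler not to inline this call only.
    return @noinline g(a)
end

# Hypothetical type-stable caller; no annotation, so inlining is left to the compiler.
h(x::Float64) = g(x)
```

The design choice is to place the annotation on the one type-unstable call instead of on the definition of `g`, so other callers are unaffected.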


Thanks, that was a very quick response, and with a PR too 🙂
