Extremely slow `invoke` when inlined

mateuszbaran · November 22, 2022, 8:16pm

Consider the following code:

function e()
    els = Any[[1], 10.0]
    return rand(els)
end

function f()
    a = e()
    g(a) 
end

function g(a)
    return invoke(*, Tuple{supertype(typeof(a)),Float64}, a, 10.0)
end

The idea is: g is a very generic fallback (in the original code generated by a macro for dozens of functions), e is a type-unstable computation whose result can only be inferred as Any, and f combines both.
Since f is type-unstable, I don’t expect great performance, but the measured time is far worse than what I would expect:

julia> using BenchmarkTools

julia> @btime f()
  260.207 μs (351 allocations: 19.44 KiB)
1-element Vector{Float64}:
 10.0

It can be demonstrated that Julia decided to inline g, which was a very bad decision. After marking it as @noinline the performance is reasonable:

julia> @btime f()
  72.831 ns (6 allocations: 244 bytes)
1-element Vector{Float64}:
 10.0

So, about 3600x faster (I’m using Julia 1.8.1).
Now my question is: is there any way to fix this performance issue other than marking g as @noinline? It is perfectly fine to inline g into type-stable functions, it can even improve performance a little in some cases.
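For reference, the `@noinline` workaround described above can be sketched as follows. The definitions of `e` and `g` are the ones from this post. The second variant, `f_callsite`, is a hypothetical name and an assumption not taken from the thread: it relies on the call-site form of `@noinline`, available from Julia 1.8, which blocks inlining only at that one call and leaves `g` free to be inlined into type-stable callers. Whether it recovers the same speedup was not measured here.

```julia
function e()
    els = Any[[1], 10.0]
    return rand(els)          # result type can only be inferred as Any
end

# Variant 1: g is never inlined into any caller.
@noinline function g(a)
    return invoke(*, Tuple{supertype(typeof(a)),Float64}, a, 10.0)
end

function f()
    a = e()
    return g(a)
end

# Variant 2 (assumption, Julia 1.8+): same body as g, but without a
# definition-site annotation, so the compiler may still inline it elsewhere.
function g_inlinable(a)
    return invoke(*, Tuple{supertype(typeof(a)),Float64}, a, 10.0)
end

function f_callsite()
    a = e()
    # Call-site annotation: only this particular call is kept out of line.
    return @noinline g_inlinable(a)
end
```

Both variants should be checked with `@btime` on the Julia version in use, since the inlining decision depends on the compiler's cost model.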

Oscar_Smith · November 22, 2022, 8:38pm

Good find! Julia’s cost model really shouldn’t be trying to inline this.

Oscar_Smith · November 22, 2022, 8:56pm

Pull Request #47671 in JuliaLang/julia ("improve inlining cost analysis for invoke", by oscardssmith) fixes it.

mateuszbaran · November 22, 2022, 9:00pm

Thanks, that’s a very quick response and a PR.
