Pair{Symbol, Function} slower than equivalent non-parametric type

Consider a list of symbols and functions:

julia> macro big_closure_list(T, n)
           esc(:($T[$([:($T(:x, ()->$(Symbol(:v, i)))) for i in 1:n]...)]))
       end

julia> @macroexpand @big_closure_list(Pair, 2)
:(Pair[Pair(:x, (()->(v1))),
       Pair(:x, (()->(v2)))])

The performance of this expression depends a lot on the type I put in for T:

julia> @time @big_closure_list(Pair, 20);
  0.161013 seconds (62.34 k allocations: 3.871 MiB, 98.59% compilation time)

julia> @time @big_closure_list(Pair{Symbol, Function}, 20);
  0.082167 seconds (27.81 k allocations: 1.701 MiB, 97.93% compilation time)

julia> struct Foo
           a::Symbol
           b::Function
       end

julia> @time @big_closure_list(Foo, 20);
  0.000166 seconds (387 allocations: 26.516 KiB)

(all times are from a second run, for JIT-warm-up)

I can sorta-understand the first one being slower, with the type being abstract… Although I’d love to hear a detailed explanation. But the difference between Pair{Symbol, Function} and Foo is surprising to me. Aren’t these two supposed to be entirely equivalent? Furthermore, MyPair is fast!

julia> struct MyPair{A,B}
           a::A
           b::B
       end

julia> @time @big_closure_list(MyPair{Symbol, Function}, 20);
  0.000163 seconds (387 allocations: 26.516 KiB)

What does Pair do to cause that slow performance? It seems to be the inner constructor.

julia> struct MyPair2{A, B}
           a::A
           b::B
           MyPair2{A, B}(a::A, b::B) where {A, B} = new(a, b)
       end

julia> @time @big_closure_list(MyPair2{Symbol, Function}, 20);
  0.084144 seconds (28.68 k allocations: 1.729 MiB, 98.69% compilation time)

Shouldn’t that constructor be equivalent to the automatic one from MyPair?
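One observable difference, sketched here as an aside: per the Julia manual, the automatic inner constructor of a parametric struct takes untyped arguments and converts them to the field types, whereas a constructor written as `(a::A, b::B)` only matches arguments that already have those types. The names `AutoPair` and `StrictPair` are hypothetical stand-ins for `MyPair` and `MyPair2`. This only illustrates that the two signatures differ; it is not a confirmed explanation of the timing gap.

```julia
# Hypothetical stand-in for MyPair: relies on the automatic constructors.
struct AutoPair{A, B}
    a::A
    b::B
end

# Hypothetical stand-in for MyPair2: a hand-written inner constructor
# whose arguments must already be of types A and B.
struct StrictPair{A, B}
    a::A
    b::B
    StrictPair{A, B}(a::A, b::B) where {A, B} = new{A, B}(a, b)
end

# The automatic constructor converts its arguments to the field types.
p = AutoPair{Float64, Int}(1, 2.0)
println(p)

# The hand-written constructor has no method for these argument types.
strict_error = try
    StrictPair{Float64, Int}(1, 2.0)
    nothing
catch err
    err
end
println(typeof(strict_error))

# With a closure and B = Function, both constructors accept the call.
g = () -> 1
println(AutoPair{Symbol, Function}(:x, g).b === StrictPair{Symbol, Function}(:x, g).b)
```

So the two constructors accept the same arguments in the benchmark above, but they are distinct methods with different signatures, which leaves room for the compiler to treat them differently.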

Are you sure? The slow examples are reporting 98% compile time.

Maybe the compiler is specialising differently? Maybe use @nospecialize in the inner constructor?
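A minimal sketch of what that suggestion could look like, assuming a hypothetical `MyPair3` (the name is not from the thread). `@nospecialize` on the arguments asks the compiler not to specialize the method on their concrete types; whether this actually removes the compile-time gap is untested here.

```julia
# Hypothetical variant of MyPair2 with unspecialized constructor arguments.
struct MyPair3{A, B}
    a::A
    b::B
    # The arguments are untyped and marked @nospecialize, so one compiled
    # method can serve every closure type. `new` converts them to A and B.
    function MyPair3{A, B}(@nospecialize(a), @nospecialize(b)) where {A, B}
        return new{A, B}(a, b)
    end
end

v1 = 1
p3 = MyPair3{Symbol, Function}(:x, () -> v1)
println(p3.a)
println(p3.b())
```

It can be timed the same way as the other types, for example with `@time @big_closure_list(MyPair3{Symbol, Function}, 20);`.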

Yes, I’m sure. Obviously, even with everything else warmed up, there is compilation involved because this code creates brand new closures. The question is: where is the compilation time coming from that makes Pair so much worse than MyPair?

Incidentally, putting the code in a function and calling the function is faster!

julia> @time begin
       f() = @big_closure_list(MyPair2{Symbol, Function}, 20);
       f()
       end
  0.029103 seconds (70.70 k allocations: 4.129 MiB, 99.29% compilation time)

julia> @time @big_closure_list(MyPair2{Symbol, Function}, 20);
  0.082526 seconds (28.67 k allocations: 1.746 MiB, 98.67% compilation time)

julia> Pair |> isconcretetype
false

julia> Pair{Symbol,Function} |> isconcretetype
true

Pair is a UnionAll, meaning the vector produced by @big_closure_list can hold any Pair, not just Pair{Symbol,Function}. The compiler then doesn’t know the size of each element and has to box and allocate each one separately. This is not the case for Pair{Symbol,Function} and Foo. In addition, the compiler is probably running inference again and again for Pair{Symbol,Function}, since Function is an abstract type and there’s a chance (though we know from reading the code that there isn’t) that the type of the resulting vector may be different. It could become a Union of two or more different Pairs as well, no? This is never the case for Foo, which has no type parameters at all. There may also be some invalidations going on, but I admittedly haven’t checked that.
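The type relationships mentioned above can be checked directly. This is a small sketch using only Base functions; the variable names `f1`, `f2`, `loose`, and `tight` are made up for illustration, and it shows the types involved rather than proving where the compile time goes.

```julia
# Pair without parameters is a UnionAll and therefore not concrete.
println(Pair isa UnionAll)
println(isconcretetype(Pair))
println(isconcretetype(Pair{Symbol, Function}))

# Function itself is abstract, and every closure gets its own concrete type.
f1 = () -> 1
f2 = () -> 2
println(isconcretetype(Function))
println(typeof(f1) == typeof(f2))

# In a Vector{Pair}, each element keeps its own Pair{Symbol, <closure type>}.
loose = Pair[Pair(:x, f1), Pair(:x, f2)]
println(typeof(loose[1]) == typeof(loose[2]))

# With explicit parameters, every element has the same type.
tight = Pair{Symbol, Function}[Pair{Symbol, Function}(:x, f1),
                               Pair{Symbol, Function}(:x, f2)]
println(typeof(tight[1]) == typeof(tight[2]))
```

With the bare `Pair`, every closure therefore produces a distinct `Pair{Symbol, ...}` type, which plausibly means a fresh constructor specialization per element.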

May I ask how you encountered this?