I used that trick to prevent constant propagation/folding (as suggested at the bottom of this doc page). I’m not sure of the consequences, but I don’t think that’s the problem here: I don’t dereference when using Chairmarks and still get the same results. And with BenchmarkTools, the dereference is made in all cases, so it can’t explain the different behaviors?
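For concreteness, this is the trick in question, following the pattern from the BenchmarkTools manual (a sketch; sin and 0.5 stand in for the actual function and argument):

using BenchmarkTools

x = 0.5
@btime sin($x)           # interpolated value may be treated as a constant and folded
@btime sin($(Ref(x))[])  # Ref + dereference hides the value from the compiler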
In the case of type values or values of a singleton type, though, you most probably do want type inference/constant propagation/folding. The idea is that a benchmark needs to be representative of “production” code.
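To illustrate with a sketch (sizeof is just a stand-in): if production code hard-codes the type, benchmarking it as a constant is the representative measurement:

using BenchmarkTools

@btime sizeof($Int)            # the type is a compile-time constant here; can fold to 8
@btime sizeof($(Ref(Int))[])   # hides the type, measuring the generic path instead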
I commented again back on that thread, but suffice it to say that type assertions may patch up type instabilities but don’t necessarily eliminate performance costs, especially since the cost here seems to have more to do with tuple size than with type inference.
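A minimal sketch of that distinction (hypothetical functions, not the thread’s test1/test2):

# Indexing a heterogeneous tuple with a runtime index infers as a Union or Any:
unstable(tup::Tuple, i::Int) = tup[i]

# The assertion narrows the *inferred* return type to T, but the dynamic tuple
# access still happens at run time, and its cost grows with the tuple's length:
patched(tup::Tuple, i::Int, ::Type{T}) where T = tup[i]::T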
I don’t think there are consequences generally, but type and T are indeed different. T is a method’s static parameter, not a local variable:
julia> function foo(type::Type{T}) where T
           T = Ref{type}
       end
ERROR: syntax: local variable name "T" conflicts with a static parameter
Stacktrace:
 [1] top-level scope
   @ REPL[79]:1

julia> function foo(type::Type{T}) where T
           type = Ref{type}
       end
foo (generic function with 1 method)
The inability to reassign static parameters makes them much easier to treat as compile-time constants.
Agreed, so it all depends on what you want to test. Here I’m interested in the performance of a function that receives as a parameter a type that is known at execution time but not at compile time…
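Something like this sketch, I mean (the Dict and the strings are hypothetical):

const TYPES = Dict("int" => Int, "float" => Float64)

# The concrete type only emerges from runtime data, so the compiler cannot
# specialize the downstream call ahead of time:
type = TYPES[rand(Bool) ? "int" : "float"]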
Not better, but I’d expect it to be the same, as @code_warntype recognizes type as a compile-time constant, and @code_llvm and @code_native report matching code. Those reflection macros are known to specialize on arguments they shouldn’t, but I’m uncertain whether that’s the problem here. The methods(...).specializations also list the same call signatures.
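For reference, these are the diagnostics I compared (test1 and its arguments as defined earlier in the thread; the reflection macros build the signature from the concrete runtime arguments, which is why type shows up as a constant):

using InteractiveUtils  # auto-loaded in the REPL, needed in scripts

@code_warntype test1(tup, idx, type)   # `type` appears as a Core.Const
@code_llvm test1(tup, idx, type)
@code_native test1(tup, idx, type)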
Maybe this is a benchmarking artifact that Chairmarks and BenchmarkTools both happen to share? This is what happens when I just @time a big loop:
julia> let tup=tup, idx=idx, type=type
           @time for _ in 1:10^7
               test1(tup, idx, type)
           end
       end
  4.690596 seconds (10.00 M allocations: 305.176 MiB, 0.24% gc time)

julia> let tup=tup, idx=idx, type=type
           @time for _ in 1:10^7
               test2(tup, idx, type)
           end
       end
  4.660442 seconds (10.00 M allocations: 305.176 MiB, 0.14% gc time)
Adding @noinline doesn’t make a significant change to the runtime and doesn’t change the allocation report. As you can see, there are no order-of-magnitude differences, and the per-call average is different from what the benchmarking tools reported: ~470 ns, 1 allocation, 32 bytes.
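For reference, this is the call-site form I used (available since Julia 1.8), which avoids redefining test1:

let tup=tup, idx=idx, type=type
    @time for _ in 1:10^7
        @noinline test1(tup, idx, type)   # prevents inlining into the timed loop
    end
end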
FTR, if the parameter isn’t known at compile time, it’s often best for performance not to specialize on it, so as to prevent run-time dispatch.
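A sketch of that with @nospecialize (lookup is hypothetical, not the thread’s code):

# One compiled body serves every type argument, instead of a fresh
# specialization being compiled for each runtime type:
function lookup(tup::Tuple, idx::Int, @nospecialize(T::Type))
    return tup[idx]::T
end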
Also, it’s not clear whether you really want a type assertion at every step of the recursion instead of just at the end.
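The two placements, as a sketch over a hypothetical recursive lookup:

# Assertion applied at every recursive step:
each_step(tup::Tuple, i::Int, ::Type{T}) where T =
    (i == 1 ? first(tup) : each_step(Base.tail(tup), i - 1, T))::T

# Assertion applied once, at the base case reached at the end:
at_end(tup::Tuple, i::Int, ::Type{T}) where T =
    i == 1 ? first(tup)::T : at_end(Base.tail(tup), i - 1, T)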
Good point, I think I was confused when talking about “compile time”. As I understand it, doing f(::T) where T requires specialization, so the type cannot be “unknown at compile time”. Instead, compilation is delayed until run time, triggered by run-time dispatch.
In the example, the code path with the assertion is only used at the end of the recursion.