I used that trick to prevent constant propagation/folding (as suggested at the bottom of this doc page). I’m not sure of the consequences, but I don’t think that’s the problem here: I don’t dereference when using Chairmarks and still get the same results. And with BenchmarkTools, the dereference is made in all cases, so it can’t explain the different behaviors?
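For concreteness, this is the trick in question, following the pattern from the BenchmarkTools manual (a sketch; sin and 0.5 stand in for the actual function and argument):

using BenchmarkTools

x = 0.5
@btime sin($x)           # interpolated value may be treated as a constant and folded
@btime sin($(Ref(x))[])  # Ref + dereference hides the value from the compiler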
In the case of type values or values of a singleton type, though, you most probably do want type inference/constant propagation/folding. The idea is that a benchmark needs to be representative of “production” code.
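To illustrate with a sketch (sizeof is just a stand-in): if production code hard-codes the type, benchmarking it as a constant is the representative measurement:

using BenchmarkTools

@btime sizeof($Int)            # the type is a compile-time constant here; can fold to 8
@btime sizeof($(Ref(Int))[])   # hides the type, measuring the generic path instead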
I commented again back on that thread, but suffice it to say that type assertions may patch up type instabilities but don’t necessarily eliminate performance costs, especially since the cost here seems to have more to do with tuple size than with type inference.
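A minimal sketch of that distinction (hypothetical functions, not the thread’s test1/test2):

# Indexing a heterogeneous tuple with a runtime index infers as a Union or Any:
unstable(tup::Tuple, i::Int) = tup[i]

# The assertion narrows the *inferred* return type to T, but the dynamic tuple
# access still happens at run time, and its cost grows with the tuple's length:
patched(tup::Tuple, i::Int, ::Type{T}) where T = tup[i]::T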
I don’t think there are consequences generally, but type and T are indeed different. T is a method’s static parameter, not a local variable:
julia> function foo(type::Type{T}) where T
           T = Ref{type}
       end
ERROR: syntax: local variable name "T" conflicts with a static parameter
Stacktrace:
 [1] top-level scope
   @ REPL[79]:1

julia> function foo(type::Type{T}) where T
           type = Ref{type}
       end
foo (generic function with 1 method)
The inability to reassign static parameters makes them much easier to treat as compile-time constants.
Agreed, so it all depends on what you want to test. Here I’m interested in the performance of a function that receives as a parameter a type that is known at execution time but not at compile time…
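Something like this sketch, I mean (the Dict and the strings are hypothetical):

const TYPES = Dict("int" => Int, "float" => Float64)

# The concrete type only emerges from runtime data, so the compiler cannot
# specialize the downstream call ahead of time:
type = TYPES[rand(Bool) ? "int" : "float"]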
Not better, but I’d expect it to be the same, as @code_warntype recognizes type as a compile-time constant, and @code_llvm and @code_native report matching code. Those reflection macros are known to specialize on arguments they shouldn’t, but I’m uncertain whether that’s the problem here. The methods(...).specializations also list the same call signatures.
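For reference, these are the diagnostics I compared (test1 and its arguments as defined earlier in the thread; the reflection macros build the signature from the concrete runtime arguments, which is why type shows up as a constant):

using InteractiveUtils  # auto-loaded in the REPL, needed in scripts

@code_warntype test1(tup, idx, type)   # `type` appears as a Core.Const
@code_llvm test1(tup, idx, type)
@code_native test1(tup, idx, type)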
Maybe this is a benchmarking artifact that Chairmarks and BenchmarkTools both happen to share? This is what happens when I just @time a big loop:
julia> let tup=tup, idx=idx, type=type
           @time for _ in 1:10^7
               test1(tup, idx, type)
           end
       end
  4.690596 seconds (10.00 M allocations: 305.176 MiB, 0.24% gc time)

julia> let tup=tup, idx=idx, type=type
           @time for _ in 1:10^7
               test2(tup, idx, type)
           end
       end
  4.660442 seconds (10.00 M allocations: 305.176 MiB, 0.14% gc time)
Adding @noinline doesn’t make a significant change to the runtime and doesn’t change the allocation report. As you can see, there are no order-of-magnitude differences, and the per-call average is different from what the benchmarking tools reported: ~470 ns, 1 allocation, 32 bytes.
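For reference, this is the call-site form I used (available since Julia 1.8), which avoids redefining test1:

let tup=tup, idx=idx, type=type
    @time for _ in 1:10^7
        @noinline test1(tup, idx, type)   # prevents inlining into the timed loop
    end
end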
FTR, if the parameter isn’t known at compile time, it’s often best for performance not to specialize on it, so as to prevent run-time dispatch.
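A sketch of that with @nospecialize (lookup is hypothetical, not the thread’s code):

# One compiled body serves every type argument, instead of a fresh
# specialization being compiled for each runtime type:
function lookup(tup::Tuple, idx::Int, @nospecialize(T::Type))
    return tup[idx]::T
end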
Also, it’s not clear whether you really want a type assertion at every step of the recursion instead of just at the end.
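The two placements, as a sketch over a hypothetical recursive lookup:

# Assertion applied at every recursive step:
each_step(tup::Tuple, i::Int, ::Type{T}) where T =
    (i == 1 ? first(tup) : each_step(Base.tail(tup), i - 1, T))::T

# Assertion applied once, at the base case reached at the end:
at_end(tup::Tuple, i::Int, ::Type{T}) where T =
    i == 1 ? first(tup)::T : at_end(Base.tail(tup), i - 1, T)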
Good point, I think I was confused when talking about “compile time”. As I understand it, doing f(::T) where T requires specialization, so the type cannot be “unknown at compile time”. Instead, compilation is delayed until run time, triggered by run-time dispatch.
In the example, the code path with the assertion is only used at the end of the recursion.