I understand that Vector{<:Float64} is not the same as Vector{Float64}, as it allows the bottom type Union{} to be its element type and is (therefore) not a concrete type:
julia> Vector{Union{}} <: Vector{<:Float64}
true
julia> isconcretetype(Vector{<:Float64})
false
However, in practice, the user cannot construct an instance of Union{}.
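One caveat, as a small sketch: although no value of type Union{} can exist, an empty array whose element type is Union{} can still be constructed, and it is an instance of Vector{<:Float64} without being a Vector{Float64}. The variable name `e` is only for illustration.

```julia
# No value has type Union{}, but an *empty* vector with that element type is fine.
e = Vector{Union{}}(undef, 0)

@show e isa Vector{<:Float64}   # e matches the UnionAll ...
@show e isa Vector{Float64}     # ... but is not a Vector{Float64}
@show isempty(e)
```

So, at least in principle, a slot typed Vector{<:Float64} can hold values of two different concrete types.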
Thus, I would expect that for any composite type MyT{T} with a concrete type T, the UnionAll MyT{<:T} should add no overhead compared to MyT{T}. For example, Vector{<:Float64} should be as efficient as Vector{Float64}.
Unfortunately, this is not the case as of Julia 1.11.1:
julia> v1 = Vector{Float64}[rand(100) for _ in 1:10];
julia> v2 = Vector{<:Float64}[rand(100) for _ in 1:10];
julia> @benchmark mapreduce(sum, +, $v1)
BenchmarkTools.Trial: 10000 samples with 987 evaluations.
 Range (min … max):  51.570 ns … 102.128 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     52.280 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   53.195 ns ±   3.458 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  51.6 ns      Histogram: log(frequency) by time      70.9 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.
julia> @benchmark mapreduce(sum, +, $v2)
BenchmarkTools.Trial: 10000 samples with 858 evaluations.
 Range (min … max):  138.811 ns … 620.629 ns  ┊ GC (min … max): 0.00% … 57.99%
 Time  (median):     140.909 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   154.184 ns ±  38.408 ns  ┊ GC (mean ± σ):  2.36% ±  7.44%

  139 ns       Histogram: log(frequency) by time       340 ns <

 Memory estimate: 160 bytes, allocs estimate: 10.
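To make the difference between the two containers concrete, here is a minimal sketch that inspects their element types and tries a type assertion inside the mapped function. `sum_asserted` is a hypothetical helper name, not something from Base, and whether the assertion actually removes the allocations is an assumption that would need to be confirmed with @benchmark.

```julia
v1 = Vector{Float64}[rand(100) for _ in 1:10]
v2 = Vector{<:Float64}[rand(100) for _ in 1:10]

# The containers differ only in their declared element type.
@show isconcretetype(eltype(v1))   # eltype is Vector{Float64}
@show isconcretetype(eltype(v2))   # eltype is the UnionAll Vector{<:Float64}

# Every array actually stored in v2 still has the concrete type Vector{Float64}.
@show all(x -> typeof(x) === Vector{Float64}, v2)

# Hypothetical helper: assert the concrete type before calling sum, so the
# compiler knows which method of sum to call for each element.
sum_asserted(vs) = mapreduce(x -> sum(x::Vector{Float64}), +, vs)

@show sum_asserted(v2) ≈ mapreduce(sum, +, v2)
```

My guess is that the 10 allocations come from one dynamically dispatched sum call per element, but I have not verified this.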
Can this performance degradation be fixed as the compiler gets more optimized/“smarter,” or are there fundamental reasons that prevent it from happening? Thanks!