Can the overhead of `MyT{<:T}` compared to `MyT{T}`, where `T` is a concrete type, be avoided?

I understand that Vector{<:Float64} is not the same as Vector{Float64}, because it allows the bottom type Union{} as its element type and is therefore not a concrete type:

julia> Vector{Union{}} <: Vector{<:Float64}
true

julia> isconcretetype(Vector{<:Float64})
false

However, in practice, the user cannot construct an instance of Union{}.
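(A quick REPL check, just to illustrate that no value is an instance of the bottom type:)

julia> 1.0 isa Union{}
false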

Thus, I would imagine that for a composite type MyT and a concrete type T, the UnionAll MyT{<:T} should add no overhead compared to MyT{T}. For example, Vector{<:Float64} should be as efficient as Vector{Float64}.

Unfortunately, this is not the case as of Julia 1.11.1:

julia> v1 = Vector{Float64}[rand(100) for _ in 1:10];

julia> v2 = Vector{<:Float64}[rand(100) for _ in 1:10];

julia> @benchmark mapreduce(sum, +, $v1)
BenchmarkTools.Trial: 10000 samples with 987 evaluations.
 Range (min … max):  51.570 ns … 102.128 ns  β”Š GC (min … max): 0.00% … 0.00%
 Time  (median):     52.280 ns               β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   53.195 ns Β±   3.458 ns  β”Š GC (mean Β± Οƒ):  0.00% Β± 0.00%

   β–ˆβ–ˆ       ▁▁                                                 ▁
  β–ˆβ–ˆβ–ˆβ–‡β–‡β–‡β–‡β–‡β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–‡β–‡β–‡β–†β–‡β–†β–†β–‡β–‡β–‡β–†β–†β–†β–‡β–„β–†β–…β–…β–†β–†β–…β–†β–…β–„β–„β–„β–„β–„β–…β–†β–†β–…β–†β–…β–†β–…β–…β–…β–…β–…β–†β–…β–…β–†β–…β–‡ β–ˆ
  51.6 ns       Histogram: log(frequency) by time      70.9 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark mapreduce(sum, +, $v2)
BenchmarkTools.Trial: 10000 samples with 858 evaluations.
 Range (min … max):  138.811 ns … 620.629 ns  β”Š GC (min … max): 0.00% … 57.99%
 Time  (median):     140.909 ns               β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   154.184 ns Β±  38.408 ns  β”Š GC (mean Β± Οƒ):  2.36% Β±  7.44%

  β–ˆβ–…β–„β–ƒβ–‚β–‚β–β–     ▁                  β–‚β–‚                            ▁
  β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–ˆβ–‡β–‡β–‡β–‡β–‡β–‡β–‡β–ˆβ–ˆβ–‡β–†β–†β–…β–ˆβ–ˆβ–ˆβ–ˆβ–‡β–ƒβ–…β–ƒβ–β–β–ƒβ–β–„β–β–β–„β–β–β–ƒβ–β–β–β–β–„β–†β–†β–†β–‡β–†β–…β–… β–ˆ
  139 ns        Histogram: log(frequency) by time        340 ns <

 Memory estimate: 160 bytes, allocs estimate: 10.

Can this performance degradation be fixed as the compiler gets more optimized/β€œsmarter,” or are there some fundamental reasons that prevent it from happening? Thanks!


That’s true, but you can construct a Vector{Union{}}:

julia> v = Union{}[]
Union{}[]

julia> typeof(v)
Vector{Union{}} (alias for Array{Union{}, 1})

julia> v isa Vector{<:Float64}
true

julia> Vector{<:Float64}[v, [1.0, 2.0]]
2-element Vector{Vector{<:Float64}}:
 Union{}[]
 [1.0, 2.0]

I’m curious: do you have a realistic setting where you would manipulate a Vector{<:Float64} that is not a Vector{Float64}? It seems to me that this overhead will never be faced in practice (@matthias314’s example is rather contrived).

Worse, such vectors can have non-zero length, e.g. Vector{Union{}}(undef, 3), despite not containing any instances; isempty also returns false for them.
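For example (a minimal REPL sketch; the slots exist but are all undefined references, so accessing one errors):

julia> v = Vector{Union{}}(undef, 3);

julia> length(v), isempty(v)
(3, false)

julia> v[1]
ERROR: UndefRefError: access to undefined reference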

T where T<:Float64 and Union{Union{}, Float64} do simplify to Float64, so Vector{T where T<:Float64} simplifies to Vector{Float64} even though Vector{T} where T<:Float64 does not.
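For illustration, here is how these simplifications could be checked in the REPL (the expected results follow from the rules stated above):

julia> (T where T<:Float64) === Float64
true

julia> Union{Union{}, Float64} === Float64
true

julia> Vector{T where T<:Float64} === Vector{Float64}
true

julia> (Vector{T} where T<:Float64) === Vector{Float64}
false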