Vectors with elements of same type but different parametric type

I need to work with a set of points that might each have a different type. Consider as a rough example:

abstract type AbstractLabel end
struct LabelA <: AbstractLabel end
struct LabelB <: AbstractLabel end
struct LabelC <: AbstractLabel end
struct PointWithLabel{L<:AbstractLabel} 
    xy::Vector{Float64}
    label::Type{L}
    PointWithLabel(x::Float64, y::Float64, label::Type{L}) where {L} = new{L}([x, y], label)
    PointWithLabel(x, y, label) = PointWithLabel(Float64(x), Float64(y), label)
    PointWithLabel{L}(x, y) where {L} = PointWithLabel(x, y, L)
end
const PointA = PointWithLabel{LabelA}
const PointB = PointWithLabel{LabelB}
const PointC = PointWithLabel{LabelC}

I could then define for example the vector

p1 = PointA(0.5, 0.3)
p2 = PointB(0.17, -0.3)
p3 = PointC(0.0, 0.0)
p4 = PointB(0.2, 0.5)
vec1 = [p1, p2, p3, p4]
4-element Vector{PointWithLabel}:
 PointA([0.5, 0.3], LabelA)
 PointB([0.17, -0.3], LabelB)
 PointC([0.0, 0.0], LabelC)
 PointB([0.2, 0.5], LabelB)

Alternatively, I could define the vector as

vec2 = Vector{PointWithLabel{L} where L}([p1, p2, p3, p4])
4-element Vector{PointWithLabel{L} where L}:
 PointA([0.5, 0.3], LabelA)
 PointB([0.17, -0.3], LabelB)
 PointC([0.0, 0.0], LabelC)
 PointB([0.2, 0.5], LabelB)

Which of these vectors would be better to work with for performance? I’ll be e.g. solving differential equations with many thousands of these types of points, and using the labels in the type of the point to exploit multiple dispatch for defining new addition formulas (e.g. +(::PointA, ::PointB) and +(::PointC, ::PointA)). I initially thought vec1 would be better since the element type is not concrete, !isconcretetype(PointWithLabel), although vec2 requires me to define methods specific for the type PointWithLabel{L} where L. Alternatively, I could work with a Vector{Union{PointA, PointB, PointC}}, but I remember from somewhere (I think) that unions are not always performant.

In practice, I believe vec1 and vec2 are the same: they’re essentially just vectors of pointers to your Point{A|B|C} off in the heap. On the other hand, a vector of type Union{PointA, PointB, PointC} can store your types inline, at least if it is an isbits type.

But let us benchmark using BenchmarkTools:

# Vector of concrete type
pointAs = [PointA(xy...) for xy in eachcol(rand(2, 1_000_000))];
typeof(pointAs) == Vector{PointA}
@btime foreach(p -> p.label, pointAs)
# 241.399 ns (0 allocations: 0 bytes)

# Vector of non-concrete type
pointABCs = [rand([PointA, PointB, PointC])(xy...) for xy in eachcol(rand(2, 1_000_000))];
typeof(pointABCs) == Vector{PointWithLabel}
@btime foreach(p -> p.label, pointABCs)
# 225.922 μs (0 allocations: 0 bytes)

# Vector of union of concrete types
unionPoints = Array{Union{PointA, PointB, PointC}}(undef, 1_000_000)
unionPoints .= pointABCs
@btime foreach(p -> p.label, unionPoints)
# 5.742 ms (0 allocations: 0 bytes)

These results are quite surprising to me! Since your type isn’t isbits, I would have expected these to perform all about the same, with some small overhead involved handling types with second and third examples. Clearly, that overhead is large, since the second benchark x1000 slower than the first. And I have no explanation for the further 10x decrease in performance observed for the third benchmark.

EDIT: Switch benchmarks from map to foreach, since map is dominated by allocations.

1 Like

I see a large improvement if I convert your type to isbits. Note that you do not need to store the type as a field for your struct, since it’s already stored in the type itself. Additionally, if xy is guaranteed to be just 2 Float64 in size, then a vector is a poor choice, and I think an SVector from StaticArrays package would be better to avoid the tiny heap allocation In this example I just use a tuple:

struct PointWithLabel{L<:AbstractLabel}
    xy::Tuple{Float64, Float64}
end

function getlabel(::PointWithLabel{L}) where L
    return L
end

Then benchmarking as before:

@btime foreach(getlabel, pointAs);
# 10.590 ns (0 allocations: 0 bytes)

@btime foreach(getlabel, pointABCs);
# 28.406 ms (0 allocations: 0 bytes)

@btime foreach(getlabel, unionPoints)
# 14.096 μs (0 allocations: 0 bytes)

In the case of the first and third benchmarks, we are accessing values stored directly inline to the array. And so these are much faster than before. Clearly, however, since the 3rd example is still 1000x slower than the first, there’s a lot of overhead involved with resolution of types in unions. Meanwhile, the middle example is stored as an array of pointers, and that pointer indirection is clearly very expensive.

1 Like

Thanks for these benchmarks! Very interesting. For your replies, should “ns” be “μs”? I assume so since you say the results are very similar. I get similar results too those:

686.300 μs
410.500 μs
6.434 ms

(in order of your first reply). Very interesting.

For your last reply: I think I’m going to have to do a lot of mutation on the elements of xy, so I don’t think I could treat it as immutable even though indeed all vectors are two-vectors. e.g. I’d be doing a lot of xy .+= something else. Unless you had any suggestions there?

You won’t be able to mutate, but if the overall struct remains relatively small, you can trivially create a new struct with the updated internal values. For example, with StaticArrays, even though they are immutable they still achieve the same semantics:

a = SVector{Float64, 4}(1, 2, 3, 4)
b = SVector{Float64, 4}(1, 2, 3, 4)
a += b # => a = a + b

In the last line, however, a isn’t mutated it is reassigned.

1 Like

Thanks! I’ll look into that approach. I also just found out about MVectors which I’ll look into. Appreciate your help.