Is operating on a vector with an abstract element type really slow?

using BenchmarkTools
abstract type SuperType end
mutable struct TypeA <: SuperType
    a::Int64
    b::Int64
end
mutable struct TypeB <: SuperType
    a::Int64
    b::Float64
end

function Iter(objs::Vector{SuperType})
    for _ in 1:1000
        if rand() <= 0.5
            push!(objs,TypeA(1,2))
        else
            push!(objs,TypeB(2,2.0))
        end
    end
end
    
function Iter(objs::Vector{TypeA})
    for _ in 1:1000
        if rand() <= 0.5
            push!(objs,TypeA(1,2))
        else
            push!(objs,TypeA(2,2))
        end
    end
end

I defined an abstract type with two subtypes, created two vectors, and tested the speed of push!().

abs = Vector{SuperType}()
@benchmark Iter(abs)

output:

BenchmarkTools.Trial: 
  memory estimate:  31.25 KiB
  allocs estimate:  1000
  --------------
  minimum time:     19.018 μs (0.00% GC)
  median time:      20.635 μs (0.00% GC)
  mean time:        904.554 μs (89.30% GC)
  maximum time:     2.039 s (100.00% GC)
  --------------
  samples:          5639
  evals/sample:     1

and

as = Vector{TypeA}()
@benchmark Iter(as)

output:

BenchmarkTools.Trial: 
  memory estimate:  31.25 KiB
  allocs estimate:  1000
  --------------
  minimum time:     19.122 μs (0.00% GC)
  median time:      20.183 μs (0.00% GC)
  mean time:        1.152 ms (93.90% GC)
  maximum time:     2.415 s (100.00% GC)
  --------------
  samples:          4986
  evals/sample:     1

It's quite weird that operating on the abstract-type vector is not slower than operating on the concrete one. Am I doing anything wrong?

Is that because, no matter what the vector's element type is, what gets stored is always a pointer?

Do the timings change if the structs are not mutable?

My best guess is that this is inlining and Union splitting. The first vector is probably being stored as roughly the equivalent of a C union. That way, each item would just be 16 bytes plus a bit to record which type it is.

Maybe that is true. Is there any way to check how the array stores its elements? As pointers, or as the values themselves?

Here are the results after changing the struct types to immutable.

abs = Vector{SuperType}()
@benchmark Iter(abs)
BenchmarkTools.Trial: 
  memory estimate:  31.25 KiB
  allocs estimate:  1000
  --------------
  minimum time:     30.927 μs (0.00% GC)
  median time:      36.210 μs (0.00% GC)
  mean time:        1.405 ms (93.30% GC)
  maximum time:     3.320 s (100.00% GC)
  --------------
  samples:          5591
  evals/sample:     1
as = Vector{TypeA}()
@benchmark Iter(as)
BenchmarkTools.Trial: 
  memory estimate:  31.25 KiB
  allocs estimate:  1000
  --------------
  minimum time:     34.405 μs (0.00% GC)
  median time:      44.396 μs (0.00% GC)
  mean time:        1.034 ms (94.85% GC)
  maximum time:     2.630 s (100.00% GC)
  --------------
  samples:          5446
  evals/sample:     1

It doesn't seem to make much difference.

No, this has nothing to do with union splitting.

The main reason is that you aren't operating on the vector elements. You only store to the vector, so there's no slowdown from type instability.

When it's mutable, there's also no slowdown from allocation, since the pointer is stored in either case. When it's immutable, there's no slowdown from allocation, since the stored value is constant and the allocation is done at compile time.

I believe you should use Iter($abs) or Iter($as). I don't think it matters much in this case, though.
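As a minimal sketch of what the `$` interpolation looks like in practice, the snippet below benchmarks a hypothetical stand-in function, `append_many!`, which is not part of the original code. The `samples`, `evals`, and `setup` keywords are existing BenchmarkTools options; the second call, which starts each sample from an empty vector, is an optional variation for a function that grows its argument, not something suggested in the thread.

```julia
using BenchmarkTools

# Hypothetical stand-in for Iter: appends 1000 integers to `v`.
function append_many!(v::Vector{Int})
    for i in 1:1000
        push!(v, i)
    end
    return v
end

v = Int[]

# `$v` splices the value into the benchmark, so `v` is not looked up
# as an untyped global on every evaluation.
# `samples` and `evals` are capped here only to keep the run short.
@benchmark append_many!($v) samples=100 evals=1

# Optional variation: give every sample a fresh, empty vector so the
# timings do not depend on how large the vector has already grown.
@benchmark append_many!(w) setup=(w = Int[]) samples=100 evals=1
```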

No!

Arrays never store "variables". They always store references. However, if the eltype is a bits type, the reference is stored as the value itself instead of as a pointer.
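One way to probe this from the REPL is sketched below. The type names (`Shape`, `MutPoint`, `ImmPoint`) are made up for illustration. `isbitstype` and `sizeof` are standard Base functions; the size comparisons assume that a vector of non-bits elements holds one pointer-sized slot per element.

```julia
# Hypothetical types mirroring the thread's setup.
abstract type Shape end

mutable struct MutPoint <: Shape
    a::Int64
    b::Int64
end

struct ImmPoint <: Shape
    a::Int64
    b::Int64
end

# Only the immutable struct with plain-data fields is a bits type.
imm_is_bits   = isbitstype(ImmPoint)
mut_is_bits   = isbitstype(MutPoint)
shape_is_bits = isbitstype(Shape)

# Bits-type elements are laid out inline: 16 bytes per element here.
inline_vec = [ImmPoint(1, 2), ImmPoint(3, 4)]
inline_bytes = sizeof(inline_vec)

# Non-bits elements occupy one pointer-sized slot each.
boxed_vec = [MutPoint(1, 2), MutPoint(3, 4)]
boxed_bytes = sizeof(boxed_vec)

# An abstract eltype is also stored as pointer-sized slots.
abstract_vec = Shape[ImmPoint(1, 2), MutPoint(3, 4)]
abstract_bytes = sizeof(abstract_vec)
```

Comparing `inline_bytes` with `2 * sizeof(ImmPoint)` and the other two with `2 * sizeof(Ptr{Cvoid})` shows which layout each vector uses.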


Okay, thanks.
So, if arrays always store references, would operating on them in Julia be slower than in C++, where arrays store the objects themselves?

Can I understand that the mutable structs are stored as pointers and the immutable ones are stored as themselves?

It depends on what you do. Storing the value is not necessarily more efficient. FWIW, this is the whole reason you need to worry about copying in C++… Also, storing by reference does not mean storing a pointer.

No. Values whose eltype (or field type) is isbits are stored inline. This has nothing to do with the value, only with the declared field type.
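A small sketch of the "declared type, not value" point, using two hypothetical structs (`Tight` and `Loose`) that are not from the thread. Both hold the same integer at runtime, but only the one with a concrete declared field type counts as a bits type.

```julia
# Hypothetical structs: same runtime value, different declared field types.
struct Tight
    x::Int64   # concrete bits-type field
end

struct Loose
    x::Any     # abstract field type, so the field holds a reference
end

t = Tight(1)
l = Loose(1)   # the value is still an Int64

tight_type_is_bits = isbitstype(Tight)
loose_type_is_bits = isbitstype(Loose)

# The same distinction shows up on the values themselves.
tight_value_is_bits = isbits(t)
loose_value_is_bits = isbits(l)
```

Here the `Tight` checks come out true and the `Loose` checks come out false, even though `l.x` is an ordinary `Int64`.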


Hmm… It seems I have a lot to learn. I thought a pointer was the same thing as a reference. What's the difference, BTW? And where can I learn about this? I didn't actually major in computer science or programming.

Following your suggestion, I tested a function that actually operates on the elements:

function Operate(objs::Vector{T}) where {T <: SuperType}
    for obj in objs
        obj.a += 1
    end
end

The differences are significant:

@benchmark Operate(abs)
BenchmarkTools.Trial: 
  memory estimate:  54.69 KiB
  allocs estimate:  3500
  --------------
  minimum time:     105.313 μs (0.00% GC)
  median time:      109.868 μs (0.00% GC)
  mean time:        262.076 μs (10.10% GC)
  maximum time:     162.335 ms (99.92% GC)
  --------------
  samples:          10000
  evals/sample:     1

and

@benchmark Operate(as)
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     1.145 μs (0.00% GC)
  median time:      1.228 μs (0.00% GC)
  mean time:        2.595 μs (0.00% GC)
  maximum time:     1.403 ms (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     10

Reference here just means that it refers to the object, not a reference in the C++ sense.

A pointer is a less ambiguous, lower-level concept. C++ references are typically implemented as pointers.
