Static vectors of vectors slow?

first, I introduce some very similar ways to represent some boring data, with reasonable benchmarks:

julia> a=SVector{3, MVector{3,Int}}([MVector{3,Int}([1,1,1]),MVector{3,Int}([1,1,1]),MVector{3,Int}([1,1,1])])
3-element SVector{3, MVector{3, Int64}} with indices SOneTo(3):
 [1, 1, 1]
 [1, 1, 1]
 [1, 1, 1]

julia> struct mysvec
           one::MVector{3,Int}
           two::MVector{3,Int}
           thr::MVector{3,Int}
       end

julia> Base.:+(a::mysvec, b::mysvec) = a.one+b.one, a.two+b.two, a.thr+b.thr

julia> b=mysvec([1,1,1], [1,1,1], [1,1,1])
mysvec([1, 1, 1], [1, 1, 1], [1, 1, 1])

julia> c=[[1,1,1],[1,1,1],[1,1,1]]
3-element Vector{Vector{Int64}}:
 [1, 1, 1]
 [1, 1, 1]
 [1, 1, 1]

julia> @benchmark $a+$a
BenchmarkTools.Trial: 10000 samples with 998 evaluations per sample.
 Range (min … max):  14.578 ns …  11.243 ΞΌs  β”Š GC (min … max):  0.00% … 99.60%
 Time  (median):     29.597 ns               β”Š GC (median):     0.00%
 Time  (mean Β± Οƒ):   37.294 ns Β± 262.228 ns  β”Š GC (mean Β± Οƒ):  18.37% Β±  2.63%

     β–β–ˆβ–‡                                                        
  β–β–‚β–„β–ˆβ–ˆβ–ˆβ–‡β–„β–ƒβ–ƒβ–ƒβ–‚β–‚β–ƒβ–ƒβ–‚β–β–β–‚β–ƒβ–ƒβ–„β–…β–†β–…β–…β–ƒβ–ƒβ–‚β–‚β–‚β–β–‚β–β–‚β–‚β–ƒβ–„β–†β–†β–†β–†β–…β–„β–„β–ƒβ–ƒβ–ƒβ–ƒβ–ƒβ–ƒβ–ƒβ–‚β–‚β–‚β–‚β–β–‚β–β– β–ƒ
  14.6 ns         Histogram: frequency by time         53.1 ns <

 Memory estimate: 96 bytes, allocs estimate: 3.

julia> @benchmark $b+$b
BenchmarkTools.Trial: 10000 samples with 998 evaluations per sample.
 Range (min … max):  18.183 ns …  11.171 ΞΌs  β”Š GC (min … max):  0.00% … 99.42%
 Time  (median):     32.191 ns               β”Š GC (median):     0.00%
 Time  (mean Β± Οƒ):   40.879 ns Β± 265.808 ns  β”Š GC (mean Β± Οƒ):  18.01% Β±  2.81%

     β–„β–ˆβ–…                                                        
  β–β–…β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–†β–„β–…β–†β–…β–„β–‚β–‚β–‚β–ƒβ–„β–…β–ˆβ–ˆβ–ˆβ–†β–„β–ƒβ–ƒβ–‚β–‚β–‚β–‚β–‚β–‚β–ƒβ–…β–†β–‡β–†β–†β–†β–†β–…β–…β–…β–…β–„β–ƒβ–ƒβ–‚β–‚β–‚β–‚β–‚β–‚β–‚β–β–β–β–β–β– β–ƒ
  18.2 ns         Histogram: frequency by time         58.7 ns <

 Memory estimate: 96 bytes, allocs estimate: 3.

julia> @benchmark $c+$c
BenchmarkTools.Trial: 10000 samples with 984 evaluations per sample.
 Range (min … max):   50.980 ns …  13.368 ΞΌs  β”Š GC (min … max):  0.00% … 98.93%
 Time  (median):     113.817 ns               β”Š GC (median):     0.00%
 Time  (mean Β± Οƒ):   147.831 ns Β± 539.619 ns  β”Š GC (mean Β± Οƒ):  22.46% Β±  6.22%

                         β–β–„β–†β–ˆβ–†β–…β–„β–„β–ƒβ–                              
  β–‚β–β–‚β–‚β–‚β–ƒβ–„β–‡β–‡β–‡β–ˆβ–ˆβ–†β–…β–„β–„β–„β–…β–…β–…β–…β–†β–†β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–†β–…β–…β–…β–…β–…β–†β–…β–†β–†β–„β–„β–ƒβ–ƒβ–ƒβ–ƒβ–ƒβ–„β–ƒβ–ƒβ–‚β–‚β–‚β–ƒβ–ƒβ–‚ β–„
  51 ns            Histogram: frequency by time          190 ns <

 Memory estimate: 320 bytes, allocs estimate: 8.

however, if I release one condition, the static vector becomes slower than a normal vector:

julia> a=SVector{3, MVector}([MVector{3,Int}([1,1,1]),MVector{4,Int}([1,1,1,1]),MVector{3,Int}([1,1,1])])
3-element SVector{3, MVector} with indices SOneTo(3):
 [1, 1, 1]
 [1, 1, 1, 1]
 [1, 1, 1]

julia> struct mysvec2
           one::MVector{3,Int}
           two::MVector{4,Int}
           thr::MVector{3,Int}
       end

julia> Base.:+(a::mysvec2, b::mysvec2) = a.one+b.one, a.two+b.two, a.thr+b.thr

julia> b=mysvec2([1,1,1], [1,1,1,1], [1,1,1])
mysvec([1, 1, 1], [1, 1, 1, 1], [1, 1, 1])

julia> c=[[1,1,1],[1,1,1,1],[1,1,1]]
3-element Vector{Vector{Int64}}:
 [1, 1, 1]
 [1, 1, 1, 1]
 [1, 1, 1]

julia> @benchmark $a+$a
BenchmarkTools.Trial: 10000 samples with 200 evaluations per sample.
 Range (min … max):  362.120 ns …  64.566 ΞΌs  β”Š GC (min … max): 0.00% … 98.97%
 Time  (median):     450.567 ns               β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   477.896 ns Β± 882.369 ns  β”Š GC (mean Β± Οƒ):  2.60% Β±  1.40%

                 β–β–ˆβ–ˆβ–‡β–†β–…β–‚β–‚                                        
  β–‚β–‚β–β–β–β–‚β–‚β–‚β–β–‚β–‚β–‚β–‚β–ƒβ–…β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–‡β–‡β–‡β–‡β–‡β–†β–†β–…β–„β–„β–„β–„β–ƒβ–ƒβ–ƒβ–„β–ƒβ–ƒβ–ƒβ–ƒβ–ƒβ–‚β–‚β–‚β–ƒβ–‚β–‚β–‚β–ƒβ–ƒβ–‚β–‚β–‚β–‚β–‚β–‚ β–„
  362 ns           Histogram: frequency by time          609 ns <

 Memory estimate: 192 bytes, allocs estimate: 6.

julia> @benchmark $b+$b
BenchmarkTools.Trial: 10000 samples with 998 evaluations per sample.
 Range (min … max):  16.423 ns …  12.970 ΞΌs  β”Š GC (min … max):  0.00% … 99.50%
 Time  (median):     28.776 ns               β”Š GC (median):     0.00%
 Time  (mean Β± Οƒ):   41.415 ns Β± 315.749 ns  β”Š GC (mean Β± Οƒ):  21.19% Β±  2.81%

      β–†β–ˆβ–…     ▁▄▅▄▂                                             
  β–β–β–‚β–…β–ˆβ–ˆβ–ˆβ–…β–„β–†β–†β–†β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–†β–…β–ƒβ–ƒβ–‚β–‚β–‚β–‚β–‚β–‚β–‚β–β–‚β–‚β–‚β–‚β–ƒβ–„β–†β–†β–†β–„β–„β–„β–ƒβ–ƒβ–‚β–‚β–‚β–‚β–‚β–‚β–β–β–β–β–β–β–β–β–β–β– β–ƒ
  16.4 ns         Histogram: frequency by time           68 ns <

 Memory estimate: 112 bytes, allocs estimate: 3.

julia> @benchmark $c+$c
BenchmarkTools.Trial: 10000 samples with 983 evaluations per sample.
 Range (min … max):   57.537 ns …  15.545 ΞΌs  β”Š GC (min … max):  0.00% … 98.95%
 Time  (median):     109.860 ns               β”Š GC (median):     0.00%
 Time  (mean Β± Οƒ):   147.736 ns Β± 629.855 ns  β”Š GC (mean Β± Οƒ):  24.53% Β±  5.75%

                      β–β–„β–†β–ˆβ–ˆβ–‡β–…β–‚                                   
  β–β–β–β–β–‚β–‚β–ƒβ–…β–…β–…β–…β–„β–…β–„β–„β–…β–…β–…β–†β–‡β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–…β–ƒβ–ƒβ–ƒβ–‚β–‚β–‚β–‚β–‚β–‚β–‚β–‚β–‚β–‚β–‚β–‚β–‚β–‚β–ƒβ–‚β–‚β–‚β–‚β–‚β–β–‚β–β–‚β–‚β–β–β– β–ƒ
  57.5 ns          Histogram: frequency by time          193 ns <

 Memory estimate: 336 bytes, allocs estimate: 8.

I included the struct version to show that there is a theoretical gain to be made, but I would like to just treat it like a normal vector without having to define a bunch of struct operations which would almost certainly be less efficient than normal vector indexing/iterating methods. what is the correct way to make a SVector like this?

As far as I know you can’t. Since the elements are of different type, the SVector needs to package them as an abstract type (in this case MVector without parameters), and when indexing the code has to dynamically see what type was retrieved.

You could consider using an SVector{3, Vector{Int}} or a tuple of your MVectors instead, but it depends on how you’re going to use them whether it makes sense.

2 Likes

nice, a SVector{3, Vector{Int}} is a bit faster than a default vector; but it still loses to the struct. my use case is mostly adding them all together like in the benchmark I’m testing, or sometimes just a specific column like a[2].+=b[2].

julia> d=SVector{3, Vector{Int}}([[1,1,1],[1,1,1,1],[1,1,1]])                  
3-element SVector{3, Vector{Int64}} with indices SOneTo(3):                    
 [1, 1, 1]                                                                     
 [1, 1, 1, 1]                                                                  
 [1, 1, 1]                                                                     

julia> @benchmark $d+$d
BenchmarkTools.Trial: 10000 samples with 990 evaluations per sample.
 Range (min … max):   44.966 ns …  18.776 ΞΌs  β”Š GC (min … max):  0.00% … 99.46%
 Time  (median):      74.091 ns               β”Š GC (median):     0.00%
 Time  (mean Β± Οƒ):   105.874 ns Β± 557.036 ns  β”Š GC (mean Β± Οƒ):  25.05% Β±  4.85%

          β–β–ƒβ–ƒβ–‚β–‚β–‚β–‚β–‚β–†β–‡β–ˆβ–†β–ƒ                                           
  β–β–β–β–‚β–‚β–ƒβ–†β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–†β–„β–ƒβ–‚β–‚β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–‚β–‚β–ƒβ–ƒβ–ƒβ–ƒβ–‚β–‚β–‚β–‚β–‚β–‚β–‚β–‚β–‚β–‚β–‚β–β– β–ƒ
  45 ns            Histogram: frequency by time          150 ns <

 Memory estimate: 256 bytes, allocs estimate: 6.

You could use SmallVector or MutableSmallVector from SmallCollections.jl. They are like SVector (= FixedVector in SmallCollections.jl), but with a length up to some predefined capacity.

julia> using SmallCollections, Chairmarks

julia> b = mysvec2([1,1,1], [1,1,1,1], [1,1,1]);
julia> c = FixedVector{3,SmallVector{4,Int}}([[1,1,1], [1,1,1,1], [1,1,1]]);
julia> d = FixedVector{3,MutableSmallVector{4,Int}}([[1,1,1], [1,1,1,1], [1,1,1]]);

julia> @b $b+$b, $c+$c, $d+$d
(29.692 ns (3 allocs: 112 bytes), 5.175 ns, 5.991 ns)

By default, the sum of two MutableSmallVectors is a SmallVector (immutable). That’s why there are no allocations. You could change this manually:

julia> @b map(MutableSmallVector, $d+$d)
24.289 ns (3 allocs: 144 bytes)

EDIT: If you modify whole vectors, but not individual entries, then something like MutableFixedVector{3,SmallVector{4,Int}} might be the better choice. Also, if you can get by with Int8 or Int16 instead of Int, then this would be faster if your vectors get longer.

1 Like