Performant-wise, what is the best way to define (many) local arrays?

Here you’ve allocated three temporary Vector objects.

            A = [e1 ;; e2 ;; e3]

Then here, as a separate step, you concatenate them into a fourth array, which also has to be allocated and managed by the garbage collector.

All these memory allocations are probably what makes this code slow. It's much better to write the matrix directly with the 2D array syntax, so that only a single array needs to be allocated and it is filled once.
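To make the difference concrete, here is a minimal sketch of the two styles. The function names and the values are made up for illustration, they are not from your code:

```julia
# Hypothetical example: both functions build the same 3×3 matrix.

function basis_concat(x)
    e1 = [x, 0.0, 0.0]        # temporary Vector
    e2 = [0.0, x, 0.0]        # temporary Vector
    e3 = [0.0, 0.0, x]        # temporary Vector
    return [e1 ;; e2 ;; e3]   # plus one more array for the result
end

function basis_literal(x)
    # 2D array syntax: the elements go straight into one Matrix
    return [x   0.0 0.0
            0.0 x   0.0
            0.0 0.0 x]
end
```

You can compare the two with something like `@allocated basis_concat(1.0)` versus `@allocated basis_literal(1.0)` after calling each once to compile it. The exact numbers depend on the Julia version.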

As you’ve seen, StaticArrays fixes this (for small arrays) by allowing the compiler to remove the GC-managed allocation completely.
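For reference, a sketch of what the StaticArrays version might look like, assuming the StaticArrays.jl package is installed. The function name is hypothetical again:

```julia
using StaticArrays  # third-party package, must be installed separately

# Hypothetical example: the size is part of the type, so the result
# is a plain immutable value rather than a GC-managed heap array.
function basis_static(x)
    return @SMatrix [x   0.0 0.0
                     0.0 x   0.0
                     0.0 0.0 x]
end
```

The result behaves like an ordinary 3×3 matrix in most linear algebra code, but it is immutable, so this only fits if you don't need to modify the entries afterwards.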

The precedence of the surface syntax can’t affect this, because parsing happens only once, very early during compilation. Even the column-major ordering shouldn’t really matter for small arrays. That should only become relevant for moderate to large arrays which don’t fit into the processor’s cache.

However, it’s possible that the function inside Base for dealing with multidimensional concatenation is less efficient for some reason. Consider the difference in the way that these syntaxes are actually passed to the functions inside Base:

julia> Meta.@lower [a b ; c d]
:($(Expr(:thunk, CodeInfo(
1 ─ %1 = Core.tuple(2, 2)
│   %2 = Base.hvcat(%1, a, b, c, d)           **********
└──      return %2
))))

julia> Meta.@lower [a; b ;; c ; d]
:($(Expr(:thunk, CodeInfo(
1 ─ %1 = Core.tuple(2, 2)
│   %2 = Base.hvncat(%1, false, a, b, c, d)   **********
└──      return %2
))))

Base.hvncat is a more general and fairly recent addition to Base (it arrived together with the ;; syntax in Julia 1.7), so it’s possible it isn’t quite as well optimized as Base.hvcat.
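If you want to time the two code paths separately, you can call the lowered functions directly. A small sketch with placeholder values; note that the ;; form takes its arguments column by column, so the order differs from the row-wise form:

```julia
a, b, c, d = 1, 2, 3, 4

A = [a b; c d]          # lowers to Base.hvcat((2, 2), a, b, c, d)
B = [a; c ;; b; d]      # lowers to Base.hvncat((2, 2), false, a, c, b, d)

# The same calls written out by hand
A2 = Base.hvcat((2, 2), a, b, c, d)
B2 = Base.hvncat((2, 2), false, a, c, b, d)

# All four are the same 2×2 matrix
@assert A == B == A2 == B2
```

Benchmarking A2 against B2 (for example with BenchmarkTools, if you have it) would show whether hvncat itself is the slow part on your Julia version.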
