Very strange timing results with StaticArrays

Running Julia 1.9.3 on an M2 MacBook Air, macOS Ventura 13.4.1. Almost any operation involving a moderate to large StaticArray is extremely slow, and I mean extremely slow: measurable in tens of minutes. And the @time macro does not report this. For instance:

julia> using StaticArrays
julia> @time us2 = SMatrix{2,2}(randn(2,2));
0.000013 seconds (3 allocations: 208 bytes)
julia> @time us3 = SMatrix{2,200}(randn(2,200));
0.000032 seconds (3 allocations: 6.500 KiB)
julia> @time us4 = SMatrix{2,2000}(randn(2,2000));
0.000149 seconds (4 allocations: 62.672 KiB)
julia> @time us5 = SMatrix{2,4000}(randn(2,4000));
0.000316 seconds (4 allocations: 125.172 KiB)

… but as measured by my wristwatch, the last operation took a solid 60 seconds. This seems to affect every array operation - so far I’ve verified it with array creation, array transpose, matrix multiplication, and plotting.

And this appears to be roughly exponential in the size of the array; for fun I timed how long it takes to transpose a 12x8000 matrix:

julia> @time us'
1148.808017 seconds (13.24 M allocations: 411.570 MiB)

i.e. 19 minutes according to @time, but nearly an hour by the wall clock.

Some of this seems to follow the regular Julia JIT delay pattern, i.e. taking that transpose a second time took a small fraction of the time of the first:

julia> @time us'
0.002370 seconds (1 allocation: 750.062 KiB)

which was instantaneous by the wall clock as well. But that doesn’t happen when creating arrays: it takes 60 seconds every time I allocate a 2x4000 array. And in any case, why would it take an hour to JIT a transpose routine? And why the enormous memory allocation for the first run but not the second?

This can’t be right, can it?

Actually it is:

“Note that in the current implementation, working with large StaticArrays puts a lot of stress on the compiler, and becomes slower than Base.Array as the size increases. A very rough rule of thumb is that you should consider using a normal Array for arrays larger than 100 elements.”

(from the readme)
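
To illustrate that rule of thumb, here is a minimal sketch (the variable names are made up for the example): keep the static type for genuinely small matrices and fall back to a plain Array for something like the 2x4000 case above.

```julia
using StaticArrays

# Small fixed-size matrix: 4 elements, well under the ~100 element rule of thumb.
small = SMatrix{2,2}(randn(2, 2))

# Large matrix: left as an ordinary Array, so the size is not part of the type
# and the compiler does not generate size-specific code for 8000 elements.
large = randn(2, 4000)

# The same operations work on both.
small_t = small'
large_t = large'
```

The point of the sketch is only that the Array version avoids the per-size compilation; it is not a benchmark.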

I knew that; I guess the amount of slowness surprised me, as did the fact that @time doesn’t accurately reflect wall clock time. In my experience, it ordinarily reports compilation time when a compile happens.

I think the recommendation is @time @eval to measure the compilation time as well.

From the docs:

 In some cases the system will look inside the @time expression and compile some of the called code before execution of the top-level expression begins. When that happens, some compilation time will not be counted. To include this time you can run
  @time @eval ....
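
A small sketch of what that looks like in practice, assuming a fresh session so that nothing is compiled yet (the 2x100 size and the names data and m are just picked for the example):

```julia
using StaticArrays

data = randn(2, 100)

# Wrapping the call in @eval means the expression is only compiled once @time
# is already running, so the reported time should include compilation.
m = @time @eval SMatrix{2,100}(data);
```

Run a second time in the same session, the call is already compiled, so the reported time should drop sharply.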