Can Julia optimize mutable static arrays to be allocated on the stack?

Yes.

julia> using StaticArrays, BenchmarkTools

julia> if VERSION >= v"1.7.0-beta"
         @inline exp_fast(x) = Base.Math.exp_impl_fast(x, Val(:ℯ))
       else
         exp_fast(x) = exp(x)
       end
exp_fast (generic function with 1 method)

julia> function alloctest(x)
         y = MVector(x)
         @inbounds @simd ivdep for i ∈ eachindex(y)
           y[i] = exp_fast(y[i])  # use the version-guarded helper defined above
         end
         s = zero(eltype(y))
         @fastmath for i ∈ eachindex(y)
           s += y[i]
         end
         s
       end
alloctest (generic function with 1 method)

julia> x = @SVector rand(32);

julia> @btime alloctest($x)
  11.995 ns (0 allocations: 0 bytes)
56.18908775961786

The assembly (for example, from @code_native alloctest(x)) confirms that y is in fact stack allocated: its loads and stores are addressed relative to rsp, the stack pointer in 64-bit mode.
Note that in many cases the MArray will not be placed in memory at all. It may exist only in the CPU’s registers, or be optimized away entirely.
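
As a rough way to see when this optimization applies, the sketch below compares an MVector that stays local to a function with one that is returned to the caller. The function names sum_doubled and doubled are hypothetical and introduced only for illustration. The working assumption is that a non-escaping MVector can be kept on the stack or in registers, while a returned one generally has to be heap allocated. The exact byte counts depend on the Julia version and on inlining decisions, so none are quoted here.

```julia
using StaticArrays, BenchmarkTools

# Hypothetical helper: `y` never leaves this function, so the compiler is
# free to keep it on the stack or in registers.
function sum_doubled(x::SVector)
    y = MVector(x)
    @inbounds for i in eachindex(y)
        y[i] *= 2
    end
    s = zero(eltype(y))
    @inbounds for i in eachindex(y)
        s += y[i]
    end
    return s
end

# Hypothetical helper: `y` is returned, so it escapes to the caller and
# will generally need a heap allocation.
function doubled(x::SVector)
    y = MVector(x)
    @inbounds for i in eachindex(y)
        y[i] *= 2
    end
    return y
end

x = @SVector rand(32)

# @ballocated reports the minimum number of bytes allocated per evaluation.
local_bytes = @ballocated sum_doubled($x)
escaping_bytes = @ballocated doubled($x)

println("non-escaping MVector: ", local_bytes, " bytes")
println("escaping MVector:     ", escaping_bytes, " bytes")
```

If the first number is nonzero, checking whether the MVector is passed to a function that is not inlined is a reasonable first step, since that can also force it into heap memory.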
