Isn’t that heap allocated?
I do not know of any stack-allocated objects in Julia that are mutable. I thought Julia drew a neat divide between immutable structs and mutable structs, where only some of the former are stack allocated (i.e., when they’re isbits, meaning they don’t contain any reference types).
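To make that divide concrete, here is a minimal sketch using only Base. The three struct names are made up for illustration and are not from StaticArrays; isbitstype is the Base query for "immutable and contains no references".

```julia
# Hypothetical types for illustration only (not from StaticArrays).

struct ImmutableTuple4          # immutable, holds only plain bits
    data::NTuple{4,Float64}
end

struct ImmutableWithRef         # immutable, but holds a reference to a heap array
    data::Vector{Float64}
end

mutable struct MutableTuple4    # mutable wrapper around a tuple
    data::NTuple{4,Float64}
end

isbitstype(ImmutableTuple4)     # true
isbitstype(ImmutableWithRef)    # false: contains a reference type
isbitstype(MutableTuple4)       # false: mutable structs are never isbits
```

Only the first of these is a candidate for living on the stack or in registers under the rule described above.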
They’re just a mutable struct wrapping a tuple. Create and destroy them, and you trigger the garbage collector.
julia> using StaticArrays, BenchmarkTools
julia> @benchmark @MVector randn(4)
BenchmarkTools.Trial:
memory estimate: 48 bytes
allocs estimate: 1
--------------
minimum time: 25.867 ns (0.00% GC)
median time: 27.715 ns (0.00% GC)
mean time: 37.415 ns (22.50% GC)
maximum time: 48.765 μs (99.92% GC)
--------------
samples: 10000
evals/sample: 997
julia> @benchmark @SVector randn(4)
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 21.204 ns (0.00% GC)
median time: 22.369 ns (0.00% GC)
mean time: 22.478 ns (0.00% GC)
maximum time: 52.456 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 997
julia> @benchmark @MVector zeros(4)
BenchmarkTools.Trial:
memory estimate: 48 bytes
allocs estimate: 1
--------------
minimum time: 4.744 ns (0.00% GC)
median time: 8.484 ns (0.00% GC)
mean time: 18.205 ns (47.74% GC)
maximum time: 49.892 μs (99.97% GC)
--------------
samples: 10000
evals/sample: 999
julia> @benchmark @SVector zeros(4)
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 0.020 ns (0.00% GC)
median time: 0.030 ns (0.00% GC)
mean time: 0.029 ns (0.00% GC)
maximum time: 26.590 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 1000
pointer_from_objref works just fine on them, as do unsafe_load and unsafe_store!. In fact, setindex! is defined using pointer_from_objref and unsafe_store!.
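For anyone who wants to try the pointer route without StaticArrays, here is a rough sketch of the same idea on a hand-rolled mutable struct wrapping a tuple. The type MyMVec4 and the helpers unsafe_set! and unsafe_get are hypothetical names for illustration; this is not StaticArrays' actual source, only the pattern described above. It assumes the tuple is the first (and only) field, so the element data starts at the object's address.

```julia
# Hypothetical stand-in for an MVector{4,Float64}: a mutable struct wrapping a tuple.
mutable struct MyMVec4
    data::NTuple{4,Float64}
end

# Write x into slot i through a raw pointer to the object.
function unsafe_set!(v::MyMVec4, x::Float64, i::Int)
    1 <= i <= 4 || throw(BoundsError(v.data, i))
    # Keep v rooted while we hold a raw pointer to it.
    GC.@preserve v begin
        p = convert(Ptr{Float64}, pointer_from_objref(v))
        unsafe_store!(p, x, i)   # i is a 1-based element index
    end
    return v
end

# Read slot i through a raw pointer to the object.
function unsafe_get(v::MyMVec4, i::Int)
    1 <= i <= 4 || throw(BoundsError(v.data, i))
    GC.@preserve v begin
        p = convert(Ptr{Float64}, pointer_from_objref(v))
        unsafe_load(p, i)
    end
end

v = MyMVec4((1.0, 2.0, 3.0, 4.0))
unsafe_set!(v, 10.0, 3)
v.data              # (1.0, 2.0, 10.0, 4.0)
unsafe_get(v, 2)    # 2.0
```

Note that this only works because the struct is mutable and therefore has a stable address; that is the same property that makes it heap allocated.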
I made the HN comment because neither foobar’s question above nor my question on masked loads and stores got a positive answer.
Neither of us could even achieve what we wanted via llvmcall.
So if this is possible:
meaning you can’t use masked load/store operations to vectorize code when the array dimensions aren’t a multiple of SIMD-vector-width.
I’d love to learn how!