Non-escaping array buffer re-use and the sufficiently smart compiler

b is the memory of _a, while a is an array backed by that memory.

julia> @macroexpand @gc_preserve foo(A, B)
quote
    var"#20###A#258" = A
    var"#21###buffer#259" = StrideArrays.preserve_buffer(var"#20###A#258")
    var"#22###A#260" = B
    var"#23###buffer#261" = StrideArrays.preserve_buffer(var"#22###A#260")
    $(Expr(:gc_preserve, :(foo(StrideArrays.maybe_ptr_array(var"#20###A#258"), StrideArrays.maybe_ptr_array(var"#22###A#260"))), Symbol("#21###buffer#259"), Symbol("#23###buffer#261")))
end

I was just manually doing what @gc_preserve will do once I update it to work on blocks instead of single calls (aside: Expr(:gc_preserve, ...) is what GC.@preserve expands to).
The reason for separating the memory from the array is that GC.@preserve will generally heap allocate the entire object if the memory to be protected is held by a struct. Preserving only the memory protects it without forcing the wrapping struct to be heap allocated.
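
As a rough illustration of that pattern using only Base, here is a minimal sketch. `PtrVec`, `sum_ptr`, and `sum_preserved` are hypothetical names made up for this example; they are not StrideArrays' types or functions, and a plain `Vector` stands in for the buffer. The idea is that only the owner of the memory is handed to GC.@preserve, while the thing actually passed to the callee is a pointer-based wrapper with no GC-tracked fields.

```julia
# Hypothetical pointer-backed wrapper, for illustration only.
# It is an isbits struct: just a raw pointer and a length.
struct PtrVec
    ptr::Ptr{Float64}
    len::Int
end

# Reads through the raw pointer; only safe while the owning memory is preserved.
function sum_ptr(v::PtrVec)
    s = 0.0
    for i in 1:v.len
        s += unsafe_load(v.ptr, i)  # unsafe_load is 1-indexed
    end
    return s
end

function sum_preserved(x::Vector{Float64})
    # Root only `x`, the owner of the memory, for the duration of the block.
    # The PtrVec itself never needs to be visible to the GC.
    GC.@preserve x begin
        v = PtrVec(pointer(x), length(x))
        sum_ptr(v)
    end
end

sum_preserved([1.0, 2.0, 3.0])
```

In this sketch the buffer is already a heap object, so it only shows the shape of the transformation (extract buffer, preserve buffer, pass a pointer wrapper), not the allocation savings.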

Ah, that’s probably LoopVectorization complaining about SMatrix{6, 7, Float64, 42}.
I haven’t implemented support for SMatrix yet, because I haven’t found out a way to actually make them fast. The two best possibilities so far are:

  1. Convert to MArray, then convert back. In my tests, even though the MArrays weren’t heap allocated, the compiler would often copy them onto a different part of the stack anyway for no reason.
  2. Add a dummy wrapper for them that supports the StridedPointer interface, and cross fingers that code gen on memory loads is okay. It wasn’t in my tests.

EDIT: Actually, I’m tired. StaticArrays should work with LoopVectorization in that they should fail check_args so that it uses a fallback array. This must have come from StrideArrays itself trying to get stridedpointers without checking first.
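
To make the "check the arguments first, otherwise fall back" idea concrete, here is a minimal sketch in plain Julia. All names here (`supports_fast_path`, `all_supported`, `sum_fallback`, `checked_sum`) are hypothetical and are not LoopVectorization's actual implementation; the fast path simply uses Base's `@simd` as a stand-in.

```julia
# Hypothetical trait: which argument types the fast path can handle.
supports_fast_path(::Any) = false
supports_fast_path(::Array{Float64}) = true

# Every argument must pass the check for the fast path to be used.
all_supported(args...) = all(supports_fast_path, args)

# Generic fallback that works for any iterable.
function sum_fallback(x)
    s = zero(eltype(x))
    for v in x
        s += v
    end
    return s
end

function checked_sum(x)
    if all_supported(x)
        # Stand-in for the optimized path; only reached for supported types.
        s = zero(eltype(x))
        @inbounds @simd for i in eachindex(x)
            s += x[i]
        end
        return s
    else
        return sum_fallback(x)
    end
end

checked_sum([1.0, 2.0])  # takes the fast path
checked_sum(1:3)         # unsupported type, takes the fallback
```

The point is that the check happens before anything tries to obtain a pointer, so an unsupported type takes the fallback instead of erroring.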
