I am wondering about the actual advantages over allocating a temporary array. That is, how much do you save? The temp is short-lived, so one would naively hope that generational gc manages to reclaim the alloc quickly. Have you tested?
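One way to answer the "how much do you save" question is to measure both variants directly. The following is only a sketch: the kernel (with_temp, with_buffer!) and the block size are hypothetical stand-ins for whatever the real function does, and it uses only Base macros rather than a benchmarking package.

```julia
# Hypothetical kernel: needs a scratch array the size of its input.
function with_temp(x::Vector{Float64})
    tmp = Vector{Float64}(undef, length(x))   # fresh temporary on every call
    tmp .= x .^ 2
    return sum(tmp)
end

function with_buffer!(buf::Vector{Float64}, x::Vector{Float64})
    buf .= x .^ 2                              # reuses caller-provided scratch space
    return sum(buf)
end

x = rand(10_000)      # assumed block size
buf = similar(x)

# Warm up so that compilation is not part of the measurement.
with_temp(x)
with_buffer!(buf, x)

alloc_temp = @allocated with_temp(x)
alloc_buf  = @allocated with_buffer!(buf, x)
t_temp = @elapsed for _ in 1:1_000; with_temp(x); end
t_buf  = @elapsed for _ in 1:1_000; with_buffer!(buf, x); end

println("temporary: ", alloc_temp, " bytes/call, ", t_temp, " s per 1000 calls")
println("buffer:    ", alloc_buf, " bytes/call, ", t_buf, " s per 1000 calls")
```

The timings depend heavily on block size and on what else the GC is busy with, so the numbers are only meaningful for the real workload.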
Otherwise, I’d guess an __init__ function (needed because of precompilation) that mallocs the buffer would be an alternative. If your blocks are reasonably large and you want thread-safety, you could also mmap the buffer during initialization (one buffer per thread). This limits memory consumption: the buffers only cost a handful of bytes in the kernel until they are actually used and the kernel faults them in; and if your function is never called from other threads, you only ever fault in a single buffer.
By using a julia-allocated undef-array you rely on array.c and libc heuristics/thresholds for whether the memory consumption is lazy (good: almost free until faulted in) or eager (bad: julia/libc might decide during initialization/compilation that there is a juicy spot of already faulted-in memory, and then your function never gets called at runtime and you wasted all this sweet memory; terrible to reproduce, because it depends on operating system version and load/init/compile/compute order). Additionally, mapping the buffer yourself gives you full control over offsets (false sharing is annoying; reproducible offsets at least make it reproducible), and you can share buffers between Float32 and ComplexF64 operations.
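The init-time, one-buffer-per-thread idea could look roughly like this. It is a minimal sketch, not a tested design: the module name ScratchBuffers, the constant BUFFER_BYTES, and the helper scratch() are all made up for illustration, and it uses the anonymous-mapping form Mmap.mmap(type, dims) from the Mmap standard library.

```julia
module ScratchBuffers  # hypothetical module name, not from the post

using Mmap

const BUFFER_BYTES = 1 << 20                 # assumed size: 1 MiB per thread
const BUFFERS = Vector{Vector{UInt8}}()      # filled in __init__, one entry per thread

# __init__ runs when the module is loaded, not when it is precompiled,
# so the mappings are created freshly in every session.
function __init__()
    empty!(BUFFERS)
    for _ in 1:Threads.nthreads()
        # Anonymous mapping: pages are only faulted in when first touched.
        push!(BUFFERS, Mmap.mmap(Vector{UInt8}, (BUFFER_BYTES,)))
    end
    return nothing
end

# Scratch buffer belonging to the calling thread.
scratch() = BUFFERS[Threads.threadid()]

end # module

buf = ScratchBuffers.scratch()
buf[1] = 0x2a        # the first write faults in only the page that is touched
```

Indexing by Threads.threadid() assumes that a task does not migrate to another thread while it holds the buffer and that no thread ids beyond Threads.nthreads() are in use; both are assumptions of this sketch, not guarantees.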
PS. The invocations could be (Linux only, since memfd_create is a Linux system call; it returns a C int, hence Cint):
julia> using Mmap
julia> fd = ccall(:memfd_create, Cint, (Cstring, Cuint), "foo", 0)
17
julia> ccall(:ftruncate, Cint, (Cint, Csize_t), fd, 1000)
0
julia> _handle=Base.OS_HANDLE(fd)
RawFD(0x00000011)
julia> _io=open(_handle)
IOStream(<fd 17>)
julia> a1=Mmap.mmap(_io, Matrix{Int}, (4, 4)); a2 = Mmap.mmap(_io, Matrix{Float64}, (4, 4));
julia> a1[1]=1; a2[2]=3.0; @show pointer(a1), pointer(a2);
(pointer(a1), pointer(a2)) = (Ptr{Int64} @0x00007f4581673000, Ptr{Float64} @0x00007f4581672000)
julia> a1
4×4 Array{Int64,2}:
1 0 0 0
4613937818241073152 0 0 0
0 0 0 0
0 0 0 0
julia> a2
4×4 Array{Float64,2}:
4.94066e-324 0.0 0.0 0.0
3.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0
Alternatively, one can use a single mapping and unsafe_wrap. The above bypasses aliasing detection (different virtual memory addresses that are mapped to the same page). Memory mapping games are useful for persistent data structures: you can create new copy-on-write mappings to the underlying fd. Unfortunately, it is very hard on Linux to create new copy-on-write mappings of an existing range of virtual memory, cf. https://github.com/JuliaLang/julia/pull/31630.
If you use the unsafe_wrap route, then you can also use Mmap.Anonymous(). For some reason, mmap(::Mmap.Anonymous, args...) needs explicit offset=0; I should probably file a bug for that.
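The unsafe_wrap route might be sketched as follows. This is an illustration under assumptions: the buffer size and array shapes are arbitrary, and it uses the documented shorthand Mmap.mmap(type, dims) for an anonymous mapping instead of passing Mmap.Anonymous() explicitly.

```julia
using Mmap

nbytes = 4096                                  # assumed size: one typical page
raw = Mmap.mmap(Vector{UInt8}, (nbytes,))      # anonymous, zero-filled mapping

# Two typed windows onto the same memory. unsafe_wrap does not keep `raw`
# alive, so `raw` must outlive both arrays.
as_f32 = unsafe_wrap(Array, Ptr{Float32}(pointer(raw)), (4, 4))
as_c64 = unsafe_wrap(Array, Ptr{ComplexF64}(pointer(raw)), (4, 4))

GC.@preserve raw begin
    as_f32[1] = 1.0f0          # written through the Float32 view ...
    first_bytes = raw[1:4]     # ... and visible in the underlying bytes
    shared = real(as_c64[1])   # the ComplexF64 view sees the same bits
end
```

Since the mapping is page-aligned, every element type is correctly aligned at offset zero; a nonzero offset into the buffer would need to respect the alignment of the element type.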