There have been a lot of questions and discussions about reinterpret vs unsafe_wrap since the new reinterpret arrived in 0.7, so I thought I'd make a thread to collect some of the answers and hopefully get some authoritative answers from @Keno archived here. I’ll start by describing how the current state looks to me, and hope to be corrected if I get something wrong, and also hope for examples from others.
1. What is this about? We want to reinterpret an array’s buffer. The most common examples are (a) between T and SVector{N,T}, where T = Int64, Int32, Float64, Float32, etc., (b) real and complex, (c) various structs.
2. Why would you use unsafe_wrap instead of reinterpret?
3. Why and when is unsafe_wrap bad?
4. How do you safely use unsafe_wrap when you must?
Let’s flesh out the answer to 1 with an example (Julia Version 1.0.0 (2018-08-08)):
using StaticArrays
using Random
using LinearAlgebra
using BenchmarkTools
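n=100;m=3000;
# m static vectors of length n; the buffer holds n*m contiguous Float64s
M_sv=Vector{SVector{n,Float64}}(undef,m);
# safe route: lazy ReinterpretArray, then reshape to an n-by-m matrix
V_re = reinterpret(Float64,M_sv);
M_re=reshape(V_re, (n,m));
# unsafe route: a plain Matrix{Float64} wrapped around the same buffer
M_uw = unsafe_wrap(Array,reinterpret(Ptr{Float64},pointer(M_sv)), (n,m));
# fill the shared buffer through the wrap
rand!(M_uw);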
and now we can give a partial answer to 2: Certain operations are slow on ReinterpretArray between certain types on certain julia versions. Somehow I believed that the below had gotten patched in the meantime, but it is slow on the archlinux 1.0.0 version, so let’s see:
@btime sum(sum($M_sv)); #102.901 μs (3 allocations: 1.77 KiB)
@btime sum($V_re);#7.836 ms (0 allocations: 0 bytes)
@btime sum($M_re);#8.544 ms (0 allocations: 0 bytes)
@btime sum($M_uw);#103.828 μs (0 allocations: 0 bytes)
@btime *(M_re',M_re);
res=Matrix{Float64}(undef,m,m);
@btime mul!(res,M_uw',M_uw);#51.057 ms (1 allocation: 16 bytes)
@btime mul!(res,M_re',M_re);#50.932 ms (1 allocation: 16 bytes)
So the answer to (2) is that reinterpret on arrays produces a ReinterpretArray (that you may need to reshape). This sometimes makes access slower (hopefully already or soon fixed in many cases), can sometimes give longer compile times, and may give you trouble if you or your dependencies dispatch on Array instead of more abstract array types. You or your dependencies might dispatch on Array because you want to hand your data over to C/Fortran/etc. code that expects a specific layout, or because you did not want to think about custom array types. We see that LinearAlgebra appears to be capable of passing the pointers of reinterpreted arrays through to Julia’s BLAS, but you’ll need to check whether your dependencies get this right.
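To make the dispatch problem concrete, here is a minimal sketch using only Base. The function needs_array is hypothetical and only stands in for a dependency that accepts nothing but a plain Array; a complex vector is used instead of SVector so that the snippet needs no packages.

```julia
# Hypothetical stand-in for a dependency that dispatches on Array only,
# e.g. because it would pass the pointer on to C code.
needs_array(A::Array{Float64}) = sum(A)

z = [1.0 + 2.0im, 3.0 + 4.0im]     # Vector{ComplexF64}
r = reinterpret(Float64, z)        # lazy view onto the same memory as z

is_plain_array = r isa Array                  # a ReinterpretArray is not an Array
is_abstract    = r isa AbstractArray          # but it is an AbstractArray
can_call       = applicable(needs_array, r)   # no matching method for r

# collect makes a real Vector{Float64}, at the price of copying the data.
total = needs_array(collect(r))
```

Copying with collect sidesteps the problem safely; unsafe_wrap avoids the copy but brings the issues discussed in this thread.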
Now, why is unsafe_wrap problematic? I see three issues:
- It is unstable. Internals, and with them the answers to all of these questions, may change, and then your code produces wrong results.
- Type based aliasing analysis. This was the main reason for the new reinterpret (it used to be mostly equivalent to unsafe_wrap). An array is essentially a pointer to a buffer with some metadata, and the compiler assumes that the buffers of differently typed arrays cannot overlap (alias). This makes code faster, but may produce wrong results if you shuffle data inside the same buffer, once accessed via M_sv and once via M_uw. So, don’t do that!
- Relocation and object lifetime. In the above example, M_sv believes that it owns the buffer, and that nobody else holds pointers to it. If M_sv dies, the buffer will be freed and access to M_uw can corrupt memory. If you push! to M_sv, the buffer can be realloced, and access to M_uw can corrupt memory (so don’t push to stuff that has living wraps!).
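For the lifetime issue specifically, one possible sketch is to create the wrap inside a GC.@preserve block, so that the parent array is kept alive for as long as the wrap is in use. The function name sum_as_floats is made up for illustration, and this only addresses object lifetime; it says nothing about the aliasing issue.

```julia
# Hypothetical helper: view a complex vector as Float64s without copying.
function sum_as_floats(z::Vector{ComplexF64})
    # GC.@preserve keeps z (and hence its buffer) rooted for the whole block,
    # even if the compiler sees no later use of z itself.
    s = GC.@preserve z begin
        p = reinterpret(Ptr{Float64}, pointer(z))
        # own defaults to false: Julia will not free the buffer through w.
        w = unsafe_wrap(Array, p, 2 * length(z))
        sum(w)
    end
    return s    # the wrap w is not used past the preserved block
end

result = sum_as_floats([1.0 + 2.0im, 3.0 + 4.0im])
```

The wrap never escapes the function and z is not resized while it exists, which is the discipline described in the last bullet.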
Now, how do you use unsafe_wrap safely? Well, don’t ever do anything that aliases in the same “context”, and make sure that the base lives at least one “context” longer than the unsafe_wrap. unsafe_wrap is very cheap: You can simply discard it after use and make a new wrap if you need it again (outside of inner loops). What does “context” mean? Well, as far as the compiler can see during optimization. I think that @noinline function boundaries are enough separation? That is, code like the following should be OK for aliasing if foo! and bar! are @noinline. Likewise, objects can be freed before execution reaches the line-number where they go out of scope, if the compiler infers that they are not used afterwards; but @noinline function boundaries should prevent the compiler from noticing this?
I’d like @Keno’s confirmation on these points before anyone trusts me on this. I am not sure how far IPO looks; is it necessary to jump through further hoops to prevent the compiler from discovering the juicy but poisonous no-alias information?
@noinline function f(M_uw, M_sv)
    for i=1:10
        foo!(M_uw)
        bar!(M_sv)
    end
end
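A self-contained variant of this pattern might look as follows. The bodies of foo! and bar! and the name driver! are invented for illustration, and a complex vector replaces the SVector example so that only Base is needed. Whether @noinline boundaries really are enough separation is exactly the open question above, so treat this as a sketch of the intended usage and not as a guarantee.

```julia
# Hypothetical worker that only touches the Float64 view.
@noinline function foo!(w::Vector{Float64})
    w .*= 2.0
    return nothing
end

# Hypothetical worker that only touches the original complex vector.
@noinline function bar!(z::Vector{ComplexF64})
    z .+= 1.0
    return nothing
end

function driver!(z::Vector{ComplexF64})
    # Keep z alive while the wrap exists, and never resize z in here.
    GC.@preserve z begin
        w = unsafe_wrap(Array, reinterpret(Ptr{Float64}, pointer(z)), 2 * length(z))
        for i in 1:3
            foo!(w)    # access through the wrap ...
            bar!(z)    # ... and through the base, in separate functions
        end
    end
    return z
end

out = driver!([1.0 + 1.0im])
```

The driver itself never reads or writes elements through both names; each access happens inside one of the non-inlined callees, and the wrap is discarded when the driver returns.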