There have been a lot of questions and discussions about `reinterpret` vs `unsafe_wrap` since the new `reinterpret` arrived in 0.7, so I thought I’d start a thread to collect some of the answers and hopefully get some authoritative ones from @Keno archived here. I’ll start by describing how the current state looks to me. I hope to be corrected if I get something wrong, and I hope others will contribute examples.
1. What is this about? We want to reinterpret an array’s buffer. The most common examples are (a) between `T` and `SVector{N,T}`, where `T` is `Int64`, `Int32`, `Float64`, `Float32`, etc., (b) real and complex, (c) various structs.
2. Why would you use `unsafe_wrap` instead of `reinterpret`?
3. Why and when is `unsafe_wrap` bad?
4. How do you safely use `unsafe_wrap` when you must?
Let’s flesh out the answer to 1 with an example (Julia Version 1.0.0 (2018-08-08)):
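```julia
using StaticArrays
using Random
using LinearAlgebra
using BenchmarkTools

n = 100; m = 3000;
# Base array: m static vectors of length n, stored contiguously
M_sv = Vector{SVector{n,Float64}}(undef, m);
# Safe route: a lazy ReinterpretArray, reshaped to an n×m matrix
V_re = reinterpret(Float64, M_sv);
M_re = reshape(V_re, (n, m));
# Unsafe route: a plain Array that aliases the same buffer
M_uw = unsafe_wrap(Array, reinterpret(Ptr{Float64}, pointer(M_sv)), (n, m));
rand!(M_uw);
```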
Now we can give a partial answer to 2: certain operations are slow on a `ReinterpretArray` between certain types on certain Julia versions. I had believed that the slowdown below had been patched in the meantime, but it is still slow on the Arch Linux build of 1.0.0, so let’s see:
```julia
@btime sum(sum($M_sv));  # 102.901 μs (3 allocations: 1.77 KiB)
@btime sum($V_re);       # 7.836 ms (0 allocations: 0 bytes)
@btime sum($M_re);       # 8.544 ms (0 allocations: 0 bytes)
@btime sum($M_uw);       # 103.828 μs (0 allocations: 0 bytes)
@btime *(M_re', M_re);
res = Matrix{Float64}(undef, m, m);
@btime mul!(res, M_uw', M_uw);  # 51.057 ms (1 allocation: 16 bytes)
@btime mul!(res, M_re', M_re);  # 50.932 ms (1 allocation: 16 bytes)
```
So the answer to (2) is that `reinterpret` on arrays produces a `ReinterpretArray` (which you may need to `reshape`). This sometimes makes access slower (hopefully already fixed, or soon to be, in many cases), can give longer compile times, and may cause trouble if you or your dependencies dispatch on `Array` instead of more abstract array types. You or your dependencies might dispatch on `Array` because you want to hand your data to C/Fortran/etc. code that expects a specific layout, or because you did not want to think about custom array types. We see that `LinearAlgebra` appears to be capable of passing the pointers of reinterpreted arrays through to Julia’s BLAS, but you’ll need to check whether your dependencies get this right.
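To make the dispatch problem concrete, here is a minimal sketch using the real/complex case from (b), so it needs no packages. The functions `needs_array` and `accepts_any` are hypothetical stand-ins for a dependency that dispatches on `Array` and one that accepts any `AbstractArray`; they are not taken from any library.

```julia
z = ComplexF64[1 + 2im, 3 + 4im]
r = reinterpret(Float64, z)   # lazy view over the buffer of z, not an Array

# Hypothetical dependency that insists on a concrete Array
needs_array(x::Array{Float64}) = sum(x)
# Hypothetical dependency written against the abstract interface
accepts_any(x::AbstractArray{Float64}) = sum(x)

println(r isa Array)                 # false
println(r isa AbstractArray)         # true
println(accepts_any(r))              # 10.0
println(applicable(needs_array, r))  # false: calling it would throw a MethodError
println(needs_array(collect(r)))     # 10.0, but collect copies the data
```

The `collect` call is the safe fallback when a dependency insists on `Array`; it costs a copy, which is exactly what `unsafe_wrap` is used to avoid.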
Now, why is `unsafe_wrap` problematic? I see three issues:
- It is unstable. Internals may change, and with them the answers to all of these questions; your code would then produce wrong results.
- Type-based aliasing analysis. This was the main reason for the new `reinterpret` (the old one was mostly equivalent to `unsafe_wrap`). An array is essentially a pointer to a buffer plus some metadata, and the compiler assumes that the buffers of differently typed arrays cannot overlap (alias). This makes code faster, but may produce wrong results if you shuffle data inside the same buffer, accessed once via `M_sv` and once via `M_uw`. So, don’t do that!
- Relocation and object lifetime. In the above example, `M_sv` believes that it owns the buffer and that nobody else holds pointers to it. If `M_sv` dies, the buffer will be `free`d and access to `M_uw` can corrupt memory. If you `push!` to `M_sv`, the buffer can be `realloc`ed, and access to `M_uw` can corrupt memory (so don’t push to stuff that has living wraps!).
Now, how do you use `unsafe_wrap` safely? Well, don’t ever do anything that aliases in the same “context”, and make sure that the base lives at least one “context” longer than the `unsafe_wrap`. `unsafe_wrap` is very cheap: you can simply discard the wrap after use and make a new one if you need it again (outside of inner loops). What does “context” mean? As far as the compiler can see during optimization. I think that `@noinline` function boundaries are enough separation? That is, code like the snippet below should be OK for aliasing if `foo!` and `bar!` are `@noinline`. Likewise, objects can be freed before execution reaches the line where they go out of scope, if the compiler infers that they are not used afterwards; but `@noinline` function boundaries should prevent the compiler from noticing this?
I’d like @Keno’s confirmation on these points before anyone trusts me on this. I am not sure how far IPO looks; is it necessary to jump through further hoops to prevent the compiler from discovering the juicy but poisonous no-alias information?
```julia
# foo! and bar! are assumed to be defined elsewhere and marked @noinline
@noinline function f(M_uw, M_sv)
    for i = 1:10
        foo!(M_uw)
        bar!(M_sv)
    end
end
```
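For the lifetime half of the problem, one pattern that I believe works is to keep the base array rooted with `GC.@preserve` for as long as the wrap is in use, and to create the wrap fresh each time. The sketch below again uses the real/complex case so it is self-contained. `with_real_view` and `kernel_barrier` are hypothetical helper names, not part of Base, and whether the `@noinline` barrier is sufficient against aliasing analysis is exactly the open question above; the sketch only shows the shape of the pattern.

```julia
# Hypothetical barrier: the wrap is only ever used behind a @noinline call.
@noinline kernel_barrier(f, w) = f(w)

# Hypothetical helper: run f on a temporary 2×n Float64 wrap of v.
# f must not return or store the wrap, since it is only valid inside the block.
function with_real_view(f, v::Vector{ComplexF64})
    GC.@preserve v begin
        # v is kept alive by the GC for the duration of this block
        p = reinterpret(Ptr{Float64}, pointer(v))
        w = unsafe_wrap(Array, p, (2, length(v)))
        kernel_barrier(f, w)
    end
end

z = ComplexF64[1 + 2im, 3 + 4im]
total = with_real_view(sum, z)              # 1 + 2 + 3 + 4
with_real_view(w -> (w .*= 2; nothing), z)  # mutate z through the wrap
println(total)  # 10.0
println(z)      # the entries of z are doubled
```

Note that `z` is not touched directly while the wrap is live, and nothing resizes `z` inside the block; both are the caller’s responsibility.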