Why does `reinterpret` cause an extra allocation?

I meant that he, as the audience, seemed to already be aware of the potential pitfalls of using low-level pointers, since he was already using them. He never said that he intended to remove them all if he could not do so efficiently.

My goal is to be able to program productively, which I generally can do much more effectively in Julia, and not have to drop down into C or Rust unless I really must (C++ is right out!).
Julia is frequently billed as a performant language, and my goal lately has been to make that true for some areas that matter to my use cases, such as string handling.

I benchmarked my code extensively before switching to pointers instead of fooling around with indices and lots of @inbounds and reinterprets.
The code ended up much simpler and faster.
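
For illustration only, the contrast looks roughly like the sketch below (made-up function names, not the actual string-handling code; the pointer version assumes GC.@preserve, i.e. 0.7+):

# Count occurrences of a byte: once with indices and @inbounds,
# once with a raw pointer. Purely illustrative.
function count_byte_indexed(v::Vector{UInt8}, b::UInt8)
    n = 0
    @inbounds for i in 1:length(v)
        v[i] == b && (n += 1)
    end
    return n
end

function count_byte_ptr(v::Vector{UInt8}, b::UInt8)
    n = 0
    p = pointer(v)
    GC.@preserve v begin            # keep v rooted while its raw pointer is in use
        for i in 1:length(v)
            unsafe_load(p, i) == b && (n += 1)
        end
    end
    return n
end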

Assembly language is still used in many places where every cycle counts. It’s not for everybody, though, and the cases where it is necessary are thankfully much rarer than they were, say, 10-20 years ago, before compilers started adding escapes that allow access to low-level things like memory barriers and load-locked/store-conditional instructions.

That’s not true for Rust, Julia or Swift. See the following:
Rust: pointer - Rust
Julia: https://docs.julialang.org/en/latest/base/c/#Base.pointer
Swift: UnsafePointer | Apple Developer Documentation
Python: ctypes — A foreign function library for Python — Python 3.12.0 documentation
Go also has pointers, but places more limitations on them (they can be nil, but you cannot do arithmetic on them); they mainly seem to be intended for holding pointers received from other languages.

Javascript does not, but it was designed for running in a sandbox in a browser.

Modern languages generally have the premise that in most cases (not always!), you should use a managed reference, but they pretty much consistently allow you to bypass those restrictions, just as Julia does.
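
Concretely, bypassing the managed reference in Julia can look like this minimal sketch (only pointer, unsafe_load, and unsafe_store! from Base; GC.@preserve assumes 0.7):

v = [1, 2, 3]
p = pointer(v)                  # raw Ptr{Int64} into v's buffer
GC.@preserve v begin            # v must stay rooted while p is in use
    unsafe_store!(p, 42, 1)     # overwrites v[1]; no bounds checks, no safety net
    @show unsafe_load(p, 2)     # reads v[2] -> 2
end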


Sorry for talking to myself here, but I’m just having too much fun with the new reinterpret. I mean, what does it do if one of the types has padding bytes?

Well, the answer is, I think: The same as wrap; it writes whatever lies around in memory (maybe this also depends on the alignment of the stars, eh, fields).

A2 = Matrix{UInt8}(undef, 8, 8);   # 8×8 UInt8 buffer (assumed here; matches the output below)
A2 .= 254;
struct bax
    a::Int32
    b::UInt8
end
Awr = reinterpret(bax, transpose(A2));
Awr[1] = bax(0, 0);
Awr[2] = bax(0, 0);
Awr[3] = bax(0, 1);
for i = 5:6 Awr[i] = bax(i, i); end
Awr[8] = bax(21, 21);
A2
8×8 Array{UInt8,2}:
 0x00  0x00  0x00  0x00  0x00  0x7f  0x00  0x00
 0x00  0x00  0x00  0x00  0x00  0x7f  0x00  0x00
 0x00  0x00  0x00  0x00  0x01  0x7f  0x00  0x00
 0xfe  0xfe  0xfe  0xfe  0xfe  0xfe  0xfe  0xfe
 0x05  0x00  0x00  0x00  0x05  0x00  0x00  0x00
 0x06  0x00  0x00  0x00  0x06  0x00  0x00  0x00
 0xfe  0xfe  0xfe  0xfe  0xfe  0xfe  0xfe  0xfe
 0x15  0x00  0x00  0x00  0x15  0x00  0x00  0x00


# Reinterpret A's buffer in place: wrap its memory as a Vector{T} of length ell.
# No copy is made, and the result is only valid while A is alive and not resized.
function wrap(::Type{T}, A::Array, ell=length(A)) where T
    ptr = convert(Ptr{T}, pointer(A))
    unsafe_wrap(Array, ptr, (ell,))
end


A2 .= 254;
Aw = wrap(bax, A2, 8);

Aw[1] = bax(0, 0);
Aw[2] = bax(0, 0);
Aw[3] = bax(0, 1);
for i = 5:6 Aw[i] = bax(i, i); end
Aw[8] = bax(21, 21)

A2
8×8 Array{UInt8,2}:
 0x00  0x00  0x00  0xfe  0x05  0x06  0xfe  0x15
 0x00  0x00  0x00  0xfe  0x00  0x00  0xfe  0x00
 0x00  0x00  0x00  0xfe  0x00  0x00  0xfe  0x00
 0x00  0x00  0x00  0xfe  0x00  0x00  0xfe  0x00
 0x00  0x00  0x01  0xfe  0x05  0x06  0xfe  0x15
 0x7f  0x7f  0x00  0xfe  0x00  0x00  0xfe  0x7f
 0x00  0x00  0x00  0xfe  0x00  0x00  0xfe  0x00
 0x00  0x00  0x00  0xfe  0x00  0x00  0xfe  0x00

Fun!

(I am not really sure whether disclosing random stack memory content is a good idea when writing into arrays of structs with padding; but I probably would not consider it a bug either [edit: I mean, that’s what C does, so meh]. Reproducing this with the new reinterpret takes dedication, and I am pretty sure I could not do better or even propose a better alternative, except that maybe reinterpret for non-contiguous memory should not exist.)

Great examples @foobar_lv2. It looks like ReinterpretArray is reporting itself as IndexCartesian, even when it’s just a vector. Try those examples again with this:

julia> Base.IndexStyle(::Type{R}) where {R<:Base.ReinterpretArray{<:Any,1}} = Base.IndexLinear()

julia> Base.IndexStyle(w2)
IndexLinear()

julia> @btime sum($w1)
 557.600 μs (0 allocations: 0 bytes)
NaN

julia> @btime copy($w1);
 3.873 ms (2 allocations: 7.63 MiB)

julia> @btime sum($w2)
 563.118 μs (0 allocations: 0 bytes)
NaN

julia> @btime copy($w2);
 2.071 ms (2 allocations: 7.63 MiB)

At least in this case, it looks to be an easy fix.


Does not fix the speed on my machine :frowning:

I’ll rebuild and try more systematically again later.
[edit: still not fixed for me.]

PS about the padding: Now I believe it’s mostly leftover register contents, not stack memory; but whatever happens probably depends a lot on how the compiler feels today. Well, the filling of the padding is UB, so whatever.

After pondering the source of reinterpret-array:

struct bax2
    a::Int8
    b::Int
    c::Int8
end

A4 = [bax2(-1, -1, -1)];
A4r = reinterpret(UInt8, A4);
@show A4r[2];
# A4r[2] = 0xfd
A4r[2] = 2;
@show A4r[2];
# A4r[2] = 0xfd

I guess I’ll open an issue for that one, even though the fact that reinterpret goes via AbstractArray kind of implies this problem (it would be valid, I think, to have a “structure-packed array” where the padding bytes simply don’t exist and hence cannot be reinterpreted; i.e., sizeof is not informative for abstract arrays that don’t guarantee a contiguous memory layout).

edit: Issue opened, https://github.com/JuliaLang/julia/issues/25908

It would be nice if there were also a way to get a reinterpreted view of an array. If you do

reinterpret(T, A[a:b])

then A[a:b] must be allocated, so this is not a view. One also cannot do

reinterpret(T, view(A, a:b))

for reasons that should be obvious (edit: only true in 0.6, see below). Therefore, the only way to get a reinterpreted view of an array in Julia right now is using pointers (unless I’m missing something).
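
For reference, the pointer-based workaround might look something like this sketch (the helper name and signature are made up; the wrapper is only valid while A stays alive and unresized, and nothing enforces the a:b bounds):

function unsafe_reinterpret_view(::Type{T}, A::Array{S}, a::Int, b::Int) where {T,S}
    ptr = convert(Ptr{T}, pointer(A, a))           # pointer to A[a], viewed as T
    n = div((b - a + 1) * sizeof(S), sizeof(T))    # number of T's covering A[a:b]
    unsafe_wrap(Array, ptr, (n,))
end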

julia> A = [1:10;]
10-element Array{Int64,1}:
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10

julia> B = view(A, 5:10)
6-element view(::Array{Int64,1}, 5:10) with eltype Int64:
  5
  6
  7
  8
  9
 10

julia> C = reinterpret(Float64, B)
6-element reinterpret(Float64, view(::Array{Int64,1}, 5:10)):
 2.5e-323
 3.0e-323
 3.5e-323
 4.0e-323
 4.4e-323
 5.0e-323

julia> C[1] = 1.0
1.0

julia> A
10-element Array{Int64,1}:
                   1
                   2
                   3
                   4
 4607182418800017408
                   6
                   7
                   8
                   9
                  10

Cool. This is on 0.7 I take it?

yes

Just compared unsafe_wrap to reinterpret of a view on 0.7…

unsafe_wrap does 29 ns, reinterpret of a view does 20 ns!

I’m very excited because now this means I can definitely write Arrow.jl with full memory safety!

Awesome job Julia devs! :tada:


I’ve had problems where accessing the reinterpreted object is slower than a normal Array. Do you have any benchmarks for that? https://github.com/JuliaLang/julia/issues/25014

Yes, unfortunately you are right. My median time for access to an array wrapped from pointers is 2.6 ns; for a reinterpreted view I get a median time of 10.0 ns. I’ll comment on that thread.
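
For anyone who wants to reproduce that kind of access comparison, a rough sketch along these lines should do (array size and variable names are assumptions here, not the setup behind the numbers above):

using BenchmarkTools

A = rand(Int64, 10^6)
wp = unsafe_wrap(Array, convert(Ptr{Float64}, pointer(A)), (length(A),))   # pointer-wrapped Array
wr = reinterpret(Float64, view(A, 1:length(A)))                            # ReinterpretArray over a view

# A is a global here, so its buffer stays alive while wp is in use
@btime $wp[500];   # access through the plain Array wrapper
@btime $wr[500];   # access through the ReinterpretArray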