Memory aliasing rules and reinterpret

foobar_lv2 · February 7, 2024, 2:14pm

I wanted to ask about the julia aliasing rules for memory accesses. Unfortunately, the docs are not super clear about that, e.g.

Julia currently uses LLVM’s Type Based Alias Analysis.
To find the comments that document the inclusion relationships, look for static MDNode* in src/codegen.cpp

The ultimate reason for my question is the following construction: It often makes sense to view the same chunk of memory as either v::Vector{SVector{N, T}} or as mat::Matrix{T} with size(mat) = (N, length(vec)), such that v[idx] = mat[:, idx].

Base Reinterpret is inappropriate for this – performance simply sucks / is unreliable.

The classic solution is something like

v = ccall(:jl_reshape_array, Vector{SVector{size(mat)[1], eltype(mat)}}, (Any, Any, Any), Vector{SVector{size(mat)[1], eltype(mat)}}, mat, (size(mat)[2],))

Now my question is: Are reads/writes to v and to mat permitted to alias in julia? In theory and in practice?

Is there a programmatic way of finding out whether something is permitted to alias per TBAA?

The next question would be: Is TBAA applied to unsafe_store! / unsafe_load?

If yes, could we maybe have an extra argument to Core.pointerref et al to say that a load/store is intended to have ccall(:memcpy, ...) semantics?

PS. It looks like they are allowed to alias in practice?

julia> mat = rand(Int64, 2, 2)
2×2 Matrix{Int64}:
 6764168577710771209  -7368470990085874290
 7379517068019859522     36090219286728105

julia> v=ccall(:jl_reshape_array, Vector{SVector{size(mat)[1], eltype(mat)}}, (Any, Any, Any), Vector{SVector{size(mat)[1], eltype(mat)}}, mat, (size(mat)[2],))
2-element Vector{SVector{2, Int64}}:
 [6764168577710771209, 7379517068019859522]
 [-7368470990085874290, 36090219286728105]

julia> function baz(v, mat)
       @inbounds v0 = v[1]
       @inbounds mat[1] += 1
       @inbounds v1 = v[1]
       v1==v0
       end
baz (generic function with 1 method)

julia> baz(v,mat)
false

Is this guaranteed behavior that will not change before 2.0 ? Or is this code UB that the compiler is just not smart enough to break?

PS. Likewise,

julia> mat2=rand(2,2)
2×2 Matrix{Float64}:
 0.932218  0.839201
 0.810208  0.952793
julia> x=ccall(:jl_reshape_array, Matrix{Int64}, (Any, Any, Any), Matrix{Int64}, mat2, size(mat2))
2×2 Matrix{Int64}:
 4611380756684956808  4605734070903863065
 4605472925120407489  4606757212508987398

julia> baz(x, mat2)
false

Is this code UB or is it valid and will return false until julia 2.0?

(cc @Keno because something along these lines triggered your ReinterpretArray rewrite years ago)

Keno · February 7, 2024, 6:36pm

No, this is UB, that’s why ReinterpretArray exists.

No, I had a PR to add it at some point, but we never finished that.

foobar_lv2 · February 7, 2024, 7:42pm

Thanks a lot!

Can you confirm again that the first example (alias between Float64 and SVector{N,Float64}) is really UB? I think aliasing would be valid in C/C++, and the manual suggested that julia-tbaa is modelled after C, so that’s why I’m asking.

That doesn’t really answer the question: Can we trust that until julia 2.0, the following will always return false?

julia> function foo(a::Vector{Float64}) 
       x= unsafe_load(convert(Ptr{Int},pointer(a)))
       @inbounds a[1] += 1
       y = unsafe_load(convert(Ptr{Int},pointer(a)))
       x == y
       end
foo (generic function with 2 methods)

julia> foo([0.0, 0.1])
false

Is there a non-silly way of doing pointer-punning?

With special emphasis on the “harmless” case, where both views of the memory are ultimately the same homogeneous primitives but have different interpretations with respect to “this is an array of NTuple{4,Int}” vs “this is an array of Int”.

Due to standards committee institutional dysfunction, C is stuck with the incredibly stupid memcpy idiom that ReinterpretArray uses, but we shouldn’t need to repeat that in julia – for example, we could have a Core.pointerref that takes the desired TBAA / aliasing tags as input.

We could also have a PunnedPtr{TypeForDereferencing, TypeForTBAA}. Well, I think I cannot implement it without a better Core.pointerref?

Keno · February 7, 2024, 7:55pm

Yes, no TBAA on unsafe_load as I said

Yes, that was the never-merged PR I mentioned.

foobar_lv2 · February 7, 2024, 8:28pm

Yes, no TBAA on unsafe_load as I said

Cool, thanks – I misunderstood you on that, and thought that TBAA on unsafe_load was the “not yet merged” PR.

So the canonical way of doing vector-of-svector / matrix punning for performance-relevant code would be e.g. the same as unsafe_reinterpret(mat) = Random.UnsafeView(convert(Ptr{SVector{size(mat, 1), eltype(mat)}}, pointer(mat)), size(mat,2))?

And we may at some time get precise control over aliasing in pointerref / pointerset if your never-merged PR gets in? Would be cool!

If possible also atomics on that line. An array of atomics is currently not so well supported. Could need to be restricted to Matrix to solve concurrency issues with realloc (afaiu current canonical implementation would be pointer-based and use llvmcall to replace the missing atomics support in pointerref/pointerset?)

Another important alias annotation I’d like in that context is “owns the memory / pointers derived from this can only alias with other pointers derived from this”. I.e. maximally simple way to tell the compiler that yes, these two matrices of Float64 do not alias, on pain of UB. The required support would only be extra options to Core.pointerset / Core.pointerref, from that packages could go the way.

Keno · February 7, 2024, 8:31pm

Obviously you need to take care of memory ownership there and also, the lack of TBAA of course implies that you’re pessimizing all other memory accesses in the same function, so some performance care is required.

Mason · February 7, 2024, 8:42pm

By the way @foobar_lv2, I know this isn’t the wider point of the thread (and I also want to thank you for making the thread, because I also have often wanted clarification on this exact question of unsafe_load / unsafe_store), but are you aware of HybridArrays.jl? It’s pretty much perfect for the example use-case you described.

julia> using HybridArrays, StaticArrays

julia> A = HybridArray{Tuple{4, StaticArrays.Dynamic()}, Float64}(undef, 4, 10);

julia> A .= reshape(1:40, 4, 10) # initialize some values
4×10 HybridMatrix{4, StaticArraysCore.Dynamic(), Float64, 2, Matrix{Float64}} with indices SOneTo(4)×Base.OneTo(10):
 1.0  5.0   9.0  13.0  17.0  21.0  25.0  29.0  33.0  37.0
 2.0  6.0  10.0  14.0  18.0  22.0  26.0  30.0  34.0  38.0
 3.0  7.0  11.0  15.0  19.0  23.0  27.0  31.0  35.0  39.0
 4.0  8.0  12.0  16.0  20.0  24.0  28.0  32.0  36.0  40.0

julia> A[:, 1]
4-element SVector{4, Float64} with indices SOneTo(4):
 1.0
 2.0
 3.0
 4.0

julia> @btime sum($A[:, 1])
  2.174 ns (0 allocations: 0 bytes)
10.0

foobar_lv2 · February 7, 2024, 8:56pm

No, I wasn’t. That looks pretty neat!

I’ll try that out and see whether it has good ergonomics for me.

Mason · February 7, 2024, 9:22pm

One other note on this thread, we actually do document that unsafe_load is exempt from TBAA, though the statement is rather oblique and easy to miss:

help?> unsafe_load
[...]
 Unlike C, dereferencing memory region allocated as different type may be
  valid provided that the types are compatible.
[...]

Maybe if someone wants to make a docs PR, this statement could be made a bit more clear, and the typo could be fixed.

Topic		Replies	Views
Getting around the strict aliasing rule in Julia Performance	1	163	June 30, 2025
Why does `reinterpret` cause an extra allocation? General Usage	30	4155	February 24, 2018
Use of `pointer` General Usage	19	6946	February 6, 2018
Big overhead with the new lazy reshape/reinterpret Internals & Design	35	4958	August 18, 2018
FAQ: ReinterpretArray vs unsafe_wrap General Usage	2	1729	September 4, 2018

Memory aliasing rules and reinterpret

Related topics