What am I doing wrong with `unsafe_wrap()`?

Trying to write some test code with unsafe_wrap(), and getting unexpected garbage results.

julia> function f()
          r = Ref(tuple(5.0, 2.0, 1.0, 6.0))
          p = Base.unsafe_convert(Ptr{Float64}, r)
          u = unsafe_wrap(Array, p, 4)
          display(u)
          return nothing
       end
f (generic function with 1 method)

julia> f()
4-element Vector{Float64}:
 1.20926195e-315
 6.341044e-317
 1.4669e-319
 2.12443504e-316

Why is it not printing an array of [5.0, 2.0, 1.0, 6.0]?

You didn’t root r here, so it likely got garbage collected before the display call. You need to wrap all code which references r by pointer with GC.@preserve:

julia> function f()
          r = Ref(tuple(5.0, 2.0, 1.0, 6.0))
          GC.@preserve r begin
              p = Base.unsafe_convert(Ptr{Float64}, r)
              u = unsafe_wrap(Array, p, 4)
              display(u)
          end
          return nothing
       end
f (generic function with 1 method)

julia> f()
4-element Vector{Float64}:
 5.0
 2.0
 1.0
 6.0

Also note that you can’t leak u from this function without also ensuring r is rooted somehow. I’d recommend checking out the FFI section of the manual (“Calling C and Fortran Code”):

https://docs.julialang.org/en/v1/manual/calling-c-and-fortran-code/
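
One way to respect the "don't leak u" point is to copy the data out before leaving the GC.@preserve block, so that the returned array owns its own memory. This is only a minimal sketch built on the function from the question; the name f_copy is hypothetical and not from the thread:

```julia
# Sketch: return an independent copy instead of the unsafe wrapper itself.
function f_copy()
    r = Ref(tuple(5.0, 2.0, 1.0, 6.0))
    # GC.@preserve evaluates to the value of its body, so the copy is
    # made while r is still guaranteed to be alive.
    v = GC.@preserve r begin
        p = Base.unsafe_convert(Ptr{Float64}, r)
        u = unsafe_wrap(Array, p, 4)
        copy(u)  # a regular Vector{Float64} that no longer aliases r
    end
    return v
end

f_copy() == [5.0, 2.0, 1.0, 6.0]  # true
```

The copy costs an allocation, but the result can then be returned and stored freely, because it does not depend on r staying alive.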

Similarly, you could just return r.

julia> function f()
          r = Ref(tuple(5.0, 2.0, 1.0, 6.0))
          p = Base.unsafe_convert(Ptr{Float64}, r)
          u = unsafe_wrap(Array, p, 4)
          display(u)
          return r
       end

julia> f();
4-element Vector{Float64}:
 5.0
 2.0
 1.0
 6.0

It’s still recommended to use GC.@preserve here. With further compiler optimizations in the very near future, these kinds of allocations could still be eliminated if r doesn’t escape from the caller, for example if f is called from another function.

I don’t think it works even with @preserve. Outside the function, Julia doesn’t know it needs to preserve r in order to keep u valid, because u doesn’t hold a reference to r.

You need to finish dealing with both u and r within the same @preserve block, which means you can’t return u and expect the underlying memory to stay around.

Otherwise we wouldn’t have struggled with https://github.com/JuliaLang/julia/issues/42227

Side comment: unfortunately, this is one of the very rare occasions where the task is trivial in C/C++ (“just unsafe cast the pointer, trust me”) but impossible in Julia, because we can’t manually mark or unmark a new root object for the GC.

Essentially it’s impossible to write a function:

f(T, arr::Vector{W}) where {T,W}

that takes an arr, “unsafe wraps” it into a Vector of a different element type, and returns a clean, built-in Vector{T}.

You need to make something like:

struct MyVector{T, W}
    newarr::Vector{T}
    ref::Ref{Vector{W}}
end

Now, I just realized that in this case the OP has a Ref{Tuple}. Maybe that makes things different because Tuple is immutable, but it seems like the GC still cares.

GC.@preserve is required. Consider

function g()
    f()
    nothing
end

When f is inlined, r is not returned anymore and Julia and LLVM are not required to preserve r.

Besides, unless own = true is passed, the programmer is responsible for keeping the memory behind the pointer p alive. GC.@preserve is virtually the only way to do it if the memory region is allocated as a Julia object. But I think most of the confusion is coming from the lack of documentation. We need to have better documentation on what is allowed (which may actually need to start from discussing what should be allowed).
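
For contrast, here is a sketch of the own = true case, where the memory does not belong to a Julia object at all. It assumes the buffer comes from Libc.malloc, so that Julia can later release it with free; the helper name malloc_vector is hypothetical:

```julia
# Sketch: wrap malloc'd memory and hand ownership to Julia.
function malloc_vector(n::Integer)
    p = convert(Ptr{Float64}, Libc.malloc(n * sizeof(Float64)))
    p == C_NULL && throw(OutOfMemoryError())
    # With own = true, Julia frees the buffer once the array is unreachable,
    # so the array can be returned without any GC.@preserve.
    return unsafe_wrap(Array, p, n; own = true)
end

a = malloc_vector(4)
a .= [5.0, 2.0, 1.0, 6.0]   # the memory starts out uninitialized
```

This does not help with the original question, where the memory behind the pointer is a Julia-allocated Ref; there the wrapper must stay inside GC.@preserve.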

(There’s also the question of whether this kind of type punning is actually OK to do. The compiler can, in principle, start using Julia-level type information more aggressively to reason about aliasing.)

It may be straightforward to do type punning in C, but I wouldn’t call it trivial in C++, given that there are hours-long talks on how to do it properly (e.g., Timur Doumler’s CppCon 2019 talk “Type punning in modern C++”). Like C++, Julia needs to provide some constructs for doing this in a type-system-compatible manner. C++ got std::bit_cast only in C++20, so it’s not like we are far behind. ReinterpretArray essentially does what C++ does, but there’s no public API for doing this properly.
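
For whole arrays of isbits elements, the exported reinterpret function already returns such a ReinterpretArray view, which keeps its parent array alive on its own. A small sketch of that array-level case (it is not the scalar, bit_cast-style construct discussed above):

```julia
a = [5.0, 2.0, 1.0, 6.0]

# A view over the same memory as `a`; nothing is copied.
bytes = reinterpret(UInt8, a)
length(bytes) == 4 * sizeof(Float64)   # true

# Reinterpreting back gives the original values.
reinterpret(Float64, bytes) == a       # true

# Writes through the view are visible in the parent array.
bits = reinterpret(UInt64, a)
bits[1] = reinterpret(UInt64, 2.0)
a[1] == 2.0                            # true
```

The result is an AbstractVector rather than a built-in Vector{T}, which is the limitation raised earlier in the thread.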

I would think that, at least for a “plain” vector of an isbits type, it’s trivial in C or C++ to cast to another type over the same set of underlying bytes. That is really the most useful case, because it’s probably some kind of I/O or binary data format processing task.

It’s true that if you stray outside Julia’s model of ownership things do get messy and manual.

With ResourceContexts.jl I tried to tackle some of these problems. For example we have the following entertaining example of returning a bare Ptr to a temporary buffer, at the cost of using the @! macro:

@! function raw_buffer(len)
    buf = Vector{UInt8}(undef, len)
    @defer GC.@preserve buf nothing
    pointer(buf)
end

@context begin
    len = 1_000_000_000
    ptr = @! raw_buffer(len)
    GC.gc() # `buf` is preserved regardless of this call to gc()
    unsafe_store!(ptr, 0xff)
end

Those macros are ugly to be sure! I hope one day we can have some more official and builtin variant of this (or other equivalent) kind of thing which works with the language’s natural scoping rules.

It’s not really that simple. Type punning violates the C++ strict aliasing rules, and compilers will be happy to interpret your code in a way you didn’t intend. See, for example, the Stack Overflow question “What is the strict aliasing rule?”.

Casting between different pointer types is, in general, UB, even in C, IIUC. But you can always cast to bytes (char*). So, the basic strategy to reinterpret T as S is to cast their pointers to pointers to bytes, memcpy the bytes, and load S. I’d imagine this is why std::bit_cast returns a value and not a reference (to help TBAA). That’s the strategy I followed in https://github.com/JuliaLang/julia/pull/43065
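
The "cast to bytes, memcpy, load" strategy can be sketched in Julia roughly as follows. This is only an illustration under the assumption that both types are isbits and have the same size; the name bitcast_by_copy is hypothetical and this is not the implementation from the linked PR:

```julia
# Sketch: reinterpret a value of type T as type S by copying its bytes.
function bitcast_by_copy(::Type{S}, x::T) where {S,T}
    (isbitstype(S) && isbitstype(T)) ||
        throw(ArgumentError("both types must be isbits"))
    sizeof(S) == sizeof(T) ||
        throw(ArgumentError("sizes of $S and $T must match"))
    src = Ref{T}(x)
    dst = Ref{S}()   # uninitialized storage for the result
    # Both Refs must stay rooted while raw pointers into them are in use.
    GC.@preserve src dst begin
        psrc = convert(Ptr{UInt8}, Base.unsafe_convert(Ptr{T}, src))
        pdst = convert(Ptr{UInt8}, Base.unsafe_convert(Ptr{S}, dst))
        unsafe_copyto!(pdst, psrc, sizeof(T))   # byte-wise copy
    end
    return dst[]     # load the result as a value, not a reference
end

bitcast_by_copy(UInt64, 1.0) == reinterpret(UInt64, 1.0)  # true
```

Returning a value rather than a pointer or view mirrors the std::bit_cast design mentioned above: no two differently typed references to the same memory ever escape.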