Slowdown with reinterpret

I hope I can be forgiven for being surprised at the stark difference between these implementations.

using BenchmarkTools
function myfunc1(n::Int)
  x = Vector{Int}(undef, 3*n)
  for i in eachindex(x)
    x[i] = i
  end 
  return x
end 

function myfunc2(n::Int)
  xdata = Vector{Tuple{Int,Int,Int}}(undef, n)
  x = reinterpret(Int, xdata)
  for i in eachindex(x)
    x[i] = i
  end 
  return xdata
end 

n = 1_000_000
@btime myfunc1($n);
@btime myfunc2($n);

Results

  3.068 ms (2 allocations: 22.89 MiB) # myfunc1
  16.018 ms (2 allocations: 22.89 MiB) # myfunc2 

Is there any way to reinterpret a vector of tuples as a linear array without such a slowdown? Is this a bug somewhere?
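One workaround that avoids `reinterpret` entirely is to write the tuples directly, computing the three flat indices per element. This is a sketch under the assumption that you control the fill loop (the name `myfunc_tuples` is my own); it returns the same `Vector{Tuple{Int,Int,Int}}` as `myfunc2` and should hit the fast `Vector` path:

    function myfunc_tuples(n::Int)
        xdata = Vector{Tuple{Int,Int,Int}}(undef, n)
        for i in 1:n
            # element i covers flat indices 3i-2, 3i-1, 3i
            xdata[i] = (3i - 2, 3i - 1, 3i)
        end
        return xdata
    end

Of course this only helps when the fill logic can be phrased per tuple rather than per flat index.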

Julia Version 1.8.3
Commit 0434deb161e (2022-11-14 20:14 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin21.4.0)
  CPU: 16 × Intel(R) Xeon(R) W-2140B CPU @ 3.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, skylake-avx512)
  Threads: 1 on 8 virtual cores
Environment:
  JULIA_EDITOR = code
  JULIA_NUM_THREADS = 

I think this slowdown might be resolved by

but that PR hasn’t seen a lot of attention in a while

This package may be able to address your case:


You can cast the memory region. However, note that this is somewhat unsafe, and you need to ensure that Julia's GC does not deallocate xdata while the wrapped array is in use:

function myfunc3(n::Int)
    xdata = Vector{Tuple{Int,Int,Int}}(undef, n)
    GC.@preserve xdata begin
        p = Base.unsafe_convert(Ptr{Int}, xdata)
        x = unsafe_wrap(Array, p, 3 * n)

        for i in eachindex(x)
          x[i] = i
        end
    end
    return xdata
end
julia> @btime myfunc1($n);
  2.484 ms (2 allocations: 22.89 MiB)

julia> @btime myfunc2($n);
  16.915 ms (2 allocations: 22.89 MiB)

julia> @btime myfunc3($n);
  2.487 ms (3 allocations: 22.89 MiB)
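If you use this pattern in more than one place, it may be worth factoring the unsafe part into a small helper so the `GC.@preserve` scope is enforced in one spot. This is a hypothetical helper (the name `with_flat_view` is my own); the callback must not let the flat view escape the preserve block:

    # Apply f to a flat Int view of a tuple vector. The view is only
    # valid inside the GC.@preserve block, so f must not store it.
    function with_flat_view(f, xdata::Vector{Tuple{Int,Int,Int}})
        GC.@preserve xdata begin
            p = Base.unsafe_convert(Ptr{Int}, xdata)
            x = unsafe_wrap(Array, p, 3 * length(xdata))
            f(x)
        end
        return xdata
    end

    # Usage: fill in place without reinterpret's overhead
    xdata = Vector{Tuple{Int,Int,Int}}(undef, 4)
    with_flat_view(x -> x .= eachindex(x), xdata)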

Thanks! I was hoping to avoid the unsafe conversion, but that seems like a pragmatic way to proceed, especially with the @preserve.
