And finally: isn’t the very poor performance of reinterpreted arrays on Julia 1.8 strange when bounds checking is enabled?
I think that as long as you keep the reference to the original Vector, you are safe to use both: the GC will not free the memory, precisely because you keep the original reference.
This is truly unsafe, and it gives the best performance because it gives you a native array:
julia> unsafe_arraycast(Float64, rand(UInt8, 64))
8-element Vector{Float64}:
-6.079564859434036e242
2.8652427427119243e252
-7.940145865008032e-108
8.320185091615792e-8
5.427223515701773e188
9.184586067914578e-204
2.342950753478369e-31
2.653381005394009e266
Nope, jl_reshape_array handles that.
I don’t think so, since the reference to the memory is the same. You can check it with:
julia> function unsafe_arraycast(::Type{D}, ary::Vector{S}) where {S, D}
           l = sizeof(S)*length(ary)÷sizeof(D)
           res = ccall(:jl_reshape_array, Vector{D}, (Any, Any, Any), Vector{D}, ary, (l,))
           return res
       end
unsafe_arraycast (generic function with 1 method)
julia> A = zeros(UInt8, 10 * sizeof(Float64));
julia> B = unsafe_arraycast(Float64, A);
julia> pointer(A)
Ptr{UInt8} @0x00007f744d5eba28
julia> pointer(B)
Ptr{Float64} @0x00007f744d5eba28
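For comparison, here is a minimal sketch (not taken from any post in this thread) of the same sharing check using the safe `reinterpret` from Base instead of the `ccall`. The names `A` and `B` mirror the example above; `shared` and `roundtrip` are illustrative variables only. It shows that the wrapper aliases the original bytes, so a write through either array is visible through the other.

```julia
A = zeros(UInt8, 10 * sizeof(Float64))
B = reinterpret(Float64, A)        # ReinterpretArray wrapper around A, no copy

B[1] = 1.0                         # write through the Float64 wrapper
shared = any(!iszero, A[1:8])      # the first 8 bytes of A are no longer all zero

A[1:8] .= 0x00                     # write through the byte vector instead
roundtrip = B[1] == 0.0            # the wrapper sees the change
```

Unlike the `jl_reshape_array` version, `B` here is not a plain `Vector{Float64}`, which is where the indexing overhead discussed in this thread comes from.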
That’s really nice - thanks for the suggestion
Why do you label it unsafe then?
Because it is super unsafe, and the Julia devs are strongly against even having this as an unsafe_* function in Base.
Note that this doesn’t work before 1.7 and is likely to break again in the future when jl_reshape_array changes.
In what sense is it “super unsafe”? Is it because of something in Julia’s internals? Memory aliasing with different types?
Weird, I’m testing on v1.7 and it seems to work just fine.
“Before 1.7” means it doesn’t work on 1.6.
Sorry, misread.
Okey-dokey.
Basically, Julia “knows” that two different arrays contain unrelated content, so it might optimize by removing any code that doesn’t meet that expectation. It will only do so in rare cases, though, usually once you start to rely on it in a larger application, or when you someday try to use the same code on a new computer. The penalty you see with ReinterpretArray (which does the same thing, but disables this optimization assumption) is exactly why we don’t make this the default.
thank you. My application is pretty large - this is a tiny but very important component.
So whatever approach I use, if I keep the original reference but NEVER ACCESS IT, then I’ll definitely be fine?
You might be sometimes, other times you might not. The original UInt8 array might not have the alignment required for the processor to load the values correctly, or the application might just sometimes suffer significant performance penalties on many processors.
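One way to look at the alignment concern is to inspect the address of the buffer directly. The sketch below is an illustration only: `is_aligned` is a hypothetical helper, and using `sizeof(T)` as the required alignment is a conservative assumption that is reasonable for primitive types such as `Float64` but is not something this thread states. It also shows the reverse construction, where the storage is allocated as `Float64` and viewed as bytes when needed.

```julia
# Hypothetical helper: does the address `p` fall on a multiple of sizeof(T)?
# (Assumption: sizeof(T) is used as a conservative alignment requirement.)
is_aligned(::Type{T}, p::Ptr) where {T} = UInt(p) % sizeof(T) == 0

# Allocate the storage as Float64, so the allocator aligns it for Float64,
# and reinterpret "down" to bytes when untyped access is needed.
buf   = Vector{Float64}(undef, 8)
bytes = reinterpret(UInt8, buf)            # 64 bytes sharing memory with buf

aligned_start  = is_aligned(Float64, pointer(buf))
aligned_offset = is_aligned(Float64, pointer(buf) + 1)   # one byte further in
```

Pointer arithmetic on a `Ptr` is in bytes, so `pointer(buf) + 1` is an address one byte past the start, which cannot be 8-byte aligned if the start is.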
Ok, I finally understand the problem. And is there no way to guarantee that the UInt8 array is contiguous in memory?
That is what declaring it Float64 is for?
Concretely, the way I want to use this is to have temporary arrays for cases where I can predict their eltype at evaluation time, but not at the time I’m constructing the object that holds the temporary array. So, very crudely, I would do something like this:
mutable struct A
    tmp::Vector{UInt8}
    # other stuff
end
function evaluate(a::A, x::TX) where {TX}
    T = predict_eltype(TX)
    # resize a.tmp if necessary, or replace with larger array
    tmp = reinterpret(T, a.tmp)
    # ... do some computations in tmp
    # return output
end
But then I noticed that the reinterpret sometimes causes a significant slow-down.
I should also acknowledge that the speedup I’m getting by not allocating is not at all clear: sometimes I seem to gain quite a bit, sometimes nothing. Maybe it depends on whether the GC can figure out that I want to reuse my arrays and keeps them around? (Basically that’s the functionality I’m trying to re-implement here…)
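A runnable version of the crude sketch above might look like the following. Everything beyond the original outline is an assumption made for illustration: the struct is renamed `Scratch`, `predict_eltype` is a hypothetical stand-in that simply calls `float`, and `kernel!` is a made-up computation. The work on the reinterpreted buffer is moved behind a function barrier, which is a general Julia technique for keeping the inner loop type-stable; whether it removes the slow-down described here is not something this sketch demonstrates.

```julia
mutable struct Scratch
    tmp::Vector{UInt8}             # untyped byte storage, reused across calls
end
Scratch() = Scratch(UInt8[])

# Hypothetical stand-in for the poster's predict_eltype.
predict_eltype(::Type{TX}) where {TX} = float(TX)

# Made-up computation; called through a function barrier so it is
# compiled for the concrete type of `tmp`.
function kernel!(tmp, x)
    tmp .= 2 .* x
    return sum(tmp)
end

function evaluate(s::Scratch, x::AbstractVector{TX}) where {TX}
    T = predict_eltype(TX)
    nbytes = length(x) * sizeof(T)
    # Grow the byte buffer only when it is too small.
    length(s.tmp) < nbytes && resize!(s.tmp, nbytes)
    # Typed, non-copying wrapper over the first nbytes bytes.
    tmp = reinterpret(T, view(s.tmp, 1:nbytes))
    return kernel!(tmp, x)
end
```

Usage: `s = Scratch(); evaluate(s, [1, 2, 3])` grows `s.tmp` to 24 bytes on the first call, and later calls with inputs that need no more space reuse the same buffer.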
Note that this will improve considerably with JuliaLang/julia Pull Request #44186 by N5N3, “Make `StridedReinterpretArray`'s `get/setindex` pointer based”.
nightly:
julia> print(" Array: "); @btime cheb!($A, $x)
Array: 225.605 ns (0 allocations: 0 bytes)
julia> print("ReinterpretArray: "); @btime cheb!($B, $x)
ReinterpretArray: 2.839 μs (0 allocations: 0 bytes)
julia> VERSION
v"1.9.0-DEV.411"
This PR:
julia> print(" Array: "); @btime cheb!($A, $x)
Array: 226.522 ns (0 allocations: 0 bytes)
julia> print("ReinterpretArray: "); @btime cheb!($B, $x)
ReinterpretArray: 458.822 ns (0 allocations: 0 bytes)
julia> VERSION
v"1.8.0-DEV.1561"
Wish this PR would get reviewed