Reinterpret as a complex struct

Hi! There is a dataset loaded by a third-party library in the form of a raw float array. I want to view this data as a vector of structs A, each with a layout like this: <F32><F32><F32>(B*8*60*3*4)<F32>, i.e. three floats, then an 8*60*3*4 array of structs B, and then another float. Struct B is simply three floats.
However, I cannot find a reasonable way to do this using reinterpret or otherwise. reinterpret requires a bits type, so the inner arrays would have to be something like StaticArrays. However, StaticArrays performs poorly for arrays that large, especially at compile time.
Any other way to view a raw buffer array as an array of large nested structs?

I am not sure I understand the specs, but what about

data = Float64.(1:(3 + 3*8*60*3*4 + 1))

struct B{T}
    a::T
    b::T
    c::T
end

header = data[1:3]
Bs = reinterpret(B{Float64}, @view data[4:(end - 1)])
footer = data[end]

That said, a static array type instead of B{Float64} should be fine too.

I mean something more along these lines:

struct B
  a::Float32
  b::Float32
  c::Float32
end

struct A
  a::Float32
  b::Float32
  c::Float32
  d::StaticArray{Tuple{3,8,60,3,4}, B}
  e::Float32
end

reinterpret(A, data)

This basically works, but many operations are slow due to the large staticarray.

I don’t understand the motivation for using static arrays like this; it is known to be suboptimal at this size.

Sure, this was just to explain what I need. Replacing StaticArray with Array does not work here, because reinterpret requires bits type.

Is there a reason you can’t do what I suggested above?

I would really like to use the resulting array x as x[i].a, x[i].d[a, b, c, d, e] and so on. As I understand, your code handles a single element of this (single A struct), and cannot be applied to the whole array of them.

I don’t think what you’re asking for is possible without gigantic static arrays (or an equivalent giant bitstype) because the memory layout of the natural non-bitstype julia structs would simply be different than the layout of your vector.
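To make the bits-type restriction concrete, here is a minimal sketch. WithArray is a hypothetical struct introduced only for illustration; it stores its inner data in an ordinary Vector, which lives behind a reference rather than inline.

```julia
struct B
    a::Float32
    b::Float32
    c::Float32
end

# Hypothetical struct with an ordinary Array field (illustration only)
struct WithArray
    a::Float32
    d::Vector{B}   # stored as a reference, not inline in the struct
end

isbitstype(B)          # true: three Float32 stored inline
sizeof(B)              # 12
isbitstype(WithArray)  # false: so reinterpret cannot target it

# reinterpret works for the plain bits type
raw = Float32[1, 2, 3, 4, 5, 6]
bs = reinterpret(B, raw)
bs[2].a                # 4.0f0
```

Because the Vector field is only a reference, the struct's memory does not match the flat float buffer, which is why only the fully inline (static) variant can be reinterpreted directly.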

But there may still be a way to get what you want. If your goal is just to avoid expensive copying, then perhaps you can construct each d field from a reinterpret(B, view(data, i:j)), where i:j is the index range of that A’s inner data. You’d still need to copy the extra floats in each A, but that should be very cheap.
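A rough sketch of that idea, assuming the layout from the original question (three floats, 8*60*3*4 B structs, one float). The names AView, aview, NB and STRIDE are made up for this example and are not from any library.

```julia
struct B
    a::Float32
    b::Float32
    c::Float32
end

# Hypothetical wrapper: scalar fields are copies, d is a view into the buffer
struct AView{D<:AbstractArray{B,4}}
    a::Float32
    b::Float32
    c::Float32
    d::D
    e::Float32
end

const NB = 8 * 60 * 3 * 4        # number of B structs per record
const STRIDE = 3 + 3 * NB + 1    # number of Float32 values per record

function aview(data::Vector{Float32}, k::Integer)
    off = (k - 1) * STRIDE
    # View the inner floats of record k as B structs without copying
    bs = reinterpret(B, view(data, off + 4 : off + 3 + 3 * NB))
    d = reshape(bs, 8, 60, 3, 4)
    return AView(data[off + 1], data[off + 2], data[off + 3], d, data[off + STRIDE])
end

data = rand(Float32, 2 * STRIDE)
x = [aview(data, k) for k in 1:2]

x[2].d[1, 1, 1, 1].a == data[STRIDE + 4]   # d reads straight from the buffer
x[1].d[1, 1, 1, 1] = B(1, 2, 3)            # writes through to data[4:6]
```

Writes to d go through to the buffer, but a, b, c and e are plain copies, so changes to them are not reflected in data.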


Thanks! This is probably the closest possible solution. However, copying the floats has the effect that when they are modified by assigning to a field in the resulting array (of mutable structs), the changes do not propagate to the original data buffer. Currently this is good enough for me, but I’m still interested in whether someone can think of a way to make a true view of a raw buffer array as an array of complex structs like in this example.

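The most realistic variant is something like the following (here d is treated as an 8×60×3×4 array of Float32, so each record is 4 + 5760 floats):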

julia> struct arrPtr<:AbstractArray{Float32, 4}
       ptr::Ptr{Float32}
       end

julia> Base.size(::arrPtr) = (8, 60, 3, 4)
julia> Base.getindex(a::arrPtr, i::Int) = unsafe_load(a.ptr, i)
julia> Base.setindex!(a::arrPtr, x, i::Int) = unsafe_store!(a.ptr, x, i)
julia> Base.IndexStyle(::Type{arrPtr}) = Base.IndexLinear()

julia> struct Bptr
       ptr::Ptr{Float32}
       end

julia> function Base.getproperty(b::Bptr, s::Symbol)
       ptr = getfield(b, 1)
       if s == :a 
         return unsafe_load(ptr, 1)
       elseif s==:b
         return unsafe_load(ptr, 2)
       elseif s==:c
         return unsafe_load(ptr, 3)
       elseif s==:d
         return arrPtr(ptr + 12)
       elseif s==:e
         # e sits after a, b, c and the 5760 elements of d (1-based index)
         return unsafe_load(ptr, 4 + 5760)
       else
         error()
       end
       end

julia> function Base.setproperty!(b::Bptr, s::Symbol, v::Float32)
       ptr = getfield(b, 1)
       if s == :a
         return unsafe_store!(ptr, v, 1)
       elseif s==:b
         return unsafe_store!(ptr, v, 2)
       elseif s==:c
         return unsafe_store!(ptr, v, 3)
       elseif s==:e
         return unsafe_store!(ptr, v, 4 + 5760)
       else
         error()
       end
       end

julia> struct asBptr<:AbstractVector{Bptr}
       ptr::Ptr{Float32}
       len::Int
       keepalive::Any
       end
julia> asBptr(arr::Array{Float32}) = asBptr(pointer(arr), div(sizeof(arr), 16+4*5760), arr)
julia> Base.size(buf::asBptr) = (buf.len,)
julia> Base.getindex(buf::asBptr, i::Int) = Bptr(buf.ptr + (16 + 4*5760)*(i-1))
julia> Base.IndexStyle(::Type{asBptr}) = Base.IndexLinear()

This can then be used like this:

julia> ab = asBptr(arr)
43-element asBptr:
 Bptr(Ptr{Float32} @0x00007f3acb016040)
 Bptr(Ptr{Float32} @0x00007f3acb01ba50)
...
julia> ab[14].a
0.4085027f0
julia> ab[14].d[:, 2, 2, 2]
8-element Array{Float32,1}:
 0.15881789
 0.41884196
 0.07280159
 0.8976873 
 0.97513235
 0.2145493 
 0.5796931 
 0.08459842

julia> ab[14].d[:, 2, 2, 2].=0;

This kind of approach is effectively the only option if you are dealing with a memory-mapped file and mutations need to be visible to other processes.

Note that there are no bounds checks. asBptr keeps the underlying storage alive, but Bptr and arrPtr don’t (so make sure that the garbage collector does not steal your storage away!).
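One way to guard against that when working with raw pointers is GC.@preserve, which keeps the named object alive for the duration of the block. A minimal sketch, where arr is just a stand-in for the real buffer:

```julia
arr = rand(Float32, 10)   # stand-in for the real data buffer

s = GC.@preserve arr begin
    p = pointer(arr)
    # arr cannot be collected while we read through the raw pointer
    unsafe_load(p, 1) + unsafe_load(p, 2)
end
```

The pointer itself should not escape the block; only the values loaded through it are safe to use afterwards.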


However, this is not type stable, right? The if block in getproperty returns different types depending on the value of s that is passed.

If a function like that getproperty method above is called with a constant (as is the case when you do foo.bar), the compiler can propagate that constant through the function and figure out the return type, even in cases that look type-unstable. For example:

julia> function looks_type_unstable(s::Symbol)
         if s == :a
           1.0
         else
           "hello"
         end
       end
looks_type_unstable (generic function with 1 method)

julia> function passes_a_constant()
         looks_type_unstable(:a)
       end
passes_a_constant (generic function with 1 method)

julia> @code_warntype(passes_a_constant())
Body::Float64
1 ─     return 1.0