Is this bitcast function sane and safe?

I would like to take the bits of one struct and reinterpret them as the bits of another type. Semantically the following does the desired job:

function check_bitcast(T, s)
    S = typeof(s)
    isbitstype(T) || throw(ArgumentError("Can only cast into bitstype."))
    isbitstype(S) || throw(ArgumentError("Can only cast from bitstype."))
    sizeof(T) == sizeof(S) || throw(ArgumentError("Can only cast between types of equal size."))
end

function bitcast_slow(T, s)
    check_bitcast(T, s)
    arr = reinterpret(T, [s])
    first(arr)
end

However, it is too slow (see below). The following seems to achieve the same result, but faster:

function unsafe_bitcast(::Type{T}, s::S) where {T, S}
    rt = Ref{T}()
    rs = Ref{S}(s)
    GC.@preserve rt rs begin
        pt = Ptr{UInt8}(Base.unsafe_convert(Ref{T}, rt))
        ps = Ptr{UInt8}(Base.unsafe_convert(Ref{S}, rs))
        Base._memcpy!(pt, ps, sizeof(T))
    end
    return rt[] 
end

function bitcast(::Type{T}, s::S) where {T, S}
    check_bitcast(T, s)
    unsafe_bitcast(T, s)
end

struct F64; value::Float64; end
struct U8; value::UInt8; end

using BenchmarkTools
t = ntuple(U8, 8)
@assert bitcast(F64, t) === bitcast_slow(F64, t)

@btime bitcast($F64, $t) #   1.297 ns (0 allocations: 0 bytes)
@btime bitcast_slow($F64, $t)  #   32.021 ns (2 allocations: 128 bytes)

Is the fast implementation sane + correct + safe?

Why not just do

x = 3.14159
b = reinterpret(UInt64, x)
b % UInt8 # first (least significant) byte

rather than messing around with pointers?

What are you trying to accomplish here by a bitcast? If you just want to write/read raw bytes to/from a stream, you can use write and read, for example.
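
As a minimal sketch of the write/read route (an IOBuffer stands in for a real file here, which is an assumption made only for illustration), a primitive value round-trips through a stream without any pointer handling:

```julia
# Round-trip a Float64 through a byte stream.
io = IOBuffer()
write(io, 3.14159)       # writes the 8 raw bytes of the Float64
seekstart(io)            # rewind to the beginning of the stream
x = read(io, Float64)    # reads 8 bytes back as a Float64
```

The same calls work on a file opened with open, since both are IO objects.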

Because this works only for a few built-in types. My main interest is converting NTuple{N,UInt8} into a custom struct.

struct F64; value::Float64;end
reinterpret(F64, 1)
# throws bitcast: target type not a leaf primitive type

What I really want is to mmap a file that contains nested structs in a “packed” memory layout. I would like to mirror that layout with a Julia struct that holds a private tuple of UInt8 bytes and decodes fields on demand by overloading getproperty.
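
To make the idea concrete, here is a rough sketch of such a wrapper. The type PackedParticle, the helper load_le, and the field layout (a Float32 energy at byte offset 0 and a Float32 weight at byte offset 4, both little-endian) are hypothetical and chosen only for illustration; the real layout comes from the file header. This variant assembles the bits with shifts instead of pointers:

```julia
# Hypothetical packed record: 8 raw bytes, decoded lazily per field.
struct PackedParticle
    bytes::NTuple{8,UInt8}
end

# Assemble an unsigned integer from little-endian bytes starting at a
# zero-based byte offset. No pointers are involved.
function load_le(::Type{U}, bytes::NTuple{N,UInt8}, offset::Int) where {U<:Unsigned,N}
    x = zero(U)
    for i in 1:sizeof(U)
        x |= U(bytes[offset + i]) << (8 * (i - 1))
    end
    return x
end

# Only the requested field is decoded; the layout below is an assumption.
function Base.getproperty(p::PackedParticle, name::Symbol)
    bytes = getfield(p, :bytes)
    if name === :energy
        return reinterpret(Float32, load_le(UInt32, bytes, 0))
    elseif name === :weight
        return reinterpret(Float32, load_le(UInt32, bytes, 4))
    else
        return getfield(p, name)
    end
end

# 1.5f0 is 0x3fc00000 and 2.0f0 is 0x40000000, stored little-endian.
p = PackedParticle((0x00, 0x00, 0xc0, 0x3f, 0x00, 0x00, 0x00, 0x40))
```

Accessing p.energy touches only the first four bytes, and p.weight only the last four.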

Why not just read directly into corresponding struct types? Why use an NTuple{N,UInt8} at all?

Yes, I implemented that a long time ago, and it is what I currently use. However, I often don’t care about the full struct; I just want to compute statistics over one or two fields. In that case it is a waste to “unpack” the full struct. With the mmap approach I only pay for the fields that I actually use. With a slightly different format I got a 2x speedup by doing this.

The same is possible for ordinary file I/O: look up seek and skip.
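
For instance, a sketch along these lines reads one field per record and skips the rest. The record layout (a Float32 energy followed by a Float64 position) is made up for the example, and an IOBuffer stands in for the file:

```julia
# Build a fake "file" of 3 records: Float32 energy, then Float64 x.
io = IOBuffer()
for i in 1:3
    write(io, Float32(i))     # energy field (4 bytes)
    write(io, Float64(10i))   # x field (8 bytes)
end
seekstart(io)

# Read only the energies; skip over the x field of every record.
energies = Float32[]
while !eof(io)
    push!(energies, read(io, Float32))
    skip(io, sizeof(Float64))
end
```

Only 4 of the 12 bytes in each record are ever decoded.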

Sure, it is possible, and I thought about doing this. But I think reading whole structs would give a much nicer high-level API. Here is what I have in mind (the structs are particles and the file format is the IAEA/EGS phase space format):

using PhaseSpaceIO, Transducers, OnlineStats

particles = load("huge.IAEAphsp")
xf = Filter(iselectron) |> Map(energy) |> Take(10^7)
estimate!(Histogram(), xf, particles)

You can provide whatever API you want on top of a seekable file io stream just as you could with an mmap array; in either case you would be wrapping it in some object with accessor functions. In your case, it seems like a stream object may be more convenient since you need to read heterogeneous types?
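
A minimal sketch of what such a wrapper could look like, assuming fixed-size records. The names RecordStream and energy and the 12-byte layout are hypothetical, and an IOBuffer stands in for the file:

```julia
# Hypothetical wrapper around any seekable stream of fixed-size records.
struct RecordStream{S<:IO}
    io::S
    record_size::Int     # bytes per record
    energy_offset::Int   # zero-based byte offset of the energy field
end

# Accessor: seek straight to the energy field of the i-th record (1-based).
function energy(rs::RecordStream, i::Integer)
    seek(rs.io, (i - 1) * rs.record_size + rs.energy_offset)
    return read(rs.io, Float32)
end

# Fake data: each record is a Float32 energy followed by a Float64 x.
io = IOBuffer()
for i in 1:3
    write(io, Float32(i))
    write(io, Float64(10i))
end
rs = RecordStream(io, 12, 0)
```

With this, energy(rs, 2) decodes four bytes and nothing else; the same accessor would work unchanged on a stream returned by open.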

So for each file there is a header that encodes the struct layout. All particles in a single file then share this layout, but different files may have different layouts. So the above example has the following features:

  • I can throw functions at it that accept a Particle struct and don’t need to write special functions that accept ParticleStream struct.
  • Under the hood for most particles only the field that decides whether the particle is an electron is decoded.
  • Under the hood for the electrons the energy field is the only further decoded field.

How would I do that with the seekable io?

Also, thanks for all the feedback @stevengj. I gather that you think the bitcast solution is bad. Could you comment on why? In particular, is a bitcast function bad in itself, or is it my implementation?