I would like to take the bits of one struct and reinterpret them as the bits of another type. Semantically the following does the desired job:
function check_bitcast(T, s)
    S = typeof(s)
    isbitstype(T) || throw(ArgumentError("Can only cast into bitstype."))
    isbitstype(S) || throw(ArgumentError("Can only cast from bitstype."))
    sizeof(T) == sizeof(S) || throw(ArgumentError("Can only cast between types of equal size."))
end

function bitcast_slow(T, s)
    check_bitcast(T, s)
    arr = reinterpret(T, [s])
    first(arr)
end
However, it is too slow (see below). The following seems to achieve the same result, but faster:
function unsafe_bitcast(::Type{T}, s::S) where {T, S}
    rt = Ref{T}()
    rs = Ref{S}(s)
    GC.@preserve rt rs begin
        pt = Ptr{UInt8}(Base.unsafe_convert(Ref{T}, rt))
        ps = Ptr{UInt8}(Base.unsafe_convert(Ref{S}, rs))
        Base._memcpy!(pt, ps, sizeof(T))
    end
    return rt[]
end

function bitcast(::Type{T}, s::S) where {T, S}
    check_bitcast(T, s)
    unsafe_bitcast(T, s)
end
struct F64; value::Float64; end
struct U8; value::UInt8; end
using BenchmarkTools
t = ntuple(U8, 8)
@assert bitcast(F64, t) === bitcast_slow(F64, t)
@btime bitcast($F64, $t) # 1.297 ns (0 allocations: 0 bytes)
@btime bitcast_slow($F64, $t) # 32.021 ns (2 allocations: 128 bytes)
x = 3.14159
b = reinterpret(UInt64, x)
b % UInt8 # first (least significant) byte
rather than messing around with pointers?
What are you trying to accomplish here by a bitcast? If you just want to write/read raw bytes to/from a stream, you can use write and read, for example.
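For reference, a minimal sketch of that round trip through a stream. The in-memory `IOBuffer` is only a stand-in for a real file or socket, and the value is illustrative:

```julia
# An in-memory buffer stands in for a file or socket.
io = IOBuffer()

x = 3.14159
nbytes = write(io, x)        # writes the 8 raw bytes of the Float64 in host byte order

seekstart(io)
raw = read(io, sizeof(x))    # the same 8 bytes, as a Vector{UInt8}

seekstart(io)
y = read(io, Float64)        # decode the bytes back into a Float64
```

Both `write` and `read` use the host byte order here, so the value round-trips exactly on the same machine.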
Because this works only for a few builtin types. My main interest is converting NTuple{N,UInt8} into a custom struct.
struct F64; value::Float64;end
reinterpret(F64, 1)
# throws bitcast: target type not a leaf primitive type
Really, I want to mmap a file that contains nested structs in a “packed” memory layout. I would like to mirror the packed memory layout with a Julia struct that holds a private tuple of UInt8 bytes and does lots of getproperty overloading, etc.
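A minimal sketch of that idea, assuming a made-up 5-byte record (one byte for the particle kind, then a little-endian Float32 energy). The name `PackedParticle` and the layout are hypothetical and not the real file format:

```julia
# Hypothetical 5-byte record: 1 byte particle kind, then a little-endian Float32 energy.
struct PackedParticle
    bytes::NTuple{5,UInt8}
end

function Base.getproperty(p::PackedParticle, name::Symbol)
    raw = getfield(p, :bytes)          # bypass getproperty to reach the raw bytes
    if name === :kind
        return raw[1]
    elseif name === :energy
        # Assemble bytes 2..5 into a UInt32 (little-endian), then reinterpret the bits.
        u = UInt32(raw[2]) | (UInt32(raw[3]) << 8) |
            (UInt32(raw[4]) << 16) | (UInt32(raw[5]) << 24)
        return reinterpret(Float32, u)
    else
        return getfield(p, name)       # falls through to the real field `bytes`
    end
end

Base.propertynames(::PackedParticle) = (:kind, :energy)

p = PackedParticle((0x01, 0x00, 0x00, 0xc0, 0x3f))
p.kind      # 0x01
p.energy    # 1.5f0
```

Each property is decoded only when it is accessed, so code that reads just `p.kind` never touches the energy bytes.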
Yeah, I implemented that a long time ago and it is what I currently use. However, often I don’t care about the full struct. I just want to compute statistics over one or two fields. In this case it is a waste to “unpack” the full struct. With the mmap approach I only pay for the fields that I actually use. With a slightly different format I got a 2x speedup by doing this.
Sure it is possible and I thought about doing this. But I think reading whole structs would give a much nicer high level API. Here is what I have in mind (the structs are particles and the file format is IAEA/EGS phase space format):
You can provide whatever API you want on top of a seekable file io stream just as you could with an mmap array; in either case you would be wrapping it in some object with accessor functions. In your case, it seems like a stream object may be more convenient since you need to read heterogeneous types?
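A rough sketch of what such a wrapper could look like. `RecordStream`, `read_field`, and the 5-byte record (a UInt8 kind followed by a Float32 energy) are assumptions made up for illustration, and an `IOBuffer` stands in for the file:

```julia
# Hypothetical wrapper around a seekable stream of fixed-size records.
struct RecordStream{IOT<:IO}
    io::IOT
    record_size::Int    # bytes per record, as a file header might specify
end

# Read a single field of type T at byte `offset` inside the i-th record (1-based).
function read_field(rs::RecordStream, ::Type{T}, i::Integer, offset::Integer) where {T}
    seek(rs.io, (i - 1) * rs.record_size + offset)
    return read(rs.io, T)
end

# Demo data: two records of (UInt8 kind, Float32 energy), 5 bytes each.
io = IOBuffer()
write(io, 0x01); write(io, 1.5f0)
write(io, 0x02); write(io, 2.5f0)

rs = RecordStream(io, 5)
kind2   = read_field(rs, UInt8, 2, 0)     # kind of the second record
energy2 = read_field(rs, Float32, 2, 1)   # energy of the second record
```

Only the requested field is read from the stream; the rest of the record is skipped by the `seek`.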
So for each file there is a header that encodes the struct layout. All particles in a single file then have this layout, but different files may have different layouts. So the above example has the following features:
I can throw functions at it that accept a Particle struct and don’t need to write special functions that accept ParticleStream struct.
Under the hood for most particles only the field that decides whether the particle is an electron is decoded.
Under the hood for the electrons the energy field is the only further decoded field.
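To illustrate the lazy decoding described above, here is a small sketch under stated assumptions: the 5-byte layout, the tag value `0x01` for electrons, and the names `PackedRecord`, `is_electron`, `energy`, and `total_electron_energy` are all hypothetical, and a plain `Vector{UInt8}` stands in for the mmapped file.

```julia
# Hypothetical packed record: 1 byte kind tag, then a little-endian Float32 energy.
struct PackedRecord
    bytes::NTuple{5,UInt8}
end

# Decodes only the first byte. The tag value 0x01 for electrons is made up.
is_electron(r::PackedRecord) = r.bytes[1] == 0x01

# Decodes only the four energy bytes.
function energy(r::PackedRecord)
    b = r.bytes
    u = UInt32(b[2]) | (UInt32(b[3]) << 8) |
        (UInt32(b[4]) << 16) | (UInt32(b[5]) << 24)
    return reinterpret(Float32, u)
end

# A generic function that only knows the accessor API, not the byte layout.
function total_electron_energy(records)
    total = 0.0f0
    for r in records
        is_electron(r) || continue   # non-electrons: only the tag byte is looked at
        total += energy(r)
    end
    return total
end

# Build three demo records in a byte vector (a stand-in for an mmapped file).
io = IOBuffer()
for (kind, e) in ((0x01, 1.5f0), (0x02, 10.0f0), (0x01, 2.5f0))
    write(io, kind)
    write(io, htol(reinterpret(UInt32, e)))   # store the energy little-endian
end
raw = take!(io)

records = reinterpret(PackedRecord, raw)   # view the bytes as records, no copy of the data
total = total_electron_energy(records)     # 4.0f0
```

The second record is skipped after inspecting its tag byte, so its energy is never decoded.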
Also, thanks for all the feedback @stevengj. So I get that you think the bitcast solution is bad. Can you maybe comment on why it is bad? In particular, is a bitcast function by itself bad, or is it my implementation?