I am trying to define my own primitive type of bit-length 8*n for some n, e.g. below I have n=3.
The functions lshr_int and shl_int work directly on the newly defined type, but bswap_int does not. I checked its definition with @which bswap_int(UInt(888)) and saw that it’s an intrinsic function, so I can’t look at its implementation for UInt and adapt it.
I can build my own bswap_int using lshr_int and shl_int and | but is there a more efficient way?
primitive type UInt24 <: Unsigned 24 end
x = unsafe_load(Ptr{UInt24}(pointer("abc")))
# bitshifts work fine
Base.lshr_int(x, 8)
Base.shl_int(x, 8)
# this will crash
Base.bswap_int(x)
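For reference, the shift-and-or fallback mentioned above can be sketched as follows. This is only a rough sketch under one assumption: the 24-bit value is held in the low three bytes of an ordinary UInt32 rather than in the custom UInt24 type. The names bswap24_manual and bswap24_shift are hypothetical helpers, not part of Base.

```julia
# Hypothetical helpers; the 24-bit value is assumed to sit in the low 3 bytes of a UInt32.

# Manual version: move the low byte up, keep the middle byte, move the high byte down.
bswap24_manual(x::UInt32) =
    ((x & 0x000000ff) << 16) | (x & 0x0000ff00) | ((x >> 16) & 0x000000ff)

# Alternative: swap all 4 bytes with Base.bswap, then shift the unused byte out.
bswap24_shift(x::UInt32) = bswap(x) >> 8

x = 0x00636261                  # the bytes "abc" read on a little-endian machine
bswap24_manual(x) == bswap24_shift(x)
```

The second form relies on the built-in 32-bit bswap plus a single shift, so it may be the cheaper of the two, but that would need to be benchmarked.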
The background is that I am trying to build a more efficient string radixsort so being able to load the underlying bits of various length efficiently is key.
I’m kind of unclear on what the benefit of a 24-bit integer type is. It doesn’t save space in registers and doesn’t save storage on disk unless you sacrifice decent alignment altogether which doesn’t seem worth it.
It was just an example. I was trying to make a type that can load 3 bytes at once from a string using unsafe_load. Also, SAS has a 24-bit numeric type, so this might become helpful for reading SAS data.
A number of people have brought up their use cases for 24-bit numbers (such as representing RGB colors)
Using 33% more space (in memory or on disk) can end up affecting performance more than alignment issues (which generally aren’t even issues on most processors these days).
Better to actually get real evidence rather than stating opinions without the data to back them up.
If you can guarantee that reading one byte past the end is safe (which isn’t that hard to do, by simply allocating a buffer one byte larger than needed), then you can simply do an unaligned 32-bit read and a mask, which is faster than loading 3 bytes individually.
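A minimal sketch of that read-and-mask idea, assuming the buffer is a Vector{UInt8} allocated with at least one spare byte after the last 3-byte group. The name load3 is a hypothetical helper, and no bounds checking is done.

```julia
# Hypothetical helper: load the 3 bytes starting at 1-based index `i` with one 32-bit read.
# Assumes `buf` has at least one extra byte after position i + 2, so the read stays in bounds.
function load3(buf::Vector{UInt8}, i::Integer)
    w = GC.@preserve buf unsafe_load(Ptr{UInt32}(pointer(buf, i)))
    # ltoh is a no-op on little-endian hosts; the mask drops the 4th (padding) byte.
    return ltoh(w) & 0x00ffffff
end

buf = UInt8[0x61, 0x62, 0x63, 0x00]   # "abc" plus one padding byte
load3(buf, 1)                          # first byte ends up in the least significant position
```

Whether this actually beats three single-byte loads is worth measuring on the target machine rather than assuming.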
Even when you can’t, I’ve seen that LLVM optimizes a 24-bit load or store into two operations, a 16-bit one and an 8-bit one.