I am developing a library to manipulate protocol messages in Julia.
It must be able to store an incoming message and return its fields on request. Here is an MWE:
# incoming message: b"\xA7\x05\x61" == 3 bytes
#
# fields:
# a == byte 0
# b == byte 1, bits 0-3
# c == byte 1, bits 4-7
# d == byte 2
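struct msg_A
    payload::Vector{UInt8}
end
# Overload Base.getproperty (there is no getproperty! in Base).
function Base.getproperty(m::msg_A, s::Symbol)
    # getfield bypasses this overload, so m.payload does not recurse.
    payload = getfield(m, :payload)
    if s === :a
        return payload[1]          # Julia arrays are 1-based: byte 0 is payload[1]
    elseif s === :b
        return payload[2] >> 4     # upper nibble of byte 1
    elseif s === :c
        return payload[2] & 0x0F   # lower nibble of byte 1
    elseif s === :d
        return payload[3]
    else
        return getfield(m, s)      # fall back to real fields such as :payload
    end
end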
Is this the correct approach?
Would it be better to store it as an IOBuffer, or in already separated fields (a, b, c and d)?
Extra question
For some message types, such as msg_A defined above, the length is fixed. Can this fixed length be incorporated into the struct definition (other than by defining a length function for it)? Would knowing how much memory the message occupies help the compiler or the execution?
I’d go with your third option of storing individual fields. It won’t be less efficient, but it guarantees a fixed-size layout and thus helps the compiler generate efficient code. It also preserves type safety.
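To make that concrete, here is a minimal sketch of the separated-fields option for the 3-byte message from the question. The name MsgAFields and the choice of UInt8 for every field are my assumptions, and the nibble split simply copies what the MWE does (b from the upper nibble, c from the lower one):

```julia
# Hypothetical parsed form of the 3-byte message; not part of any library.
struct MsgAFields
    a::UInt8
    b::UInt8   # upper nibble of the second byte, as in the MWE
    c::UInt8   # lower nibble of the second byte, as in the MWE
    d::UInt8
end

# Split the raw bytes once, at construction time, instead of on every access.
function MsgAFields(raw::AbstractVector{UInt8})
    length(raw) == 3 || throw(ArgumentError("expected 3 bytes, got $(length(raw))"))
    return MsgAFields(raw[1], raw[2] >> 4, raw[2] & 0x0f, raw[3])
end

m = MsgAFields(b"\xA7\x05\x61")
```

After that, m.a, m.b, m.c and m.d are plain field loads, and isbitstype(MsgAFields) is true, so the struct has a fixed size known to the compiler.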
Julia structs are also compatible with C layout-wise, so you should be able to faithfully represent almost all types.
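A small sketch of what the C-compatible layout gives you, using a made-up header type (WireHeader and its fields are hypothetical, chosen so that there is no padding between fields). It also assumes the length field is little-endian on the wire, which is why ltoh is applied:

```julia
# Hypothetical fixed-size header: 1 + 1 + 2 bytes, no padding.
struct WireHeader
    tag::UInt8
    flags::UInt8
    len::UInt16
end

# The size and field offsets are fixed and known to the compiler.
header_size = sizeof(WireHeader)
len_offset = fieldoffset(WireHeader, 3)

# Because the layout matches the wire format, the bytes can be viewed directly.
bytes = UInt8[0xa7, 0x05, 0x34, 0x12]
h = reinterpret(WireHeader, bytes)[1]

# Multi-byte fields come out in host byte order; ltoh converts from little-endian.
len = ltoh(h.len)
```

This only works this directly when the struct has no padding and every field is a bits type. For anything with sub-byte fields, parsing into separate fields as suggested above is simpler.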
And regarding the type for storage, what would be the most efficient/adequate type?
As I see it, I could store each short field in a UInt8, or in a UInt32.
But what if the field I want to store and work with has 40 bits? I guess that could fit in a 64-bit UInt. But what about larger bit fields (70/80, etc)?
In those cases it becomes more difficult to work with bits and with bitwise/mask operations (&, |, xor).
That depends on the types in your protocol, but I’d usually go with either directly matching types of equal size or the next-largest type that can fit your elements.
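As a sketch of the "next-largest type" idea: a 40-bit field fits in a UInt64 and an 80-bit field fits in a UInt128, and the usual bitwise operators work on both. The helper name read_be is hypothetical, and it assumes the field arrives in big-endian byte order:

```julia
# Hypothetical helper: assemble big-endian bytes into an unsigned integer of type T.
function read_be(::Type{T}, bytes::AbstractVector{UInt8}) where {T<:Unsigned}
    length(bytes) <= sizeof(T) || throw(ArgumentError("field does not fit in $T"))
    x = zero(T)
    for byte in bytes
        x = (x << 8) | byte   # shift in one byte at a time, most significant first
    end
    return x
end

# A 40-bit field stored in the next-largest standard type, UInt64.
field40 = read_be(UInt64, UInt8[0x01, 0x02, 0x03, 0x04, 0x05])

# An 80-bit field stored in a UInt128.
field80 = read_be(UInt128, UInt8[0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0a])

# Masking works the same way on the wide type.
low_byte = field80 & 0xff
```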
70/80 bits is a very awkward format to receive and sounds like a composite type (i.e. a type made up of multiple smaller, more primitive types). I’d unpack that from its compressed form into individual fields (or even create its own type to unpack it into), as unaligned memory accesses can be slow and constant shifting/masking operations are not really conducive to high performance. How much this matters depends on how much work you do on your received data, but unpacking is fairly straightforward to implement and easy to use.
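For illustration, here is how unpacking an 80-bit composite into its own type could look. The layout (16-bit id, 32-bit timestamp, 32-bit value) and the name Composite80 are invented for this sketch, and the fields are assumed to be big-endian on the wire:

```julia
# Hypothetical 80-bit composite: 16-bit id, 32-bit timestamp, 32-bit value.
struct Composite80
    id::UInt16
    timestamp::UInt32
    value::UInt32
end

# Unpack the 10 raw bytes once; later accesses are plain field loads.
function Composite80(raw::AbstractVector{UInt8})
    length(raw) == 10 || throw(ArgumentError("expected 10 bytes, got $(length(raw))"))
    io = IOBuffer(raw)
    # read returns host byte order; ntoh converts from network (big-endian) order.
    id = ntoh(read(io, UInt16))
    timestamp = ntoh(read(io, UInt32))
    value = ntoh(read(io, UInt32))
    return Composite80(id, timestamp, value)
end

c = Composite80(UInt8[0x00, 0x2a, 0x00, 0x00, 0x01, 0x00, 0xde, 0xad, 0xbe, 0xef])
```

The shifting and byte swapping happen once at the boundary, and everything downstream works with ordinary aligned integers.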
If there are already C definitions for the protocol, it is easy to use CBinding.jl to automatically generate the representation in Julia. It also produces very efficient code for accessing bitfields and the like. Here is a blog post that demonstrates accessing a WAV file in place just by “including” the format defined in the libsndfile C headers.