This is helpful, but it depends on Clang, which is big and unnecessary for my purposes, though I understand why they did it. I need pure-Julia, low-level byte- and bit-level data manipulation without a lot of dependencies. I might write one if it’s not available.
Are you perhaps thinking of something like the Serialization stdlib or something like serde in Rust? As far as I know, there isn’t really much for that kind of work at the moment; the package ecosystem is still more geared towards numerical work than general software engineering.
I guess there’s also some parser combinator libraries, if that’s what you’re looking for. Parsers.jl comes to mind.
serde has a lot of functionality; matching it is a lofty goal for now. For example, I’d like to write UInt32 :: UInt32 :: UInt32 and have that create a parser that can serialize and deserialize three little-endian UInt32s.
On this note, is there an abstraction like Java’s ByteBuffer in Julia?
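To make the three-UInt32 example concrete, here is a minimal sketch using only Base functions (`read`, `write`, `ltoh`, `htol`). The helper names `read_le` and `write_le` are hypothetical and only illustrate the idea; they are not part of any package.

```julia
# Hypothetical helpers (not from any package): read or write a sequence of
# primitive values, treating the byte stream as little-endian.
read_le(io::IO, types::Type...) = map(T -> ltoh(read(io, T)), types)
write_le(io::IO, values...) = sum(v -> write(io, htol(v)), values)

wire = UInt8[0x01, 0x00, 0x00, 0x00,
             0x02, 0x00, 0x00, 0x00,
             0xff, 0xff, 0xff, 0xff]

# Deserialize three little-endian UInt32 values
a, b, c = read_le(IOBuffer(wire), UInt32, UInt32, UInt32)

# Serialize them again; `n` is the number of bytes written
out = IOBuffer()
n = write_le(out, a, b, c)
```

Here `(a, b, c)` should come out as `(0x00000001, 0x00000002, 0xffffffff)`, and `take!(out)` should reproduce `wire`.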
From just looking at scodec, it looks like you wouldn’t need a library for most functionality…
Julia’s immutable structs (with no mutable fields inside them, i.e. isbitstype(T) is true) already do most of what scodec codecs do:
firscodec = Tuple{UInt8,UInt8,UInt16}
bytes = hex2bytes("102a03ff")
# ntoh converts from network (big-endian) byte order to the byte order of the current system
result = (ntoh.(reinterpret(firscodec, bytes)[1]))
Int(sum(result))
struct Point
    x::Int
    y::Int
    z::Int
end
# Many ways to convert the result to a Point struct
# Might want some utility, or simply a good coverage of `convert(MyType, x)`
point = Point(result...)
# I guess one could have this convenience:
interpret_as(bytes, ::Type{T}, ::Type{Codec}) where {T, Codec} = convert.(T, reinterpret(Codec, bytes))
io = IOBuffer() # roughly comparable to Java's ByteBuffer
write(io, Ref(point))
seekstart(io)
point2 = Ref{Point}()
point2 = read!(io, point2)
using Test # @test comes from the Test stdlib
@test point2[] == point
io = IOBuffer()
write(io, bswap(0x102a03ff))
bytes = take!(io)
ntoh.(reinterpret(firscodec, bytes)[1]) == result
You can likely make all of this a bit more elegant here and there, but those operations should work pretty well and should be close to C performance in Julia.
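As one possible way to tidy the operations above, the sketch below reads a struct field by field instead of using `reinterpret`, which sidesteps any dependence on in-memory layout (some field combinations are padded). The `Header` type and the helpers `read_be` and `write_be` are hypothetical names chosen for illustration.

```julia
# Hypothetical example type: two bytes followed by a 16-bit length
struct Header
    version::UInt8
    flags::UInt8
    length::UInt16
end

# Hypothetical helper: read every field of T in declaration order,
# converting from network (big-endian) byte order to host order.
read_be(io::IO, ::Type{T}) where {T} =
    T(map(F -> ntoh(read(io, F)), fieldtypes(T))...)

# Hypothetical helper: write every field in big-endian order and
# return the total number of bytes written.
write_be(io::IO, x) = sum(i -> write(io, hton(getfield(x, i))), 1:nfields(x))

h = read_be(IOBuffer(hex2bytes("102a03ff")), Header)

out = IOBuffer()
nbytes = write_be(out, h)
roundtrip = take!(out)
```

With these bytes, `h` should equal `Header(0x10, 0x2a, 0x03ff)` and `roundtrip` should match the input bytes.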
Thanks, this is very helpful. I am just writing simple reusable components that you can combine to make a byte-level parser.
My idea is to create a codec for every type (using macros and generated functions) and a DSL that can compile a parser to a tuple. (I am still deciding between heterogeneous tuples and structs.)
Just a tip from me after developing such code for quite a few years in Julia: I’d keep it as simple and function-based as possible before doing anything more complicated.
In hindsight, I’ve regretted every macro and generated function that I could have avoided once I understood the actual problem I wanted to solve better.
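A function-based version of the "combine small components" idea could look roughly like the sketch below: each decoder is just a function from an `IO` to a value, and combining them is a `map` over a tuple. The names `be`, `le`, and `seq` are hypothetical and are not from any existing package.

```julia
# A decoder is any function `io -> value`. All names here are hypothetical.
be(::Type{T}) where {T} = io -> ntoh(read(io, T))   # big-endian primitive
le(::Type{T}) where {T} = io -> ltoh(read(io, T))   # little-endian primitive

# Run several decoders in order and collect the results in a tuple
seq(decoders...) = io -> map(d -> d(io), decoders)

header = seq(be(UInt8), be(UInt8), be(UInt16))
fields = header(IOBuffer(hex2bytes("102a03ff")))

# Decoders nest, because `header` is itself just a decoder
packet = seq(header, le(UInt32))
nested = packet(IOBuffer(hex2bytes("102a03ff01000000")))
```

No macros or generated functions are involved; the heterogeneous tuple of closures carries the structure, and `fields` should be `(0x10, 0x2a, 0x03ff)`.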
Thanks for the advice. What’s the best way to emulate HLists (heterogeneous lists)? Tuples are hard to metaprogram.
I am using tuples here, we can also use Vector{Codec}
This is what I’ve got so far:
abstract type Codec{T} end
struct IntCodec{T<:Integer} <: Codec{T}
    data::T
end
struct FloatCodec{T<:AbstractFloat} <: Codec{T}
    data::T
end
struct TupleCodec{T<:Tuple{Vararg{Codec}}} <: Codec{T}
    data::T
end
# convert: extend Base.convert (an unqualified `convert` would define a new, unrelated function)
Base.convert(::Type{T}, codec::IntCodec{T}) where {T<:Integer} = codec.data
Base.convert(::Type{T}, codec::FloatCodec{T}) where {T<:AbstractFloat} = codec.data
Base.convert(::Type{T}, codec::TupleCodec{T}) where {T<:Tuple{Vararg{Codec}}} = codec.data
function Base.read(io::IO, ::Type{T}) where {T<:Tuple{Vararg{Codec}}}
    # fieldtypes(T) is the public way to get a tuple type's element types;
    # mapping over it reads the codecs in order and yields a tuple
    elements = map(C -> read(io, C), fieldtypes(T))
    return TupleCodec(elements)
end
function Base.read(io::IO, ::Type{IntCodec{T}}) where {T<:Integer}
    value = read(io, T)
    return IntCodec{T}(value)
end
# create a codec for every type
@generated function decode(io::IO, ::Type{C}) where {C<:Codec}
    T = C.parameters[1]
    if T <: Integer
        type_name = "integer"
    elseif T <: AbstractFloat
        type_name = "float"
    elseif T <: Tuple{Vararg{Codec}}
        type_name = "tuple"
    else
        error("Unsupported type for decode: $T")
    end
    quote
        # bytesavailable works for any IO; io.size and io.ptr are IOBuffer internals
        if bytesavailable(io) < sizeof($T)
            throw(ArgumentError("Not enough bytes for $(sizeof($T))-byte $($type_name)"))
        end
        return read(io, $T)
    end
end
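For comparison, the same bounds check can be sketched with plain functions and dispatch, in line with the earlier advice to avoid generated functions where possible. The recursion over a tuple of types with `Base.tail` is one common way to process a heterogeneous list. The names `checked_read` and `checked_read_all` are hypothetical, and values are read in host byte order here.

```julia
# Hypothetical names; a sketch, not the implementation from this thread.
function checked_read(io::IO, ::Type{T}) where {T}
    n = sizeof(T)
    bytesavailable(io) >= n ||
        throw(ArgumentError("need $n bytes for $T, only $(bytesavailable(io)) available"))
    return read(io, T)
end

# Heterogeneous list of types, handled by recursion on the tuple:
# read the first element, then recurse on the rest.
checked_read_all(io::IO, ::Tuple{}) = ()
checked_read_all(io::IO, types::Tuple) =
    (checked_read(io, first(types)), checked_read_all(io, Base.tail(types))...)

io = IOBuffer(UInt8[0x01, 0x02, 0x03])
first_two = checked_read_all(io, (UInt8, UInt8))

# Only one byte is left, so asking for a UInt32 raises an error
failed = try
    checked_read(io, UInt32)
    false
catch err
    err isa ArgumentError
end
```

Because dispatch already specializes `checked_read` on `T`, `sizeof(T)` is a compile-time constant in each specialization, which is much of what the generated function was trying to achieve.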