Design and performance issues with function generator macro

I am having some design and performance issues with a helper macro which creates (binary) parsers for structs.

Let me add a bit of background in case my approach is totally wrong: I need to parse binary data, and the structure is often encoded in the data itself, so I often have to create parsers at runtime. Moreover, everything is big endian, so even basic types need special handling.
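
To make the big-endian point concrete, here is a minimal sketch (the byte values are made up for illustration) of reading a single Int32 that is stored most-significant byte first. `ntoh` converts from network (big-endian) byte order to the host's order, so it works regardless of the machine's endianness:

```julia
# Four bytes encoding the Int32 value 42 in big-endian order (illustrative data).
io = IOBuffer(UInt8[0x00, 0x00, 0x00, 0x2a])

# read() interprets the bytes in host order; ntoh() corrects for big-endian input.
value = ntoh(read(io, Int32))

# Writing goes the other way: hton() converts host order to big-endian.
out = IOBuffer()
write(out, hton(Int32(42)))
roundtrip = take!(out)
```

Here `value` should be 42 and `roundtrip` should contain the same four bytes that were read.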

I have a nice working solution already, but currently the design is very rigid and the macro itself holds hardcoded parser logic for different kinds of data structures. In the future the list will be very long and needs to be extensible from the outside, so I'd definitely want the user to be able to provide parser functions by simply defining them, without touching the actual macro.

What my macro does is take a struct definition, evaluate it, and define a method of the unpack function (which is part of my lib) by analysing the field types.

So basically what it allows is:

@io Foo
    x::Int32
    y::CustomType
end

and then it will define an unpack(io, ::Type{Foo}) method which yields an instance of Foo from io by reading and parsing the data accordingly. As said, the way Int32 and CustomType are parsed is currently hardcoded inside the @io macro.

Now to the MWE:

For convenience, I use a large buffer (buf) so I can easily run @btime and see how it performs on repeated reads (sometimes it has to be rewound, or just recreated). Foo is the struct which I want to parse from the data, and I now explicitly define the function which is otherwise provided by my macro and call it unpack_a:

data = rand(UInt8, 1000000000)
buf = IOBuffer(data)

struct Foo
    x::Int32
    y::Float32
    z::Int64
end

function unpack_a(io, ::Type{Foo})
    Foo(ntoh(read(io, Int32)), ntoh(read(io, Float32)), ntoh(read(io, Int64)))
end

This works fine and gives (to my knowledge) maximum performance (I really tried many different ways and also existing struct-parser libraries and this is by far the fastest way):

julia> @btime unpack_a($buf, Foo)
  11.986 ns (0 allocations: 0 bytes)

Now I want the macro to use a different approach to create the unpack method, so I can use functions which are defined outside.

This is more or less what I am aiming for (unpack_b generated by a macro, based on the fields of Foo). Not sure if @inline is mandatory but it gave similar benchmark results.

@inline unpack_int32(io) = ntoh(read(io, Int32))
@inline unpack_int64(io) = ntoh(read(io, Int64))
@inline unpack_float32(io) = ntoh(read(io, Float32))

function unpack_b(io, ::Type{Foo})
    Foo(unpack_int32(io), unpack_float32(io), unpack_int64(io))
end

Here is the benchmark result:

julia> @btime unpack_b($buf, Foo)
  12.032 ns (0 allocations: 0 bytes)

So far so good. Now comes the actual problem :wink:

My over-simplified macro approach is the following:

function unpack_c() end

macro create_unpack_c()
    # in the original code I determine the unpack functions based on the
    # struct definition which is passed to this macro
    # now let's assume it's done and it creates this vector of functions:
    parsers = [unpack_int32, unpack_float32, unpack_int64]
    
    quote
        function $(@__MODULE__).unpack_c(io, ::Type{Foo})
            Foo([parser(io) for parser in $parsers]...)  # this is the line which I struggle with
        end
    end
end

@create_unpack_c

As you can see in the comments, I have hardcoded the part where I determine the parser functions for the types; that part is solved. The problem is the line Foo([parser(io) for parser in $parsers]...), which is the only way I could get it working, but it is roughly 60 times slower (about 730 ns vs. 12 ns) and allocates, presumably since I create a closure.

julia> @btime unpack_c($buf, Foo)
  729.947 ns (10 allocations: 368 bytes)
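
A possible contributor to the slowdown, in addition to the comprehension itself, is that a vector holding different functions has an abstract element type, so each call has to be dispatched at runtime and the results are collected into a heap-allocated array before being splatted. The following sketch only inspects the types involved; `read_i32` and `read_f32` are hypothetical stand-ins for the parser functions above:

```julia
# Hypothetical stand-ins for the unpack_* helpers.
read_i32(io) = ntoh(read(io, Int32))
read_f32(io) = ntoh(read(io, Float32))

# Every function has its own type, so a vector of them falls back to the
# abstract element type Function.
parsers_vec = [read_i32, read_f32]
vec_eltype = eltype(parsers_vec)

# A tuple, in contrast, records the concrete type of each element.
parsers_tup = (read_i32, read_f32)
tup_type = typeof(parsers_tup)
```

With an abstract element type the compiler cannot know which method `parser(io)` refers to or what it returns, which is consistent with the allocations reported by @btime.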

My question is: how do I properly do a loop and bake it into the function definition?

This does the job. Note the two extra layers of quoting / interpolation.

function unpack_c() end

macro create_unpack_c()
    # in the original code I determine the unpack functions based on the
    # struct definition which is passed to this macro
    # now let's assume it's done and it creates this vector of functions:
    parsers = [:unpack_int32, :unpack_float32, :unpack_int64]

    quote
        function $(@__MODULE__).unpack_c(io, ::Type{Foo})
            Foo($([:($parser(io)) for parser in parsers]...))  # splice one call expression per parser into the constructor call
        end
    end
end
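
To see what the nested interpolation produces, the same expression can be built outside of a macro and inspected. This is only a sketch of the mechanism: nothing is evaluated, the names are plain symbols, and `calls` and `ex` are illustrative variable names:

```julia
# Symbols naming the parser functions, as in the macro above.
parsers = [:unpack_int32, :unpack_float32, :unpack_int64]

# Inner quote: one call expression per parser, e.g. :(unpack_int32(io)).
calls = [:($parser(io)) for parser in parsers]

# Outer interpolation: splat the call expressions into the argument list.
ex = :(Foo($(calls...)))

# The result is the same expression one would have written by hand.
expected = :(Foo(unpack_int32(io), unpack_float32(io), unpack_int64(io)))
```

The comprehension therefore runs once at macro expansion time, and the generated method body contains only the three direct calls, just like the hand-written unpack_b.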

Many thanks! I need to wrap my head around this quoting :see_no_evil:

Btw, here is the final macro in case anyone is interested.

function unpack() end

readtype(io, ::Type{T}) where T<:Union{Integer, AbstractFloat} = ntoh(read(io, T))
readtype(io, ::Type{T}) where T<:AbstractVector{UInt8} = read(io, sizeof(T))


macro io(data)
   struct_name = data.args[2]

   types = []
   parametric_types = []
   for f in data.args[3].args
       isa(f, LineNumberNode) && continue
       isa(f, Symbol) && error("Untyped field")
       Meta.isexpr(f, :(::)) || error("Expected a typed field, got: $f")
       push!(types, f.args[2])
   end

   struct_size = sum([sizeof(eval(t)) for t in types])

   quote
       $(esc(data))  # executing the code to create the actual struct
       Base.sizeof(::Type{$(esc(struct_name))}) = $struct_size

       function $(@__MODULE__).unpack(io, ::Type{$(esc(struct_name))})
           $(esc(struct_name))($([:(readtype(io, $t)) for t in types]...))
       end

       nothing  # suppress REPL output
   end
end
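
With this design, supporting a new field type only requires a new readtype method. The sketch below illustrates that idea under some assumptions: `Version` and `Header` are hypothetical types that do not appear in the thread, and the `unpack` method is written by hand as a stand-in for what the macro would generate, so that the example is self-contained:

```julia
# Generic method from the post: numbers are stored big-endian.
readtype(io, ::Type{T}) where T<:Union{Integer, AbstractFloat} = ntoh(read(io, T))

# Hypothetical user-defined field type (not part of the original thread).
struct Version
    major::UInt8
    minor::UInt8
end

# The user adds a parser from the outside; the macro itself stays untouched.
readtype(io, ::Type{Version}) = Version(read(io, UInt8), read(io, UInt8))

# Hypothetical struct using the custom field type.
struct Header
    version::Version
    payload_length::Int32
end

# Hand-written stand-in for the method the macro would generate for Header.
unpack(io, ::Type{Header}) = Header(readtype(io, Version), readtype(io, Int32))

# Two version bytes followed by a big-endian Int32 with the value 256.
io = IOBuffer(UInt8[0x01, 0x02, 0x00, 0x00, 0x01, 0x00])
h = unpack(io, Header)
```

Since dispatch on the field type is resolved at compile time, each readtype call should compile down to the same code as the hardcoded version.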