@generated function - iterate over function argument

Hi there,

I would like to learn a bit more about generated functions. My goal is to convert a struct into a tuple. I have already managed this for the case where all fields are converted:

struct Teststruct{A,B,C}
    a   :: A
    b   :: B
    c   :: C
end


@generated function to_tuple_generated(x)
    tup = Expr(:tuple)
    for i in 1:fieldcount(x)
        push!(tup.args, :(getfield(x, $i)) )
    end
    return :($tup)
end

te = Teststruct(1., 2., 3)
to_tuple_generated(te) # (1.0, 2.0, 3)

Now, how would I proceed if I only want a specific subset of the fields, say .a and .c, in the tuple? I understand that the @generated function only sees the input types, so I tried to make it work with :($argument), but did not succeed. My unsuccessful approach:

# idx holds the keys/indices of the fields that should be returned as a tuple
@generated function to_tuple_generated(x, idx)
    tup = Expr(:tuple)
    for i in :($idx) # only the type of idx is visible here, so the function errors
        push!(tup.args, :(getfield(x, $i)) )
    end
    return :($tup)
end
idx = [1, 3] #Create tuple from first and third field element
idx2 = (:a, :c) #Create tuple from first and third field element 
to_tuple_generated(te, idx) #should be (1., 3)
to_tuple_generated(te, idx2) #should be (1., 3)
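
To see why the loop fails, it helps to look at what the generator body is actually handed. A minimal sketch (the name `seen_types` is made up for illustration): inside a generated function the argument names are bound to the argument types, so there are no runtime values to iterate over.

```julia
# Hypothetical helper, not from the thread: it returns what the generator
# body sees for each argument.
@generated function seen_types(x, idx)
    # Here `x` and `idx` are types, not values; interpolating them
    # splices the types themselves into the returned expression.
    return :(($x, $idx))
end

seen_types(1.0, [1, 3])    # (Float64, Vector{Int64})
seen_types(1.0, (:a, :c))  # (Float64, Tuple{Symbol, Symbol})
```

Neither type carries the actual indices or symbols, which is why they have to be moved into the type domain (for example with `Val`) before a generated function can use them.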

How about this? It seems to pass your tests, although I’m sure a more serious Julia user would probably be unhappy with the amount of type system abuse here.

getval(::Type{Val{T}}) where{T} = T
@generated function to_tuple_generated(x, vals...)
    tup = Expr(:tuple)
    for v in vals 
      push!(tup.args, :(getfield(x, getval($v))))
    end
    return :($tup)
end

which you then call with

to_tuple_generated(te, Val(1), Val(3))
to_tuple_generated(te, Val(:a), Val(:c))
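
A possible variation, sketched here as an assumption rather than a recommendation: wrap the whole tuple of keys in a single `Val` and recover it as a static parameter, which a generated function body can read as a plain value. The name `to_tuple_val` is made up for this example.

```julia
struct Teststruct{A,B,C}
    a::A
    b::B
    c::C
end

# `idx` is a static parameter, so inside the generator body it is the
# actual tuple, e.g. (1, 3) or (:a, :c), and can be iterated directly.
@generated function to_tuple_val(x, ::Val{idx}) where {idx}
    tup = Expr(:tuple)
    for i in idx
        # QuoteNode keeps symbols such as :a from being read as variable names
        push!(tup.args, :(getfield(x, $(QuoteNode(i)))))
    end
    return tup
end

te = Teststruct(1., 2., 3)
to_tuple_val(te, Val((1, 3)))    # (1.0, 3)
to_tuple_val(te, Val((:a, :c)))  # (1.0, 3)
```

This avoids the `getval` helper, at the cost of compiling one method instance per distinct key tuple.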

Thank you! It is quite a bit faster than a runtime version like this:

to_tuple_naive(container, fld ) = Tuple( getfield(container, v) for v in fld  )
sym = ( Val(:a), Val(:c) )
to_tuple_naive(te, (:a, :c) )
to_tuple_generated(te, sym... ) # (1.0, 3)

using BenchmarkTools
@btime to_tuple_naive($te, $(:a, :c) )      # 241.007 ns (7 allocations: 320 bytes)
@btime to_tuple_generated($te, $sym... )    # 0.001 ns (0 allocations: 0 bytes)

I wanted to adjust this to also subset NamedTuples, the benchmark for this is:

nmdtuple = (a = 1., b = [2., 3.], c = [4. 5. ; 6. 7.])
sbset = (:a, :b)
@inline subset(nt::NamedTuple, s::Tuple{Vararg{Symbol}}) = NamedTuple{s}(nt)

subset(nmdtuple, sbset)
@btime subset($nmdtuple, $sbset) #525.263 ns (6 allocations: 256 bytes)

I can write a function that uses all of the tuple's keys, but I do not know how to take a subset, because I don't see how the compiler would know the field types of my generic NamedTuples. Non-running MWE:


getval(::Type{Val{T}}) where{T} = T
generate_named_tuple(container, fields...) = NamedTuple{( fields,), Tuple{ (fieldtype(container, getval(i) ) for i in fields )} }
@generated function subset_named_tuple_generated(x::NamedTuple, vals...)
    nt  = Expr(:quote, generate_named_tuple(x, vals...) ) #here lies the problem
    tup = Expr(:tuple)
    for v in vals
        push!(tup.args, :( getfield(x, getval($v) ) ) )
    end
    return :($nt($tup))
end
subset_named_tuple_generated(nmdtuple, Val.( sbset )... )           # ArgumentError: Wrong number of arguments to named tuple constructor.

The problem arises as generate_named_tuple(x, vals…) does not infer the type if I input Val.() into the function.
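
One way this might be made to run, as a sketch under the assumption that the names can be passed as a single `Val` of a symbol tuple (`subset_nt` is a made-up name): only the names need to be known at generation time. The field types do not have to be spelled out, because `NamedTuple{names}(values)` derives them from the values.

```julia
@generated function subset_nt(x::NamedTuple, ::Val{names}) where {names}
    # `names` is a static parameter, so it is the tuple of symbols itself
    vals = Expr(:tuple)
    for n in names
        push!(vals.args, :(getfield(x, $(QuoteNode(n)))))
    end
    # Builds the expression NamedTuple{(:a, :b)}((getfield(x, :a), getfield(x, :b)))
    return :(NamedTuple{$(QuoteNode(names))}($vals))
end

nmdtuple = (a = 1., b = [2., 3.], c = [4. 5. ; 6. 7.])
subset_nt(nmdtuple, Val((:a, :b)))  # (a = 1.0, b = [2.0, 3.0])
```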

Yeah, that speedup is nice. But after playing around a bit, it looks like you can get that working without any generated functions:

@inline to_tuple(cont, fld) = ntuple(j->getfield(cont, fld[j]), length(fld))
fls = (:a, :c)
@btime to_tuple($te, $fls) # sub 1ns

And these kinds of games actually extend to named tuples as well:

using NamedTupleTools
to_nt(x, names) = namedtuple(names, to_tuple(x, names))
_x = (a=1.0, b=1//2, c=1)
@btime to_nt($_x, $fls) # sub 1ns

This uses the NamedTupleTools package, although the implementation used for that constructor is just a few lines of code, so you don’t really need that dependency. I mostly use it here because it occurs to me that that package would probably be a nice thing to look at in general to study efficient methods for tuples and named tuples. And because that package rocks and always deserves a shoutout.
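
For anyone who wants to skip the dependency, a Base-only version might look like the following. This is a guess at something equivalent in effect, not the package's actual source, and `to_nt_base` is a made-up name.

```julia
@inline to_tuple(cont, fld) = ntuple(j -> getfield(cont, fld[j]), length(fld))

# NamedTuple{names}(values) attaches the names to a plain tuple of values
to_nt_base(x, names::Tuple{Vararg{Symbol}}) = NamedTuple{names}(to_tuple(x, names))

_x = (a = 1.0, b = 1//2, c = 1)
to_nt_base(_x, (:a, :c))  # (a = 1.0, c = 1)
```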

I think in general aggressive use of generated functions is discouraged. I can’t find the reference, but I’ve seen a thread where Keno Fischer gave some exposition about how they are in some sense an escape hatch from the compiler that can make trouble in subtle ways. But with that said, I don’t think anybody would disagree that they are sometimes useful, and it’s definitely nice to understand how they do work so that you can comfortably use them when they really will help, so I don’t mean to be discouraging. Just to point out that you can play games with first, last, ntuple, and other functions like that to get better inference without dropping down to generated functions.
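
As an illustration of the kind of game meant here, a recursive version built on `first` and `Base.tail` could look like this. It is only a sketch (`pick_fields` is a made-up name), and whether it infers and folds as well as the `ntuple` version is something to benchmark rather than assume.

```julia
# Base case: no names left, return the empty tuple
pick_fields(x, ::Tuple{}) = ()
# Recursive case: take the first name, then recurse on the rest of the tuple
pick_fields(x, names::Tuple) =
    (getfield(x, first(names)), pick_fields(x, Base.tail(names))...)

nt = (a = 1.0, b = 1//2, c = 1)
pick_fields(nt, (:a, :c))  # (1.0, 1)
pick_fields(nt, (1, 3))    # (1.0, 1)
```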

EDIT: sorry, lazy transferring of to_nt from REPL that was initially incorrect.


ConstructionBase.jl getproperties does this with a generated function:

https://github.com/JuliaObjects/ConstructionBase.jl/blob/2044dd59b61c701b66ab43fc4b4326573c126095/src/ConstructionBase.jl#L46-L53

Flatten.jl can choose a specific subset of fields based on types or FieldMetadata.jl tags. However, it doesn’t return a named tuple as it gets a tuple from nested objects that may have the same field names. So it returns a Tuple. But the code is kinda hard to understand…
https://github.com/rafaqz/Flatten.jl/blob/master/src/Flatten.jl#L105-L133


Thanks a lot for your answers and for bringing up NamedTupleTools.jl! It seems like none of the functions allocate with just scalars anyway, so I tried to benchmark them with arrays as well:

tup = (a = 1., b = [2., 3.], c = [ 4. 5. ; 6. 7.], d = [ [8., 9.], [10., 11.] ], e = 12.)
sym = (:a, :b, :c, :d)

#1 Benchmark function
@inline subset(nt::NamedTuple, s::Tuple{Vararg{Symbol}}) = NamedTuple{s}(nt)
subset(tup, sym)

#2 NamedTupleTools
using NamedTupleTools
@inline to_tuple(cont, fld) = ntuple(j->getfield(cont, fld[j]), length(fld))
fls = (:a, :c)
to_nt(x, names) = namedtuple(names, to_tuple(x, names))
to_tuple(tup, sym)
to_nt(tup, sym)


# Benchmark functions
using BenchmarkTools
@btime to_tuple($tup, $sym) # 119.541 ns (6 allocations: 256 bytes)

@btime subset($tup, $sym)   # 505.208 ns (6 allocations: 304 bytes)
@btime to_nt($tup, $sym)    # 980.000 ns (15 allocations: 768 bytes)

It seems like the standard solution is quite a bit faster in this case.


Thank you! That’s very similar to the first example, and performs really well:

#3 generated
@generated function getproperties(obj)
    fnames = fieldnames(obj)
    fvals = map(fnames) do fname
        Expr(:call, :getproperty, :obj, QuoteNode(fname))
    end
    fvals = Expr(:tuple, fvals...)
    :(NamedTuple{$fnames}($fvals))
end

@btime getproperties($tup) # 2.300 ns (0 allocations: 0 bytes)

Is there an equivalent for fnames = fieldnames(obj) for just a Tuple of symbols? I tried to make it visible via :($sym) but never got it to work properly.

@generated function getproperties(obj, sym)
    fnames = fieldnames(sym)

    fvals = map( fnames ) do fname
        Expr(:call, :getproperty, :obj, QuoteNode(fname) )
    end
    fvals = Expr(:tuple, fvals...)
    :(NamedTuple{$fnames}($fvals))
end
tup = (a = 1., b = [2., 3.], c = [ 4. 5. ; 6. 7.], d = [ [8., 9.], [10., 11.] ], e = 12.)
sym = (:a, :b, :c, :d)
getproperties(tup, sym) #MethodError: no method matching iterate(::Type{NTuple{4, Symbol}})

The package code is just this:

getproperties(o::NamedTuple) = o
getproperties(o::Tuple) = o

Because you can’t get a named tuple from a tuple in a straightforward way, the fieldnames are numbers:

julia> fieldnames(typeof((:a, :b, :c)))
(1, 2, 3)

And remember, the symbols in your Tuple sym are runtime objects not visible to the generated function - they’re all just Symbol.
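
A small sketch of the difference, using only Base functions (the variable `patch` is just an illustrative name): the type of a plain tuple of symbols does not record which symbols it holds, while the type of a NamedTuple or of a `Val` does.

```julia
sym = (:a, :b, :c)
typeof(sym)                # Tuple{Symbol, Symbol, Symbol}: the symbols are gone
fieldnames(typeof(sym))    # (1, 2, 3)

patch = (a = true, b = true)
fieldnames(typeof(patch))  # (:a, :b): the names live in the type

typeof(Val(sym))           # Val{(:a, :b, :c)}: the symbols live in the type
```

Anything that shows up in the last two types is visible to a generated function; anything that only exists as a runtime value is not.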


Right, a somewhat workable solution is to create a smaller NamedTuple containing the fields that you want to subset with, instead of the tuple of symbols:


@generated function getproperties(obj, sym)
    fnames = fieldnames(sym)

    fvals = map( fnames ) do fname
        Expr(:call, :getproperty, :obj, QuoteNode(fname) )
    end
    fvals = Expr(:tuple, fvals...)
    :(NamedTuple{$fnames}($fvals))
end
tup = (a = 1., b = [2., 3.], c = [ 4. 5. ; 6. 7.], d = [ [8., 9.], [10., 11.] ], e = 12.)
sym = (:a, :b, :c, :d)
sym2 = (a = true, b = true, c = true, d = true)

subset(tup, sym)
getproperties(tup, sym2)

@btime subset($tup, $sym)           # 496.354 ns (6 allocations: 304 bytes)
@btime getproperties($tup, $sym2)   # 2.100 ns (0 allocations: 0 bytes)

Yep that works. Probably ConstructionBase.jl should have that getproperties method with the NamedTuple patch argument (your sym), to mirror setproperties.