Allocations when using getfield with a tuple/vector of symbols

MWE:

using Parameters, BenchmarkTools
@with_kw struct Test
    A :: String = "A"
    B :: Int64  = 0
end

test = Test()
tup  = (:A, :B)

@btime getfield($test, :A)
@btime getfield($test, $tup[1])

Is there a way to loop over struct fields without these allocations?

julia> @btime getfield($test, :A)
  2.400 ns (0 allocations: 0 bytes)
"A"

julia> @btime getfield($test, $tup[1])
  23.002 ns (1 allocation: 32 bytes)
"A"

You could use generated function:

@generated function getfield_unrolled(t::T, f::Symbol) where {T}
    # Runs once per concrete type T: emit one `name == f && return ...` check per field.
    names = fieldnames(T)
    exprs = [:($(QuoteNode(name)) == f && return getfield(t, $(QuoteNode(name)))) for name in names]

    # Fall through to an error when no field name matched.
    push!(exprs, :(throw(ErrorException("type $T has no field $f"))))
    return quote
        $(exprs...)
    end
end

and get

julia> @btime getfield_unrolled($test, $(tup[1]))
  1.855 ns (0 allocations: 0 bytes)
"A"
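As a rough sketch of how this might be used for the original goal of looping over field names: the helper name collect_fields is made up for illustration, the struct is defined without Parameters.jl to keep the snippet self-contained, and I swapped in ArgumentError and === for the error and comparison. Whether it is allocation-free in your real code depends on the field types, so benchmark it in context.

```julia
struct Test
    A::String
    B::Int64
end

# Same idea as the generated function above: one comparison per field,
# unrolled when the method is compiled for a concrete type T.
@generated function getfield_unrolled(t::T, f::Symbol) where {T}
    checks = [:(f === $(QuoteNode(name)) && return getfield(t, $(QuoteNode(name))))
              for name in fieldnames(T)]
    return quote
        $(checks...)
        throw(ArgumentError(string("type ", T, " has no field ", f)))
    end
end

# Hypothetical helper: fetch several fields named by a tuple of symbols.
collect_fields(t, names) = map(n -> getfield_unrolled(t, n), names)

test = Test("A", 0)
vals = collect_fields(test, (:A, :B))
```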

At last I have a good example showing that the allocation can come from the runtime dispatch alone, not from an uncertain return type. I still have no idea what that allocation is doing, though. EDIT: actually, getfield is a builtin function, so I’m not even sure it dispatches at runtime the way generic functions do.

Thanks. I’ve been trying to avoid metaprogramming for this project and have been unrolling such loops manually, but that gets repetitive and tedious. Maybe the warnings against metaprogramming are overblown?

It seems like a lot of trouble for a seemingly simple task.

No. Metaprogramming is appropriate for this task.

There is fundamentally no way of making field access fast if the field name is not known at compile time. In other words, you must make sure that the field name is known at compile time. This is fundamentally metaprogramming.

There are three approaches:

  1. Constant propagation: getfield(x, :A) and similar.
  2. Lifting to type domain: myGetfield(x, ::Val{sym}) where sym = getfield(x, sym)
  3. Explicit metaprogramming.

Lifting to type domain is a way of explicitly encouraging constant propagation at various points.
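A minimal sketch of approach 2, assuming a toy struct Point and made-up helper names getfield_val and sum_fields (none of these are from the thread). The field name travels in the type of the Val argument, so each call is compiled for one specific field:

```julia
struct Point
    x::Float64
    y::Int
end

# Hypothetical helper: the field name is a type parameter, so it is a
# compile-time constant inside each specialized method.
getfield_val(obj, ::Val{name}) where {name} = getfield(obj, name)

function sum_fields(p::Point)
    # A tuple of Val instances: every element has its own concrete type.
    names = (Val(:x), Val(:y))
    vals = map(v -> getfield_val(p, v), names)
    return sum(vals)
end

result = sum_fields(Point(1.5, 2))
```

The Val instances have to be built from literal symbols (or other compile-time constants); calling Val(sym) on a symbol that is only known at runtime just moves the dynamic dispatch somewhere else.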

The issue with naively relying on const-prop is that you are at the mercy of unstable compiler heuristics. It forces all readers of your code to understand how const-prop and inlining work in every Julia version you support, and to work out in their heads whether that applies to your construction. This is terrible for maintainability. Better to go for metaprogramming unless it is exceedingly obvious that the field names are known at compile time.

For that, you must benchmark differently: Don’t benchmark “the small function you want to measure”. For tiny functions, it’s all about interaction with context (surrounding code), so you must write a realistic outer loop function that calls your tiny function, and benchmark that. Benchmarking / performance is not composable for small timings.
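Something like the following is what I mean by measuring the outer loop rather than the tiny call. This is only a sketch: Stats, total and items are invented names, and I use Base's @allocated to avoid a package dependency (with BenchmarkTools you would write @btime total($items) instead). I make no claim about the numbers you will see; they depend on the Julia version.

```julia
struct Stats
    A::Int
    B::Int
end

# Hypothetical outer function: the field names are literals inside the body,
# so the compiler sees them in the same context your real code would provide.
function total(items)
    s = 0
    for it in items
        for name in (:A, :B)
            s += getfield(it, name)
        end
    end
    return s
end

items = [Stats(i, 2i) for i in 1:1000]
total(items)                      # warm-up call so compilation is not measured
bytes = @allocated total(items)   # allocations of the whole loop, not of one getfield
```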

For this case the field names are known ahead of time. I know exactly which elements will be extracted.

Ie, this works fine:

a = test.A
b = test.B

But doing this for, eg, 10 fields becomes repetitive.

So why not implement tuple unpacking such that you can write a, b = test? For that you need to extend Base.indexed_iterate for your type (or just Base.iterate, which the generic indexed_iterate fallback calls), either by hand or by metaprogramming a la @with_kw.
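A hand-written sketch of that idea, defining only Base.iterate and relying on the generic destructuring fallback (the struct is the MWE's Test, written without Parameters.jl so the snippet stands alone):

```julia
struct Test
    A::String
    B::Int64
end

# Iterate over the fields in declaration order; the state is the field index.
Base.iterate(t::Test, i::Int = 1) =
    i > fieldcount(Test) ? nothing : (getfield(t, i), i + 1)

test = Test("A", 0)
a, b = test   # destructuring goes through iterate
```

Note that this unpacks by position, so the order of the variables has to match the order of the fields.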

Couldn’t you just do

(;a, b) = my_object

without implementing any additional methods?
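For reference, a small sketch of that syntax applied to the MWE's struct. It assumes Julia 1.7 or newer, and the variable names must match the property names exactly, so with fields A and B it has to be (; A, B):

```julia
struct Test
    A::String
    B::Int64
end

test = Test("A", 0)

# Property destructuring: equivalent to A = test.A; B = test.B
(; A, B) = test
```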

If I understand correctly, those suggestions help with the MWE but the real code would be just as verbose as if I manually unroll it.

E.g., I need to add something (from another struct that I am looping over in the same way) to a and store it, then do the same for b, etc. My purpose is to avoid repetitive code without sacrificing performance.

Generally, for looping over fields/properties, one would extract them as a namedtuple – then, stuff like map works and works performantly.
For extraction, use ConstructionBase.jl: it has getfields(x)::NamedTuple and getproperties(x)::NamedTuple.
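To illustrate the pattern without requiring the package, here is a Base-only sketch. fields_nt is a made-up stand-in for ConstructionBase.getfields, and Stats is a toy struct, not the real Roster:

```julia
struct Stats
    Gam::Int
    Gls::Int
    Ass::Int
end

# Hypothetical stand-in for ConstructionBase.getfields: struct -> NamedTuple.
fields_nt(x::T) where {T} =
    NamedTuple{fieldnames(T)}(ntuple(i -> getfield(x, i), fieldcount(T)))

a = Stats(10, 3, 2)
b = Stats(12, 1, 2)

# map over two NamedTuples with the same names works field by field.
sq = map((x, y) -> abs2(x - y), fields_nt(a), fields_nt(b))
total = sum(sq)
```

Because the field names live in the NamedTuple's type, map can be unrolled by the compiler without any symbol lookups at runtime.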

It may also help if you show some specific examples of what you are trying to achieve.

Sure, here is one of the structs in question:

@with_kw struct Roster{T1 <: SVector{MAXPLAYERS, String15}, 
                       T2 <: SVector{MAXPLAYERS, Int16}}
    # Player Info
    Name :: T1 = @SVector fill(String15(""), MAXPLAYERS)
    Age  :: T1 = @SVector fill(String15(""), MAXPLAYERS)
    Nat  :: T1 = @SVector fill(String15(""), MAXPLAYERS)
    Prs  :: T1 = @SVector fill(String15(""), MAXPLAYERS)

    # Ratings
    St   :: T2 = @SVector fill(Int16(0), MAXPLAYERS)
    Tk   :: T2 = @SVector fill(Int16(0), MAXPLAYERS)
    Ps   :: T2 = @SVector fill(Int16(0), MAXPLAYERS)
    Sh   :: T2 = @SVector fill(Int16(0), MAXPLAYERS)
    Sm   :: T2 = @SVector fill(Int16(0), MAXPLAYERS)
    Ag   :: T2 = @SVector fill(Int16(0), MAXPLAYERS)

    # Abilities
    KAb  :: T2 = @SVector fill(Int16(0), MAXPLAYERS)
    TAb  :: T2 = @SVector fill(Int16(0), MAXPLAYERS)
    PAb  :: T2 = @SVector fill(Int16(0), MAXPLAYERS)
    SAb  :: T2 = @SVector fill(Int16(0), MAXPLAYERS)

    # Stats
    Gam  :: T2 = @SVector fill(Int16(0), MAXPLAYERS)
    Sav  :: T2 = @SVector fill(Int16(0), MAXPLAYERS)
    Ktk  :: T2 = @SVector fill(Int16(0), MAXPLAYERS)
    Kps  :: T2 = @SVector fill(Int16(0), MAXPLAYERS)
    Sht  :: T2 = @SVector fill(Int16(0), MAXPLAYERS)
    Gls  :: T2 = @SVector fill(Int16(0), MAXPLAYERS)
    Ass  :: T2 = @SVector fill(Int16(0), MAXPLAYERS)
    DP   :: T2 = @SVector fill(Int16(0), MAXPLAYERS)
    Inj  :: T2 = @SVector fill(Int16(0), MAXPLAYERS)
    Sus  :: T2 = @SVector fill(Int16(0), MAXPLAYERS)
    Fit  :: T2 = @SVector fill(Int16(0), MAXPLAYERS)
end

And one of the functions (using the solution recommended above):

function calc_metric(baseline, sims)
    nteams    = length(baseline.lg)
    nreps     = length(sims)
    pl_fields = (:Gam, :Sav, :Ktk, :Kps, :Sht, :Gls, :Ass, :DP)
    tm_fields = (:Pl, :W, :D, :L, :GF, :GA, :GD, :Pts)

    sumSq = 0
    for sim in sims
        # Player Stats
        for i in eachindex(pl_fields)
            for j in eachindex(baseline.lg)
                x = getfield_unrolled(baseline.lg[j].roster, pl_fields[i])
                y = getfield_unrolled(sim.lg[j].roster,      pl_fields[i])
                sumSq += sum((Int64.(x - y)).^2)
            end
        end

        # Team Stats
        for i in eachindex(tm_fields)
            for j in eachindex(baseline.lg_table)
                x = getfield_unrolled(baseline.lg_table[j], tm_fields[i])
                y = getfield_unrolled(sim.lg_table[j],      tm_fields[i])
                sumSq += sum((Int64.(x - y))^2)
            end
        end

    end

    # Refers to RMSE per team (not total number of variables)
    RMSE = sqrt(sumSq/(nteams*nreps))

    return RMSE
end

Perhaps old-fashioned broadcasting will get similar results:
(couldn’t actually test this, there may be tweaking needed because values are Arrays)

using LinearAlgebra  # to use `dot`

# Player stats
# broadcasting over pl_fields, so no second `for`
for j in eachindex(baseline.lg)
    x = Int64.(getfield.(Ref(baseline.lg[j].roster), pl_fields))
    y = Int64.(getfield.(Ref(sim.lg[j].roster), pl_fields))
    sumSq += sum(t -> dot(t,t), x .- y)
end

(the other loop can be modified in a similar way)

I feel something like

pl_type = NamedTuple{(:Gam, :Sav, :Ktk, :Kps, :Sht, :Gls, :Ass, :DP)}
...
p1 = getproperties(baseline.lg[j].roster)
p2 = getproperties(sim.lg[j].roster)
sumSq += map(pl_type(p1), pl_type(p2)) do x, y
    sum(abs2, Int64.(x - y))  # each field is a vector, so reduce it per field
end |> sum
...

should be quite performant, and more readable.
Would be easier to check with an MWE of course :)
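The NamedTuple{names}(nt) step is what picks out the subset of fields. A tiny Base-only sketch with invented values, just to show the selection:

```julia
# Toy NamedTuple standing in for the result of getproperties(roster).
nt = (Gam = 10, Sav = 0, Gls = 3, Fit = 90)

# A NamedTuple type with only the names we care about...
Selected = NamedTuple{(:Gam, :Gls)}

# ...used as a constructor, selects those fields from a larger NamedTuple.
sub = Selected(nt)
```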

Also, you may want to consider using a StructArray instead of your Roster struct. Or, if you do want to dispatch on ::Roster, to have this struct only with a data::StructArray field.
