This question is about how to improve iterator performance when splatting into kwargs, as in (; x...).
I have information stored in a structure something like this:
struct MyStruct{T1A, T1B, T2A, T2B}
    a::NamedTuple{T1A, T1B}
    b::NamedTuple{T2A, T2B}
end
There can be some overlap in the keys of a and b, so for this splat I’d like to prioritize one (but I don’t intend to throw away information or store a merged NamedTuple). I created some iterator functions, something like this:
Base.getindex(obj::MyStruct{T1A,T1B,T2A,T2B}, k) where {T1A,T1B,T2A,T2B} = getfield(k ∈ T2A ? obj.b : obj.a, k)
Base.keys(obj::MyStruct) = keys(merge(obj.a, obj.b))
Base.values(obj::MyStruct) = (obj[k] for k ∈ keys(obj))
Base.iterate(obj::MyStruct, itr=zip(keys(obj), values(obj))) = Iterators.peel(itr)
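As a quick sanity check of the priority rule, here is a minimal sketch using the definitions above. The example values, and the choice of b as the shared key, are made up for illustration, and it assumes a recent Julia where Iterators.peel returns nothing on an empty iterator:

```julia
struct MyStruct{T1A, T1B, T2A, T2B}
    a::NamedTuple{T1A, T1B}
    b::NamedTuple{T2A, T2B}
end

# Look a key up in the second NamedTuple first, falling back to the first one.
Base.getindex(obj::MyStruct{T1A,T1B,T2A,T2B}, k) where {T1A,T1B,T2A,T2B} =
    getfield(k ∈ T2A ? obj.b : obj.a, k)
Base.keys(obj::MyStruct) = keys(merge(obj.a, obj.b))
Base.values(obj::MyStruct) = (obj[k] for k ∈ keys(obj))
Base.iterate(obj::MyStruct, itr=zip(keys(obj), values(obj))) = Iterators.peel(itr)

# Hypothetical example data: the key `b` appears in both fields.
obj = MyStruct((a=1, b=2), (b=3, c=4))

keys(obj)     # (:a, :b, :c)
obj[:b]       # 3, taken from the second NamedTuple
(obj...,)     # ((:a, 1), (:b, 3), (:c, 4))
(; obj...)    # (a = 1, b = 3, c = 4)
```

Note that the args splat yields (key, value) tuples here, whereas splatting a plain NamedTuple into args yields values only.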
Then I test it for speed:
using BenchmarkTools
@btime let; [(MyStruct((a=rand(),b=rand()),(c=rand(),d=rand()))...,) for i=1:1000] end;
# args splat 11.500 μs (2 allocations: 62.55 KiB)
@btime let; [(; MyStruct((a=rand(),b=rand()),(c=rand(),d=rand()))...) for i=1:1000] end;
# kwargs splat 850.900 μs (22004 allocations: 1.07 MiB)
It’s currently (much!) faster splatting into args (x...,) than into kwargs (; x...), but I want it to be fast splatting into kwargs instead (I don’t care about args splatting because it causes type instability in the successive steps).
It seems as if NamedTuples have two different variants of iterators, one that’s called when splatting into args (lines 130 and 139 of namedtuple.jl, which yield values without keys), and one called when splatting into kwargs (???), for example:
test = (a=1, b=4, c=5, d=6)
@show (test...,) # (1, 4, 5, 6)
@show (; test...) # (a = 1, b = 4, c = 5, d = 6)
and their performance is comparable:
@btime let; [(merge((a=rand(),b=rand()),(c=rand(),d=rand()))...,) for i=1:1000] end;
# args splat 6.680 μs (2 allocations: 31.30 KiB)
@btime let; [(; merge((a=rand(),b=rand()),(c=rand(),d=rand()))...) for i=1:1000] end;
# kwargs splat 6.720 μs (2 allocations: 31.30 KiB)
but afaik I can only define one variant of Base.iterate, and its performance when splatting into kwargs is horrible.
Am I missing something?
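One thing that might be worth checking: as far as I can tell, (; x...) does not go through a separate iterator at all but lowers to merge(NamedTuple(), x), which can be inspected with Meta.@lower (; x...). For a non-NamedTuple x this would land on the generic merge(::NamedTuple, itr) fallback, which collects keys and values at runtime. If that reading is right, a dedicated merge method might sidestep Base.iterate for the kwargs case. The sketch below is only an assumption-laden experiment, not a confirmed or documented API, and the example values are made up:

```julia
struct MyStruct{T1A, T1B, T2A, T2B}
    a::NamedTuple{T1A, T1B}
    b::NamedTuple{T2A, T2B}
end

# Assumption: `(; obj...)` lowers to `merge(NamedTuple(), obj)`.
# Forwarding to the NamedTuple-only `merge` keeps the key names in the type
# domain; later arguments win, so `obj.b` takes priority on overlapping keys.
Base.merge(nt::NamedTuple, obj::MyStruct) = merge(nt, obj.a, obj.b)

# Hypothetical example data: the key `b` appears in both fields.
obj = MyStruct((a=1, b=2), (b=3, c=4))

(; obj...)         # (a = 1, b = 3, c = 4)
(; z=0, obj...)    # (z = 0, a = 1, b = 3, c = 4)
```

This does not store a merged NamedTuple in the struct; the merge only happens at the splat site. Whether it actually closes the gap in the benchmark above would need to be measured.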