Filtering keys out of named tuples

ChrisRackauckas · December 23, 2021, 11:57pm

What’s the right way to filter a key out of a named tuple? Example:

nt = (;a = 1, b = 2)
filter(x->x[1] == :a,nt) # Errors
filter(x->x[1] == :a,Base.Pairs(nt,keys(nt))) # Allocates an entire dictionary

What’s the right way to do this and get another NamedTuple back out instead of going through a dictionary and losing all inference/performance? Does this have an issue in Base already?

Henrique_Becker · December 24, 2021, 12:30am

I do believe the following is faster, can you benchmark it for your use case and give some feedback?

julia> filter_nt_fields(f, nt) = NamedTuple{filter(f, keys(nt))}(nt)
filter_nt_fields (generic function with 1 method)

julia> nt = (;a = 1, b = 2, c = 3)
(a = 1, b = 2, c = 3)

julia> filter_nt_fields(!=(:b), nt)
(a = 1, c = 3)

Note, however, that this is an inherently type-unstable operation.

ChrisRackauckas · December 24, 2021, 12:37am

Not with constant prop. You should be able with constant prop to know takeout(nt,(:b,)) and know that it will be (:a,:c) left. We just don’t have a method for it . So yeah, maybe filter isn’t the right solution but that means there’s a missing verb.

Henrique_Becker · December 24, 2021, 1:30am

The keys method for NamedTuple return a Tuple and the filter method for Tuple acts directly over the Tuple (do not collect in a Vector) and return a Tuple, so theoretically, there the potential for constant propagation all the way through, did you try to see if it happens in your use case?

mcabbott · December 24, 2021, 2:48am

You can do tail((; b=nothing, nt...,)), which seems fast:

julia> nt = (a=1, b=2, c=3);

julia> @inline takeout(kill::Symbol, nt::NamedTuple) = Base.tail(merge(NamedTuple{(kill,)}((nothing,)), nt));

julia> @btime takeout(:b, $nt)
  min 0.875 ns, mean 0.946 ns (0 allocations)
(a = 1, c = 3)

(This will break if nt doesn’t have :b, of course.)

aplavin · December 24, 2021, 6:18am

Does this help, or you need more general filtering with an arbitrary predicate?

julia> nt = (;a = 1, b = 2)
(a = 1, b = 2)

julia> nt[(:a,)]
(a = 1,)

julia> Base.structdiff(nt, NamedTuple{(:a,)})
(b = 2,)

julia> using Accessors
julia> @delete nt.a
(b = 2,)

julia> using InvertedIndices
julia> nt[Not(:a)]  # if/when https://github.com/JuliaData/InvertedIndices.jl/pull/29 lands
(b = 2,)

rocco_sprmnt21 · December 25, 2021, 9:58pm

julia> @btime nt_b=NamedTuple{filter(!=(:b), keys($nt))}($nt)
  0.001 ns (0 allocations: 0 bytes)
(a = 1, c = 3)

julia> @btime $nt[filter(!=(:b), keys($nt))]
  0.001 ns (0 allocations: 0 bytes)
(a = 1, c = 3)

JeffreySarnoff · December 25, 2021, 10:54pm

Much of the benchmarking above presumes you know ahead of time the one symbol that want to remove, and you always want to remove that same symbol. The timings and allocations increase when both the named tuple and the symbol are not fixed. You may want to remove more than one key. Base uses Base.structdiff for this kind of manipulation. One approach:

drop(nt::NamedTuple, key::Symbol) = 
    Base.structdiff(nt, NamedTuple{(key,)})
drop(nt:: NamedTuple, keys::NTuple{N,Symbol}) where {N} =
    Base.structdiff(nt, NamedTuple{keys})

When you do know which symbol[s] are to be dropped from the keys, it is much more performant to work with function[s] that use that information immediately. When the same field[s] is/are removed in different parts of the source code, you may prefer to specialize the names.

drop_baz(nt) = Base.structdiff(nt, NamedTuple{(:baz,)})
drop_b_c(nt) = Base.structdiff(nt, NamedTuple{(:b, :c)})

Macroizing that process makes sense.

aplavin · December 26, 2021, 6:51am

Performance of these variants is exactly the same under constant propagation, so there is no need to define functions separately for each key name:

julia> nt = (a=nothing, b=2, c='a')

julia> @btime Base.structdiff($nt, NamedTuple{(:a,)})
  1.589 ns (0 allocations: 0 bytes)
(b = 2, c = 'a')

# your drop() function, with @inline:
julia> @inline drop(nt::NamedTuple, key::Symbol) = 
           Base.structdiff(nt, NamedTuple{(key,)})
julia> @btime drop($nt, :a)
  1.587 ns (0 allocations: 0 bytes)
(b = 2, c = 'a')

# implementation in an existing package:
julia> using Accessors
julia> @btime @delete $nt.a
  1.586 ns (0 allocations: 0 bytes)
(b = 2, c = 'a')

JeffreySarnoff · December 26, 2021, 7:29pm

Yes. I was aware of the timing information; imo it is better practice to wrap uses of nonexported Base functions inside a user function, which may be inlined.

The multiple naming was to make clear the different specific intents, not to imply that need be done. I have modified the comment to clarify.

Topic		Replies	Views
`filter` a `NamedTuple` based on one of the fields New to Julia question	8	2778	March 21, 2019
Remove a field from a `NamedTuple` General Usage	10	4033	August 14, 2024
How to "delete" a key from NamedTuple New to Julia	3	1031	April 25, 2020
Search nested namedtuples by key General Usage namedtuple , functors , fmap	5	696	July 18, 2022
Map and NamedTuple Internals & Design question	2	963	October 30, 2018

Filtering keys out of named tuples

Related topics