Dict: getindex and setindex in one function call

There is a common pattern when working with a Dict: look up a value and modify it in some way. DataStructures.jl’s Accumulator uses the value as an accumulator to keep a sum, StatsBase.countmap uses it to do a frequency count, and there are implementations in FreqTables.jl and SplitApplyCombine.jl.

The slow way to do this is:

# let dict be a Dict{T,S}
szero = zero(S)
dict[key] = some_function(get(dict, key, szero))

Here the lookup happens twice: once for the assignment dict[key] = ... and once for get(dict, key, szero). This is the method used by Accumulator in DataStructures.jl. groupreduce in SplitApplyCombine takes a different approach: it uses Base.ht_keyindex2 to avoid the second lookup, and the speed difference is huge in favor of the latter.
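To make the two-lookup pattern concrete, here is a minimal runnable sketch of a frequency count written that way. The function name count_two_lookups is made up for illustration; it is not taken from any of the packages mentioned.

```julia
# Frequency count using the two-lookup pattern.
# `count_two_lookups` is a hypothetical name used only for this illustration.
function count_two_lookups(items)
    counts = Dict{eltype(items),Int}()
    for x in items
        # `get` hashes and probes the table once;
        # the assignment (`setindex!`) hashes and probes it a second time.
        counts[x] = get(counts, x, 0) + 1
    end
    return counts
end

counts = count_two_lookups(["a", "b", "a", "c", "a"])
```

Every element costs two hash computations and two probes, which is the overhead a combined get-and-modify operation would remove.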

Given how common this pattern is, I think it’s worthwhile to have a function, called something like getindexmodify, that looks up a key and applies a function to its value to update it, all in one lookup.
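As a rough sketch of what such a function might look like: the first definition below only pins down the intended semantics using the public API (it still does two lookups), and the name and signature getindexmodify! are assumptions, not an existing Base function. The second, modify_boxed! (also a made-up name), shows a workaround that is possible today with public API only: store the values in Ref boxes so that one get! call returns something that can be mutated in place. As far as I know get! on a Dict needs only one hash-table probe in the usual case, but treat that as an assumption to benchmark. AbstractDict is the current name for what this thread calls Associative.

```julia
# Hypothetical name and signature; NOT an existing Base function.
# Reference semantics only: this version still performs two lookups.
function getindexmodify!(f, dict::AbstractDict, key, default)
    newval = f(get(dict, key, default))  # lookup 1
    dict[key] = newval                   # lookup 2
    return newval
end

# Workaround using only public API: keep mutable Ref boxes as values.
# `modify_boxed!` is also a hypothetical name.
function modify_boxed!(f, dict::AbstractDict{K,Base.RefValue{V}}, key, default) where {K,V}
    # A single get! call finds the box, or inserts a new one holding `default`.
    box = get!(() -> Ref{V}(default), dict, key)
    box[] = f(box[])  # update in place, no further dictionary lookup
    return box[]
end

d = Dict{String,Int}()
getindexmodify!(x -> x + 1, d, "a", 0)
getindexmodify!(x -> x + 1, d, "a", 0)

b = Dict{String,Base.RefValue{Int}}()
modify_boxed!(x -> x + 10, b, "k", 0)
modify_boxed!(x -> x + 10, b, "k", 0)
```

The boxed version trades the second lookup for one extra allocation per key and an extra indirection on every read, so whether it wins depends on the workload.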

See https://github.com/JuliaLang/julia/issues/15630 and specifically the suggestion in https://github.com/JuliaLang/julia/issues/15630#issuecomment-201855260

There’s also the idea of tokens. I’m kind of wondering if Associative should have an interface that supports:

  • Getting a token from a key
  • Getting the key from a token (probably fast?)
  • Getting and setting the value for a token (fast)

And if you think about it, linear indexing of arrays is quite a bit like a token system…
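A toy sketch of the three operations listed above, where the token is literally a linear index into a vector of values. Everything here is hypothetical: SlotDict, keytoken!, tokenkey, gettoken and settoken! are illustrative names, not an existing API in Base or DataStructures.jl.

```julia
# Toy associative container; all names are hypothetical.
struct SlotDict{K,V}
    slots::Dict{K,Int}  # key => position in the two vectors below
    keys::Vector{K}
    vals::Vector{V}
end
SlotDict{K,V}() where {K,V} = SlotDict{K,V}(Dict{K,Int}(), K[], V[])

# Token from a key: the only step that hashes. Inserts `default` for a new key.
function keytoken!(d::SlotDict, key, default)
    return get!(d.slots, key) do
        push!(d.keys, key)
        push!(d.vals, default)
        length(d.vals)  # the new slot index is the token
    end
end

# Key from a token: plain array indexing.
tokenkey(d::SlotDict, t::Int) = d.keys[t]

# Get and set the value for a token: plain array indexing, no hashing.
gettoken(d::SlotDict, t::Int) = d.vals[t]
settoken!(d::SlotDict, t::Int, v) = (d.vals[t] = v; d)

d = SlotDict{String,Int}()
for w in ["a", "b", "a"]
    t = keytoken!(d, w, 0)               # one hash lookup per element
    settoken!(d, t, gettoken(d, t) + 1)  # read and write through the token
end
```

This sketch ignores deletion, which is where a real design gets harder: removing a key must either leave a hole or invalidate outstanding tokens.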

Note that SortedDicts and friends in DataStructures.jl use this notion of tokens.

Unfortunately, they do so in a way that conflicts with the standard API for Associative, so it would be nice to have an official API for this so that the two could finally be brought into agreement. Something like keytoken, and then gettoken and settoken! for indexing, might work. Worth opening an issue!

Issue opened!
https://github.com/JuliaLang/julia/issues/24454

Cheers, Kevin
