Non-inplace `map` for Dictionaries

simeonschaub · April 1, 2019, 10:37am

I am currently implementing a custom Dict type, which allows for different key and value types. This means that iterate is not type stable, but many operations like getindex, merge or map can be implemented to be type stable. I was looking for a way to implement map only to find that it isn’t implemented for dicts anymore. I stumbled upon the discussion on this PR where it was decided against adding a non-inplace map for dicts. @stevengj brought up the following argument:

I think the current semantics of map(f, values(dict)) producing a new array of values is fine and is what I would expect from mapping the values iterator, consistent with other iterators. We already have a non-mutating dictionary map via Dict(k => f(v) for (k,v) in d) , and you could also do map!(f, values(copy(dict))) to avoid re-hashing the keys, so I’m not sure there is a need for a new function.

In contrast, map!(f, values(dict)) , which is currently an error, has only one possible reasonable meaning to me: modify the values of dict in-place.

The main value of this function is as a performance optimization in the in-place case, anyway; in a typical context where you are willing to make a copy of the dictionary, I’m guessing that the performance cost of re-hashing the keys is not such a big deal.

Both alternatives, Dict(k => f(v) for (k,v) in d) and map!(f, values(copy(dict))) rely on iterate and therefore don’t make sense for certain AbstractDicts like mine. I would also argue that something along the lines of map(f, dict) is much more concise than both alternatives and I think many users would expect that to work since NamedTuples already work that way. It would also make it easier to write generic code for different types of collections.
I can’t really see the benefit in not having such a function. One might still want to discuss, whether the function should have a different name and whether f should take in a Pair of key and value or just the value. Those are just my thoughts and use case though, feel free to discuss.
Thank you for making Julia so awesome!

foobar_lv2 · April 1, 2019, 12:55pm

I don’t understand what your problem with map!(f, values(copy(dict))) is.

It is assumed that AbstractDict supports iteration. If your data type cannot support iteration, then you should consider choosing a different abstract supertype (most likely candidate: none at all). Of course it is sometimes necessary to be pragmatic, and break assumptions.

Can you give a more concrete description of your datastructure and use-case?

simeonschaub · April 1, 2019, 2:20pm

My dict basically consists of a tuple of different AbstractDicts, all with different key types that can have different value types. At its heart, my implementation looks like this:

struct MixedKeyDict{T<:Tuple} <: AbstractDict{Any,Any}
    dicts::T
end

Base.length(d::MixedKeyDict) = sum(length, d.dicts)

function Base.iterate(d::MixedKeyDict, state=(1,))
    index = first(state)
    res = iterate(d.dicts[index], Base.tail(state)...)
    if res == nothing
        if index < length(d.dicts)
            return iterate(d, (index+1,))
        else
            return nothing
        end
    else 
        return first(res), (index, Base.tail(res)...)
    end
end

Base.getindex(d::MixedKeyDict, key) = _getindex(d.dicts, key)

_getindex((d,)::Tuple{D,Vararg}, key::K) where {K,D<:AbstractDict{K}} = d[key]
_getindex(dicts, key) = _getindex(Base.tail(dicts), key)
_getindex(::Tuple{}, key) = throw(KeyError(key))

As you can see, it’s not possible to make iterate type stable, but map, for example could simply be written like this:

Base.map(f, d::MixedKeyDict) = MixedKeyDict(map.(f, d.dicts))

ndinsmore · April 1, 2019, 3:55pm

The general idea is that if someone implements an abstractdict then have two options, make sure the keys a values iterators work then the niave implimentation of map! in abstract dict should work.

If they want to have a more performant version they can implement their own implementation of map!. In your case I think that would look like:

function Base.map!(f,iter::Base.ValueIterator{<:MixedKeyDict})
    for d in iter.dict.dicts
          map!(f,values(d))
    end
    return iter
end

(I didn’t check that code)

That should be just as fast as map! for a normal Dict

Topic		Replies	Views
Performance issues when working with dict Performance dictionary	11	1691	November 16, 2022
Custom dictionary type: fieldless keytype, fieldless valtype General Usage dictionary , dictionaries	3	73	April 25, 2025
Small <: AbstractDict implementation General Usage question	9	1120	February 13, 2019
RFC: Reference Iterator for better in place loop modifications Internals & Design	4	490	March 25, 2019
Is order well-defined for multiple iterations over `Dict`? General Usage question	5	917	February 21, 2017

Non-inplace `map` for Dictionaries

Related topics