Non-inplace `map` for Dictionaries

I am currently implementing a custom Dict type, which allows for different key and value types. This means that iterate is not type stable, but many operations like getindex, merge or map can be implemented to be type stable. I was looking for a way to implement map only to find that it isn’t implemented for dicts anymore. I stumbled upon the discussion on this PR where it was decided against adding a non-inplace map for dicts. @stevengj brought up the following argument:

I think the current semantics of map(f, values(dict)) producing a new array of values is fine and is what I would expect from mapping the values iterator, consistent with other iterators. We already have a non-mutating dictionary map via Dict(k => f(v) for (k,v) in d) , and you could also do map!(f, values(copy(dict))) to avoid re-hashing the keys, so I’m not sure there is a need for a new function.

In contrast, map!(f, values(dict)) , which is currently an error, has only one possible reasonable meaning to me: modify the values of dict in-place.

The main value of this function is as a performance optimization in the in-place case, anyway; in a typical context where you are willing to make a copy of the dictionary, I’m guessing that the performance cost of re-hashing the keys is not such a big deal.

Both alternatives, Dict(k => f(v) for (k,v) in d) and map!(f, values(copy(dict))) rely on iterate and therefore don’t make sense for certain AbstractDicts like mine. I would also argue that something along the lines of map(f, dict) is much more concise than both alternatives and I think many users would expect that to work since NamedTuples already work that way. It would also make it easier to write generic code for different types of collections.
I can’t really see the benefit in not having such a function. One might still want to discuss, whether the function should have a different name and whether f should take in a Pair of key and value or just the value. Those are just my thoughts and use case though, feel free to discuss.
Thank you for making Julia so awesome!

1 Like

I don’t understand what your problem with map!(f, values(copy(dict))) is.

It is assumed that AbstractDict supports iteration. If your data type cannot support iteration, then you should consider choosing a different abstract supertype (most likely candidate: none at all). Of course it is sometimes necessary to be pragmatic, and break assumptions.

Can you give a more concrete description of your datastructure and use-case?

1 Like

My dict basically consists of a tuple of different AbstractDicts, all with different key types that can have different value types. At its heart, my implementation looks like this:

struct MixedKeyDict{T<:Tuple} <: AbstractDict{Any,Any}
    dicts::T
end

Base.length(d::MixedKeyDict) = sum(length, d.dicts)

function Base.iterate(d::MixedKeyDict, state=(1,))
    index = first(state)
    res = iterate(d.dicts[index], Base.tail(state)...)
    if res == nothing
        if index < length(d.dicts)
            return iterate(d, (index+1,))
        else
            return nothing
        end
    else 
        return first(res), (index, Base.tail(res)...)
    end
end

Base.getindex(d::MixedKeyDict, key) = _getindex(d.dicts, key)

_getindex((d,)::Tuple{D,Vararg}, key::K) where {K,D<:AbstractDict{K}} = d[key]
_getindex(dicts, key) = _getindex(Base.tail(dicts), key)
_getindex(::Tuple{}, key) = throw(KeyError(key))

As you can see, it’s not possible to make iterate type stable, but map, for example could simply be written like this:

Base.map(f, d::MixedKeyDict) = MixedKeyDict(map.(f, d.dicts))

The general idea is that if someone implements an abstractdict then have two options, make sure the keys a values iterators work then the niave implimentation of map! in abstract dict should work.

If they want to have a more performant version they can implement their own implementation of map!. In your case I think that would look like:

function Base.map!(f,iter::Base.ValueIterator{<:MixedKeyDict})
    for d in iter.dict.dicts
          map!(f,values(d))
    end
    return iter
end

(I didn’t check that code)

That should be just as fast as map! for a normal Dict