Merging dictionaries ensuring they are disjoint

I am merging some dictionaries that come from a computation and I want to ensure that they are disjoint.

A quick way of doing it is mergewith, e.g.

D1 = Dict(:a => 1, :b => 2)
D2 = Dict(:a => 4, :c => 3)
mergewith((v...) -> error("multiple values $v for some key"), D1, D2)

but that has the disadvantage that I cannot report the offending key to the user.

I am just curious if there is a way to do this with built-in functions that I missed.

(Of course, coding this by hand is trivial:)
function merge_disjoint(dict1::AbstractDict{K1,V1},
                        dict2::AbstractDict{K2,V2}) where {K1,V1,K2,V2}
    K = promote_type(K1, K2)
    V = promote_type(V1, V2)
    result = Dict{K,V}(dict1)
    for (k, v) in pairs(dict2)
        if haskey(result, k)
            throw(ArgumentError("key $k present in multiple dictionaries, cannot merge"))
        else
            result[k] = v
        end
    end
    result
end
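
For more than two dictionaries, the same loop can be generalized. The following is only a sketch: the name merge_disjoint_all and the wording of the error message are made up for illustration, and it assumes that reporting the first offending key is enough.

```julia
# Hypothetical varargs variant of merge_disjoint: merges any number of
# dictionaries and throws as soon as a key is seen a second time.
function merge_disjoint_all(first_dict::AbstractDict, others::AbstractDict...)
    # Promote key and value types across all inputs.
    K = mapreduce(keytype, promote_type, others; init = keytype(first_dict))
    V = mapreduce(valtype, promote_type, others; init = valtype(first_dict))
    result = Dict{K,V}(first_dict)
    for (i, d) in enumerate(others), (k, v) in pairs(d)
        # `i + 1` is the position of `d` among all arguments.
        haskey(result, k) &&
            throw(ArgumentError("key $(repr(k)) in dictionary $(i + 1) is already present, cannot merge"))
        result[k] = v
    end
    return result
end

D1 = Dict(:a => 1, :b => 2)
D2 = Dict(:a => 4, :c => 3)
D3 = Dict(:d => 5)

merged = merge_disjoint_all(D1, D3)   # disjoint inputs, so this succeeds

# D1 and D2 share :a, so this throws; capture the exception to inspect it.
err = try
    merge_disjoint_all(D1, D2, D3)
catch e
    e
end
```

Here err is an ArgumentError whose message names the key :a.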
mergewith((v...) -> error("duplicated keys $(intersect(keys(D1), keys(D2)))"), D1, D2)
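
The same idea can also be written as an up-front check, so that the error is raised before any merging happens. This is a sketch under the assumption that walking the keys twice (once for the intersection, once for the merge) is acceptable; merge_checked is a made-up name.

```julia
# Hypothetical helper: compute the shared keys first, then merge only if
# there are none. All shared keys are reported, not just the first one.
function merge_checked(d1::AbstractDict, d2::AbstractDict)
    common = intersect(keys(d1), keys(d2))
    isempty(common) ||
        throw(ArgumentError("duplicated keys $(collect(common)), cannot merge"))
    return merge(d1, d2)
end

D1 = Dict(:a => 1, :b => 2)
D2 = Dict(:a => 4, :c => 3)

ok = merge_checked(D1, Dict(:c => 3))   # no shared keys, so this succeeds

# D1 and D2 share :a, so this throws; capture the exception to inspect it.
err = try
    merge_checked(D1, D2)
catch e
    e
end
```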

Another route:

using StatsBase

D1 = Dict(:a => 1, :b => 2)
D2 = Dict(:a => 4, :c => 3)

keys(filter(>(1) ∘ last, addcounts!(countmap(keys(D1)), keys(D2))))

giving:

KeySet for a Dict{Symbol, Int64} with 1 entry. Keys:
  :a

(Dropping the outer keys(...) call returns the counts as well, i.e. how many times each repeated key occurs.)

If there are more dictionaries, a foldl over their key sets with addcounts!, with the init kwarg set to an initial countmap, should be possible.
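
To illustrate the counting idea for several dictionaries without depending on StatsBase, here is a Base-only sketch. A plain Dict{K,Int} stands in for countmap/addcounts!, so this is a substitute for the foldl version rather than the same code, and key_counts is a made-up name.

```julia
# Hypothetical Base-only counter: maps every key to the number of
# dictionaries it appears in.
function key_counts(d1::AbstractDict, rest::AbstractDict...)
    K = mapreduce(keytype, promote_type, rest; init = keytype(d1))
    counts = Dict{K,Int}()
    for d in (d1, rest...), k in keys(d)
        counts[k] = get(counts, k, 0) + 1
    end
    return counts
end

D1 = Dict(:a => 1, :b => 2)
D2 = Dict(:a => 4, :c => 3)
D3 = Dict(:c => 5, :d => 6)

counts = key_counts(D1, D2, D3)
# Keys that occur in more than one dictionary, sorted for a stable order.
repeated = sort!([k for (k, n) in counts if n > 1])
```

Like the countmap version, this visits every key of every dictionary before anything is reported.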

Note that this method doesn’t short-circuit, so if that is necessary, one of the other methods is a better fit.

UPDATE: Another method which does short-circuit:

let k::keytype(D1), s = Set{keytype(D1)}(), i=0
    for outer k in Iterators.flatten([keys(D1), keys(D2)])
        i += 1
        k in s && break
        push!(s, k)
    end
    i == length(s) ? nothing : k
end

This method tries to be resource efficient, and can work for any number of dictionaries by changing the argument of flatten appropriately. The value of the let block is nothing if there is no repetition, or the first repeated key if there is one.
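
The same short-circuiting search can be wrapped in a function, which avoids the let/outer bookkeeping because an early return does the job of the break. This is a sketch; first_repeated_key is a made-up name.

```julia
# Hypothetical helper: returns the first key seen twice across the given
# dictionaries, or `nothing` if all key sets are disjoint.
function first_repeated_key(d1::AbstractDict, rest::AbstractDict...)
    K = mapreduce(keytype, promote_type, rest; init = keytype(d1))
    seen = Set{K}()
    for d in (d1, rest...), k in keys(d)
        k in seen && return k   # stop at the first repetition
        push!(seen, k)
    end
    return nothing
end

D1 = Dict(:a => 1, :b => 2)
D2 = Dict(:a => 4, :c => 3)

dup = first_repeated_key(D1, D2)
none = first_repeated_key(D1, Dict(:c => 3))
```

One caveat: if nothing could itself be a key, the return value is ambiguous, the same as with the let block.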

ADDENDUM: This isn’t exactly the OP’s request, as the dictionaries are not merged. So, oops, sorry, but still an interesting problem.

POST ADDENDUM: To fix the issue in the Addendum, here is a method which returns the desired dictionary but also is resource efficient (at this point benchmarking should be done):

let s = false, r = Dict{keytype(D1), valtype(D1)}()
    for e in Iterators.flatten([D1,D2])
        mergewith!((v...)->(s = true; last(v)), r, Base.ImmutableDict(e))
        s && error("duplicate key $(first(e))")
    end
    r
end