Hash of Dict with custom type as keys

I’m having an issue with hashing a Dict that uses a custom type as keys. Here’s a minimal example that shows my problem:

struct My
    x::Vector{Int}
end
Base.hash(m::My) = Base.hash(m.x)
Base.:(==)(m::My, n::My) = m.x == n.x
a = My([1, 2, 3])
b = My([1, 2, 3])
println(hash(a) == hash(b))
u = Dict(a => 1)
v = Dict(b => 1)
println(u == v)
println(hash(u) == hash(v))

Running this (using Julia 1.5.3), I get that a and b have the same hash and that u and v are equal, as they should be, but u and v do not have the same hash, which really confuses me. Note that this does not happen if I define a and b as plain Vectors of Int rather than using a custom type. Does anyone know why this happens and how to fix it?

You have to implement the two-arg version of hash, as per doc-string:

Base.hash(m::My, h::UInt)=Base.hash(m.x, h)

Not sure where it goes wrong otherwise.
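For reference, here is a minimal sketch of the original example with the two-argument method in place. The type name `Key` is just a stand-in for `My`. My understanding (an assumption, I have not traced it through the Base source for 1.5.3) is that hashing a Dict hashes each key with the two-argument form, so a type that only defines the one-argument method ends up using the default, identity-based fallback there.

```julia
struct Key                      # stand-in for the `My` type in the question
    x::Vector{Int}
end

# Define the two-argument method; the one-argument hash(k) forwards to it.
Base.hash(k::Key, h::UInt) = hash(k.x, h)
Base.:(==)(k::Key, l::Key) = k.x == l.x

a = Key([1, 2, 3])
b = Key([1, 2, 3])
u = Dict(a => 1)
v = Dict(b => 1)

println(hash(a) == hash(b))     # true
println(u == v)                 # true
println(hash(u) == hash(v))     # true
```

With only the two-argument method defined, all three comparisons agree, which is the behaviour the question was after.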

See also https://github.com/andrewcooke/AutoHashEquals.jl


Thanks! It does fix the problem. I had seen this second argument mentioned in the docs but, to be honest, did not quite get what it was doing…

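When you hash a composite structure, for example one with fields A.a and A.b, the hashes of the fields have to be mixed together:

h1 = hash(A.a, zero(UInt)) # initial hash; the seed must be a UInt
h2 = hash(A.b, h1)

The second argument is the seed that each field's hash gets mixed into. When you call the one-argument hash on an object, a default starting seed is used.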
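To make the chaining concrete, here is a small sketch for a struct with two fields. The `Record` type and its field names are made up for illustration; the only point is that each field is hashed with the result of the previous step as its seed.

```julia
struct Record                   # hypothetical type, for illustration only
    name::String
    values::Vector{Int}
end

function Base.hash(r::Record, h::UInt)
    h = hash(r.name, h)         # mix the first field into the incoming seed
    h = hash(r.values, h)       # mix the second field into that result
    return h
end

Base.:(==)(r::Record, s::Record) = r.name == s.name && r.values == s.values

r = Record("a", [1, 2])
s = Record("a", [1, 2])

println(hash(r) == hash(s))                      # true
println(hash(r, UInt(7)) == hash(s, UInt(7)))    # true, for any shared seed
```

Because the incoming `h` is threaded through every field, a container that hashes a `Record` with its own seed still gets equal results for equal records.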

Note that if you don’t mix in the name of the new type with the hash, you could end up with a situation where different types return the same hash:

struct A
    x::Vector{Int}
end

struct B
    x::Vector{Int}
end

Base.hash(a::A, h::UInt) = hash(a.x, h)
Base.hash(b::B, h::UInt) = hash(b.x, h)

julia> a = A([1, 2, 3]);

julia> b = B([1, 2, 3]);

julia> hash(a) == hash(b)
true

To avoid this, you can mix the type into the hash:

Base.hash(a::A, h::UInt) = hash(A, hash(a.x, h))
Base.hash(b::B, h::UInt) = hash(B, hash(b.x, h))

julia> a = A([1, 2, 3]);

julia> b = B([1, 2, 3]);

julia> hash(a) == hash(b)
false

Above I used the actual type objects A and B in the hash. You could use symbols like :A and :B instead. I’m not sure if there are any reasons to prefer symbols over type objects, or vice versa, when constructing the hash.
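For comparison, a sketch of the symbol variant. The type names `P` and `Q` are placeholders so the snippet can run on its own, and here the symbol is mixed into the seed first and the field hashed last, which is an arbitrary choice of order on my part.

```julia
struct P                        # placeholder types, same layout as A and B above
    x::Vector{Int}
end

struct Q
    x::Vector{Int}
end

# Mix a per-type symbol into the seed, then hash the field with that seed.
Base.hash(p::P, h::UInt) = hash(p.x, hash(:P, h))
Base.hash(q::Q, h::UInt) = hash(q.x, hash(:Q, h))

p = P([1, 2, 3])
q = Q([1, 2, 3])

println(hash(p) == hash(q))             # different type tags, so expected to differ
println(hash(p) == hash(P([1, 2, 3])))  # true: same type, same contents
```

One difference I can think of: a symbol only carries the name, so two types with the same name in different modules would share the tag, whereas the type objects themselves are distinct.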
