Should custom hash functions distinguish types?

greatpet · December 1, 2023, 10:22am

Suppose my package defines two struct types A and B, each containing just a single field called data. Then I overload Base.hash as follows:

import Base: hash

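# The seed argument is typed as UInt so these methods are not ambiguous with Base's generic hash(x, h::UInt) fallback.
hash(a::A, h::UInt) = hash(a.data, h)

hash(b::B, h::UInt) = hash(b.data, h)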

This is OK for my own use cases, since I use dictionaries containing only keys of type A or only keys of type B. However, a future user of the package may want to construct a Dict{Union{A, B}, T}, where the keys can be of either type. This becomes problematic, because the hash function doesn’t distinguish the two types, so the hashes will collide whenever a.data and b.data are the same.

Is there any style guide or informal advice regarding such practices with custom hash functions?

Sukera · December 1, 2023, 11:09am

It depends on whether you consider the type important for equality or not. If you do, you have to incorporate the type into the hash as well; if you don’t, you don’t need to.
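
For the first case, a minimal sketch of mixing the type into the hash might look like the following. Hashing the symbols :A and :B into the seed is just one arbitrary way to get a per-type tag, not an established convention; any distinct per-type constant would serve.

```julia
struct A
    data::Int
end

struct B
    data::Int
end

# Assumption: tag each type by hashing a distinct Symbol into the seed `h`
# before hashing the payload.
Base.hash(a::A, h::UInt) = hash(a.data, hash(:A, h))
Base.hash(b::B, h::UInt) = hash(b.data, hash(:B, h))

# Equal values of the same type still hash identically...
same_type = hash(A(1)) == hash(A(1))
# ...while A(1) and B(1) now get different hashes (barring an accidental collision).
cross_type = hash(A(1)) != hash(B(1))
```

The default == already treats A(1) and B(1) as unequal, so no extra equality method is needed for this to be consistent.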

fredrikekre · December 1, 2023, 11:16am

Hash equality does not imply object equality; see the docstrings for hash and isequal, for example.

If the hash doesn’t distinguish between your A and B types above, that only means a higher probability of hash collisions. It is still fine to have A(1) and B(1) in the dictionary; they will hash to the same value, but then isequal will distinguish them.

julia> struct A
           data::Int
       end
       Base.:(==)(a1::A, a2::A) = a1.data == a2.data
       Base.hash(a::A, h::UInt) = hash(a.data, h)

       struct B
           data::Int
       end
       Base.:(==)(b1::B, b2::B) = b1.data == b2.data
       Base.hash(b::B, h::UInt) = hash(b.data, h)

julia> a = A(1); b = B(1);

julia> hash(a) == hash(b)
true

julia> a == b
false

julia> Set((a, b))
Set{Any} with 2 elements:
  A(1)
  B(1)
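
The same should hold for the Dict{Union{A, B}, T} case from the original question. A small sketch, with String as an arbitrary value type and the struct definitions repeated so that it runs on its own:

```julia
struct A
    data::Int
end
Base.:(==)(a1::A, a2::A) = a1.data == a2.data
Base.hash(a::A, h::UInt) = hash(a.data, h)

struct B
    data::Int
end
Base.:(==)(b1::B, b2::B) = b1.data == b2.data
Base.hash(b::B, h::UInt) = hash(b.data, h)

d = Dict{Union{A, B}, String}()
d[A(1)] = "from A"
d[B(1)] = "from B"   # same hash as A(1), but isequal(A(1), B(1)) is false

n = length(d)        # 2: both keys are kept as separate entries
va = d[A(1)]         # "from A"
vb = d[B(1)]         # "from B"
```

The only cost of the shared hash is that both keys probe the same slot first, so a lookup may need one extra isequal comparison.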
