Properly implement Base.hash() for custom type

Consider the following code:

@enum WeaponType CloseRange Reloadable

struct GameWeapon
    damage::Int32
    type::WeaponType
end

Reading the documentation, it says:

New types should implement the 2-argument form, typically by calling the 2-argument hash method recursively in order to mix hashes of the contents with each other (and with h ).

Is this how you would implement the Base.hash() based on the documentation?

function ==(x::GameWeapon, y::GameWeapon)
    return (x.damage === y.damage) &&
           (x.type === y.type)
end

function hash(weapon::GameWeapon, h::UInt = 0x01234567)
    hash_of_weapon_attribs = hash(hash(weapon.damage,h),hash(weapon.type,h))
    return hash(hash_of_weapon_attribs,h)
end

If not, how would you do it?

1 Like

I don’t think there’s any reason to supply a default for the second argument (the base hash(x::Any) method in Base will do this for you). And you can probably simplify the implementation a bit to something more like:

function hash(weapon::GameWeapon, h::UInt)
  hash(weapon.type, hash(weapon.damage, h))
end

This implementation seems very generic. Is there a sugar that defines this implementation for a given struct?

In the documentation it says:

New types should implement the 2-argument form, typically by calling the 2-argument hash method recursively

Shouldn’t weapon.type be hashed as well?

function hash(weapon::GameWeapon, h::UInt)
  return hash(hash(weapon.type,h), hash(weapon.damage, h))
end

There is AutoHashEquals.jl.

It is hashed, just not with h as seed but instead using the result of hash(weapon.damage, h) as the seed.

FWIW, AutoHashEquals.jl would define:

function Base.hash(g::GameWeapon, h::UInt)
    hash(g.type, hash(g.damage, hash(:GameWeapon, h)))
end
1 Like

Is writing it the way AutoHashEquals.jl does any better than what rdeits suggested? I’m not criticizing his answer, I just want a better understanding of which method is better.

Hashing the type too (the way AutoHashEquals does) is probably a good idea, since it will prevent hash collisions between two different structs that happen to have the same fields. On reflection, I’d probably recommend what @fredrikekre wrote over what I posted above.

Out of curiosity, is there anything wrong with the way I did the hash()? The part of recursive may have confused me.

I see the problem, just as you mentioned.

@enum WeaponType CloseRange Reloadable

struct GameWeapon
    damage::Int32
    type::WeaponType
end

struct Weapon
    damage::Int32
    type::WeaponType
end
standard_sword = GameWeapon(25,CloseRange)
println(hash(standard_sword))

standard_sword = Weapon(25,CloseRange)
println(hash(standard_sword))

Both of these produce the same hash code. Given that they’re different struct’s they should produce different hash codes, correct?

1 Like

Not necessarily, it just decreases the risk of hash collisions. Even though they have the same hash you can use both as separate keys in Dict for example

julia> g1 = GameWeapon(25,CloseRange);

julia> g2 = Weapon(25,CloseRange);

julia> hash(g1)
0x4f2c71de34f6c410

julia> hash(g2)
0x4f2c71de34f6c410

julia> d = Dict();

julia> d[g1] = "g1";

julia> d[g2]
ERROR: KeyError: key Weapon(25, CloseRange) not found

julia> d[g2] = "g2";

julia> d
Dict{Any, Any} with 2 entries:
  GameWeapon(25, CloseRange) => "g1"
  Weapon(25, CloseRange)     => "g2"

There is also for example

julia> hash(1)
0x5bca7c69b794f8ce

julia> hash(1.0)
0x5bca7c69b794f8ce

although this is an Int and a Float64. However, in this case you can’t have them as two keys in a Dict which is a bit weird, not sure what is up with that.

julia> d = Dict();

julia> d[1] = "Int"
"Int"

julia> d[1.0]  # !!!
"Int"
1 Like

Ah! Thanks for the explanation.

My problem here is that I was confused about this part:

New types should implement the 2-argument form, typically by calling the 2-argument hash method
recursively in order to mix hashes of the contents with each other (and with h ).

I don’t mind using AutoHashEquals.jl, but I would like to understand for myself how to properly implement Base.hash() according to the documentation.

The part I misunderstood was the recursive.

It seems to produce a unique hash, AutoHashEquals.jl combines the struct name as a symbol along with the required fields, correct?

1 Like

It would be great to add an example to make this clear then, do you want to contribute with that? Here is the docstring.


Yes, but there is still no guarante it will be a unique value, e.g.

module A
struct Foo
    x::Int
end
end

module B
struct Foo
    x::Int
end
end

would give the same hash for A.Foo(1) and B.Foo(1). I don’t think it matters that much though.

It’s funny you mention that, because I suggested it a few months ago :smiley: .

I’m still learning so I don’t think that I’d be able to produce an accurate example. Perhaps someone with deep knowledge of Julia might update the documentation.