Hash of enum from module gives inconsistent values run to run

Hi everyone! I’m running into some weird behaviour regarding Enums. Here is a small code example:

module FakeletterModule
export Fakeletter, _A, _B, _C
@enum Fakeletter _A _B _C
end # module

import .FakeletterModule: _A, _B, _C
letts = [_A, _B, _C]
d = Dict{FakeletterModule.Fakeletter, Float64}(_A => 0.1, _B => 0.1, _C => 0.1)
println("enum inside module")
for l in letts
    println(l, " ", hash(l))
end
println(collect(keys(d)))

# creating enum
@enum Fakeletter _A2 _B2 _C2
letts2 = [_A2, _B2, _C2]
d2 = Dict{Fakeletter, Float64}(_A2 => 0.1, _B2 => 0.1, _C2 => 0.1)
println("enum outside module")
for l in letts2
    println(l, " ", hash(l))
end
println(collect(keys(d2)))

If I save this to a file and run it multiple times, the hashes of the enums from the module are different run to run.

# run 1
enum inside module
_A 10363278099296118406
_B 16420461885460437964
_C 15024464193530540762
Main.FakeletterModule.Fakeletter[_A, _C, _B]
enum outside module
_A2 3965072176245652185
_B2 10022255962409971743
_C2 8626258270480074541
Fakeletter[_A2, _C2, _B2]

# run 2
enum inside module
_A 6171524818653313213
_B 12228708604817632771
_C 10832710912887735569
Main.FakeletterModule.Fakeletter[_C, _B, _A]
enum outside module
_A2 3965072176245652185
_B2 10022255962409971743
_C2 8626258270480074541
Fakeletter[_A2, _C2, _B2]

Am I right that this also causes issues with using enums as dictionary keys? Specifically, the order they come out when I collect them is different run to run (this was the headache I was running into, since I was using shuffle! on the collection with a fixed rng seed but getting a different order each time).

Is this expected behaviour? I’m not that familiar with enums and hashing and all that but this seems weird to me.

Also, the problem goes away if I manually define Base.hash (e.g., Base.hash(f::Fakeletter, h::UInt) = hash(string(f), h)) within my module, but I’m not sure what the proper way to define the hash of an enum is, and my test environment gives me an overwrite warning:

WARNING: Method definition hash(..., UInt64) in module ... at Enums.jl:210 overwritten at path/to/file:linenumber.
** incremental compilation may be fatally broken for this module **

(What is getting overwritten is julia/base/Enums.jl at v1.9.1 · JuliaLang/julia · GitHub. Do enums already get a Base.hash method defined for them?)

Version info:

julia> versioninfo()
Julia Version 1.9.1
Commit 147bdf428c (2023-06-07 08:27 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 24 × 13th Gen Intel(R) Core(TM) i7-13700K
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, goldmont)
  Threads: 8 on 24 virtual cores
Environment:
  JULIA_EDITOR = code
  JULIA_NUM_THREADS = 8

Dicts in Julia don’t have a guaranteed iteration order; even if hash gave the same values run to run, you should not rely on the order that keys(d) comes out in. You can use OrderedDict from the OrderedCollections package if you need a collection with a stable order.
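
For the shuffle! use case in the question, another option that needs no extra package is to sort the keys before shuffling. This is only a minimal sketch, assuming a hypothetical Fruit enum (not from the original post); enum values compare by their underlying integer, so sorting does not depend on hash at all.

```julia
using Random

# Hypothetical enum, for illustration only.
@enum Fruit apple banana cherry

d = Dict(apple => 0.1, banana => 0.2, cherry => 0.3)

# keys(d) comes back in an unspecified order, so impose one explicitly.
ks = sort!(collect(keys(d)))

# Shuffling a fixed starting order with a fixed seed is repeatable
# (for a given Julia version).
shuffled = shuffle!(MersenneTwister(1234), copy(ks))
```

The seed 1234 is arbitrary; the point is that the input to shuffle! no longer depends on how the Dict happens to lay out its keys.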

There is a fallback definition, yes. It ultimately falls back to objectid, which may give a different result from run to run because of the module the enum is defined in.

Also, hashing Base.Enum values is a bit tricky, because under the hood they are primitive types.

You can use Integer(someenum) to convert them to a plain integer type and hash that way, which is more efficient than converting to a string. For example:

let hashseed = hash("MyEnum")
    Base.hash(x::MyEnum, h::UInt) = hash(Integer(x), h) ⊻ hashseed
end
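
If overwriting Base.hash is not an option (for example because of the warning quoted in the question), the same Integer-based idea can be tried in a separate helper. This is a rough sketch with a hypothetical Shape enum and a hypothetical stable_hash function, neither of which is part of Base:

```julia
# Hypothetical enum, for illustration only.
@enum Shape circle square triangle

# Hypothetical helper (not part of Base): hashes only the integer value,
# so the result does not depend on which module defines the enum.
stable_hash(x::Shape, h::UInt = zero(UInt)) = hash(Integer(x), h)

Integer(square)      # the underlying integer value of the enum instance
stable_hash(square)  # the same as hashing that integer directly
```

Note that a helper like this does not change how Dict hashes its keys; it only helps in places where you call it yourself.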

I see I see. Thank you!