Dict behaves differently with using and include

First, I created a file named DictTest.jl with the following content:

module DictTest
data = Dict(GlobalRef(Base,:sum) => GlobalRef(Base,:prod),
    1 => GlobalRef(Base,:sum))
end

and put the file in a directory listed in JULIA_LOAD_PATH.

Then, in the REPL:

julia> using DictTest
[ Info: Precompiling DictTest [top-level]

julia> DictTest.data
Dict{Any,GlobalRef} with 2 entries:
  :(Base.sum) => :(Base.prod)
  1           => :(Base.sum)

julia> DictTest.data[1]
:(Base.sum)

julia> DictTest.data[GlobalRef(Base,:sum)]
ERROR: KeyError: key :(Base.sum) not found
Stacktrace:
 [1] getindex(::Dict{Any,GlobalRef}, ::GlobalRef) at .\dict.jl:467
 [2] top-level scope at none:1

julia> DictTest.data.keys[10]
:(Base.sum)

julia> DictTest.data[DictTest.data.keys[10]]
ERROR: KeyError: key :(Base.sum) not found
Stacktrace:
 [1] getindex(::Dict{Any,GlobalRef}, ::GlobalRef) at .\dict.jl:467
 [2] top-level scope at none:1

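The last result drives me crazy: a key taken straight from the dictionary is reported as not found in that same dictionary.

After restarting Julia and using include instead: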

julia> cd(ENV["JULIA_LOAD_PATH"][1:end-1])

julia> include("DictTest.jl")
Main.DictTest

julia> DictTest.data[1]
:(Base.sum)

julia> DictTest.data[GlobalRef(Base,:sum)]
:(Base.prod)

It seems that using gives a wrong result while include works fine. I know little about Julia's precompilation; is this a feature or a bug?

Here’s the versioninfo:

Julia Version 1.5.1
Commit 697e782ab8 (2020-08-25 20:08 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i5-9600KF CPU @ 3.70GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, skylake)

You probably need to rehash! the dictionary after precompilation since the serialized hashes are not valid anymore.

So you can add

function __init__()
    Base.rehash!(data)
end

to your module. Also, don't use the .keys field of dictionaries; it is an internal field.
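For context, here is a minimal sketch of where that __init__ would sit in the module from the question. Declaring data as const is my addition, not part of the original file, and Base.rehash! is an unexported internal function, so treat this as a workaround rather than a supported API:

```julia
module DictTest

# Built at precompile time; the stored slot positions may be stale after loading.
const data = Dict(GlobalRef(Base, :sum) => GlobalRef(Base, :prod),
                  1 => GlobalRef(Base, :sum))

# __init__ runs each time the module is loaded, including from a precompile cache,
# so the table is rebuilt with hashes that are valid for the current session.
function __init__()
    Base.rehash!(data)
end

end
```

With this in place, DictTest.data[GlobalRef(Base, :sum)] should find the key after using DictTest as well.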

Specifically, see this part of the manual (Modules · The Julia Language):

Dictionary and set types, or in general anything that depends on the output of a hash(key) method, are a trickier case. In the common case where the keys are numbers, strings, symbols, ranges, Expr, or compositions of these types (via arrays, tuples, sets, pairs, etc.) they are safe to precompile. However, for a few other key types, such as Function or DataType and generic user-defined types where you haven't defined a hash method, the fallback hash method depends on the memory address of the object (via its objectid) and hence may change from run to run. If you have one of these key types, or if you aren't sure, to be safe you can initialize this dictionary from within your __init__ function. Alternatively, you can use the IdDict dictionary type, which is specially handled by precompilation so that it is safe to initialize at compile-time.
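The "initialize this dictionary from within your __init__ function" option from that passage could look roughly like the sketch below. The explicit Dict{Any,GlobalRef} type is copied from the REPL output in the question; keeping data as an empty const container that gets filled at load time is my own choice of layout:

```julia
module DictTest

# Only the empty container is created at precompile time, so no hashes are serialized.
const data = Dict{Any,GlobalRef}()

# The entries are inserted at load time, using hashes computed in the running session.
function __init__()
    data[GlobalRef(Base, :sum)] = GlobalRef(Base, :prod)
    data[1] = GlobalRef(Base, :sum)
end

end
```

Unlike the rehash! approach, this avoids calling an internal function, at the cost of redoing the insertions on every load.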

So a better solution than the one I wrote above is likely to use an IdDict.
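A minimal sketch of the IdDict variant, reusing the entries from the question. Note that IdDict compares keys with === rather than isequal; this sketch assumes that two GlobalRef values built from the same module and name are identical under ===, which is what makes the lookup with a freshly constructed key work:

```julia
module DictTest

# IdDict hashes by object identity and, per the manual passage above,
# is handled specially by precompilation, so no __init__ is needed.
const data = IdDict{Any,GlobalRef}(
    GlobalRef(Base, :sum) => GlobalRef(Base, :prod),
    1 => GlobalRef(Base, :sum),
)

end
```

One design caveat: because of the === comparison, keys such as strings or arrays would only match the exact same object, so IdDict fits best when the keys are immutable values or the lookups use the original objects.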


Solved, thanks very much.