WeakKeyDict contains finalized elements?

I don’t know if this is intended, but it seems that a WeakKeyDict can contain objects whose finalizers have already been run:

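# Global registry; keys are held weakly.
C = WeakKeyDict()

mutable struct T
  a::BigInt
  f::Bool   # set to true once the finalizer has run

  function T(a::BigInt)
    z = new(a)
    z.f = false
    C[z] = true
    # Julia 0.6 argument order: finalizer(object, function)
    finalizer(z, _finalize)
    return z
  end
end

# Only records that finalization happened.
function _finalize(a::T)
  a.f = true
end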

Then one can produce, for example:

julia> A = [ T((BigInt(2)^i)) for i in 1:100];

julia> [ T((BigInt(3)^i)) for i in 1:1000];

julia> [ T((BigInt(3)^i)) for i in 1:1000];

julia> [ T((BigInt(3)^i)) for i in 1:1000];

julia> gc()

julia> C
WeakKeyDict{Any,Any} with 3100 entries:
  Error showing value of type WeakKeyDict{Any,Any}:
ERROR: InexactError()
Stacktrace:
 [1] ndigits0z(::BigInt, ::Int64) at ./gmp.jl:610
 [2] ndigits at ./gmp.jl:624 [inlined]
 [3] base(::Int64, ::BigInt) at ./gmp.jl:583
 [4] show(::IOContext{Base.AbstractIOBuffer{Array{UInt8,1}}}, ::BigInt) at ./gmp.jl:569
 [5] show_default(::IOContext{Base.AbstractIOBuffer{Array{UInt8,1}}}, ::Any) at ./show.jl:140
 [6] show(::IOContext{Base.AbstractIOBuffer{Array{UInt8,1}}}, ::Any) at ./show.jl:125
 [7] #sprint#228(::IOContext{Base.Terminals.TTYTerminal}, ::Function, ::Int64, ::Function, ::T, ::Vararg{T,N} where N) at ./strings/io.jl:64
 [8] (::Base.#kw##sprint)(::Array{Any,1}, ::Base.#sprint, ::Int64, ::Function, ::T, ::Vararg{T,N} where N) at ./<missing>:0
 [9] show(::IOContext{Base.Terminals.TTYTerminal}, ::MIME{Symbol("text/plain")}, ::WeakKeyDict{Any,Any}) at ./replutil.jl:64
 [10] display(::Base.REPL.REPLDisplay{Base.REPL.LineEditREPL}, ::MIME{Symbol("text/plain")}, ::WeakKeyDict{Any,Any}) at ./REPL.jl:122
 [11] display(::Base.REPL.REPLDisplay{Base.REPL.LineEditREPL}, ::WeakKeyDict{Any,Any}) at ./REPL.jl:125
 [12] display(::WeakKeyDict{Any,Any}) at ./multimedia.jl:194
 [13] eval(::Module, ::Any) at ./boot.jl:235
 [14] print_response(::Base.Terminals.TTYTerminal, ::Any, ::Void, ::Bool, ::Bool, ::Void) at ./REPL.jl:144
 [15] print_response(::Base.REPL.LineEditREPL, ::Any, ::Void, ::Bool, ::Bool) at ./REPL.jl:129
 [16] (::Base.REPL.#do_respond#16{Bool,Base.REPL.##26#36{Base.REPL.LineEditREPL,Base.REPL.REPLHistoryProvider},Base.REPL.LineEditREPL,Base.LineEdit.Prompt})(::Base.LineEdit.MIState, ::Base.AbstractIOBuffer{Array{UInt8,1}}, ::Bool) at ./REPL.jl:646

Note that this fails because some of the BigInts have already been finalized. If you look at the keys of C, you will find some for which the field f is true.

Is this intended?

So if I understand the issue here correctly:

  1. Key is added to WeakKeyDict
  2. All other references are dropped and gc() is run.
  3. finalizer of key is run, but object is still in dictionary.

The expected behaviour is that the object is also removed from the keys of the WeakKeyDict. After looking at the comment in https://github.com/JuliaLang/julia/blob/2421dd5f6892c96f92e77b62ac71e1356f94c4f2/base/weakkeydict.jl#L23 I agree that this is what should happen.

AFAIK there is no guarantee for the order of finalizers or that all finalizers are run at once.
@yuyichao might know more and might be able to explain what is happening here.

@vchuravy When am I allowed to open an issue on GitHub? I really need to know if WeakKeyDicts are supposed to work only with objects that do not have finalizers attached.

Issues are only meant to track bugs and are supposed to be actionable. Your first instinct to post here is correct.

However, if this has you both puzzled and you both read the documentation, then I’d say it’s likely a documentation issue.

And now?

I think that your issue is that you expected finalizers (for the same object) to be run LIFO. Indeed, this is the behavior that I implicitly expected as well (finalizers should be a stack!).

If you try your example multiple times (calling gc() again, waiting) you will see that the keys slowly vanish, since WeakKeyDict added a finalizer to remove them, and the gc somehow decided not to run all the finalizers for each T at the same time.
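
To watch this happen, something like the following sketch could be used. It is written against the Julia 1.x API (finalizer(f, obj) and GC.gc()), which is an assumption, since the session above is from 0.6. The names Tracked, REGISTRY, make_tracked and count_finalized are made up for illustration. No output is shown because the numbers depend on the Julia version and on when the gc chooses to run each finalizer.

```julia
# Sketch only, Julia 1.x syntax (assumption; the thread uses the 0.6 API).
mutable struct Tracked
    a::BigInt
    finalized::Bool   # plays the role of field `f` in the original example
end

# Hypothetical global registry with weakly held keys.
const REGISTRY = WeakKeyDict{Tracked,Bool}()

function make_tracked(a::BigInt)
    z = Tracked(a, false)
    REGISTRY[z] = true
    # User finalizer: only flips the flag, so it is safe to run inside the gc.
    finalizer(obj -> (obj.finalized = true), z)
    return z
end

# Number of keys still reachable through the dict whose finalizer has already run.
count_finalized() = count(k -> k.finalized, keys(REGISTRY))

# Create objects and drop every strong reference to them immediately.
foreach(i -> make_tracked(BigInt(3)^i), 1:1000)

# Collect several times and report what the dict still holds after each pass.
for pass in 1:3
    GC.gc()
    println("pass ", pass, ": ", length(REGISTRY), " entries, ",
            count_finalized(), " with finalizer already run")
end
```

If the second number is ever nonzero, the dict is exposing keys after their own finalizer ran, which is the situation described at the top of the thread.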

I have not seen any segfaults during this, so I would hope that the object is only free()'ed after the last finalizer is called.

Is there anyone who knows the gc code and can tell us in what order finalizers for the same object are run, and why this was decided?

More importantly, is a WeakKeyDict kept alive by its keys? That would be a serious, unexpected memory leak for some user code, and the WeakKeyDict code kinda looks like it: each key gets a finalizer, which is a closure containing a reference to the WeakKeyDict. So I would expect either the WeakKeyDict to survive until all its keys are dead, or the gc to segfault, or some magic to happen.

I expect that once a finalizer has been run for an object, that object is no longer a key of any WeakKeyDict. But the opposite is exactly what is happening here.
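
Until that expectation is guaranteed, one possible stopgap is to filter on the flag when reading the keys. The sketch below uses Julia 1.x syntax and hypothetical names (Handle, LIVE, track, usable_keys), none of which come from Base or from the code above. It only narrows the window: since the order of finalizers is not guaranteed, a field such as the BigInt may already have been finalized while the flag is still false.

```julia
# Sketch only, Julia 1.x syntax; all names here are hypothetical.
mutable struct Handle
    value::BigInt
    finalized::Bool   # set by the finalizer, mirrors field `f` in the thread
end

const LIVE = WeakKeyDict{Handle,Bool}()

function track(value::Integer)
    h = Handle(BigInt(value), false)
    LIVE[h] = true
    finalizer(x -> (x.finalized = true), h)
    return h
end

# Return only the keys whose own finalizer has not run yet.
usable_keys() = [k for k in keys(LIVE) if !k.finalized]
```

Code that iterates over usable_keys() instead of keys(LIVE) at least never touches an object that has flagged itself as finalized.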