Is there a bug in `Dict`?

bug
dictionary

#1

Sorry in advance for the long post.

I have some trouble with a bug in one of my program. I was able to reproduce it with a simpler code, and it seems to be related to the Dict structure. I am not sure if the problem is coming from my code or from the structure itself in Base.
FYI:

Julia Version 0.6.0
Commit 9036443 (2017-06-19 13:05 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU E5-1603 v4 @ 2.80GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, broadwell)

EDIT: Behaviour is similar on a windows computer, so it seems to not be related to my setup

The idea here is just to search for all the permutations possible of the elements of Dictionary d. At some point, a problem occurs (cf the associated comment) and gives the value 0 to j. This value is supposedly from the keys of d, and should always be in 1:k (when the function is called with arg value k).

The error KeyError: key 0 not found seems to be raised for k >= 8

function mytest_inner(d::Dict{Int, Int}, i::Int)
    println(d)
    println("i = $i")
    pop!(d, i)
    for j in keys(d) # Problem is coming from here
        mytest_inner(deepcopy(d), j)
    end
end
function mytest(k::Int)
    for i in 1:k
        d = Dict(enumerate(1:k))
        mytest_inner(d, i)
    end
end
mytest(8)

The (shortened) output that illustrate this follows:


...
Dict(7=>7,4=>4,2=>2,3=>3,5=>5,8=>8,6=>6,1=>1)
i = 1
Dict(7=>7,4=>4,2=>2,3=>3,5=>5,8=>8,6=>6)
i = 7
Dict(4=>4,2=>2,3=>3,5=>5,8=>8,6=>6)
i = 4
...
Dict(3=>3,5=>5,8=>8,6=>6)
i = 5
Dict(3=>3,8=>8,6=>6)
i = 3
Dict(8=>8,6=>6)
i = 8
Dict(6=>6)
i = 6
Dict(8=>8,6=>6)
i = 6
Dict(8=>8)
i = 8
Dict(3=>3,8=>8,6=>6)
i = 0

With the last value of i, the call to d[i] is indeed raising an error.

Any help?


#2

It seems to me it is the normal behaviour


#3

Thank you for your answer.

Are you sure it is normal for a value taken among the keys of a dictionary d that are in 1:8 to be equal 0?

Am I missing something?


#4

I am also puzzled by this. Here is a simpler MWE which visualizes the recursion:

function mytest_inner(d::Dict{Int, Int}, i::Int, indent = 0)
    println_indent(indent, "popping $i from $d")
    pop!(d, i)
    println_indent(indent, "left with $d, keys $(keys(d))")
    for j in keys(d) # Problem is coming from here
        mytest_inner(deepcopy(d), j, indent + 1)
    end
end
mytest_inner(Dict(enumerate(1:8)), 1)

outputing

julia> mytest_inner(Dict(enumerate(1:8)), 1)
popping 1 from Dict(7=>7,4=>4,2=>2,3=>3,5=>5,8=>8,6=>6,1=>1)
left with Dict(7=>7,4=>4,2=>2,3=>3,5=>5,8=>8,6=>6), keys [7, 4, 2, 3, 5, 8, 6]
    popping 7 from Dict(7=>7,4=>4,2=>2,3=>3,5=>5,8=>8,6=>6)
    left with Dict(4=>4,2=>2,3=>3,5=>5,8=>8,6=>6), keys [4, 2, 3, 5, 8, 6]
        popping 4 from Dict(4=>4,2=>2,3=>3,5=>5,8=>8,6=>6)
        left with Dict(2=>2,3=>3,5=>5,8=>8,6=>6), keys [2, 3, 5, 8, 6]
            popping 2 from Dict(2=>2,3=>3,5=>5,8=>8,6=>6)
            left with Dict(3=>3,5=>5,8=>8,6=>6), keys [3, 5, 8, 6]
                popping 3 from Dict(3=>3,5=>5,8=>8,6=>6)
                left with Dict(5=>5,8=>8,6=>6), keys [5, 8, 6]
                    popping 5 from Dict(5=>5,8=>8,6=>6)
                    left with Dict(8=>8,6=>6), keys [8, 6]
                        popping 8 from Dict(8=>8,6=>6)
                        left with Dict(6=>6), keys [6]
                            popping 6 from Dict(6=>6)
                            left with Dict{Int64,Int64}(), keys Int64[]
                        popping 6 from Dict(8=>8,6=>6)
                        left with Dict(8=>8), keys [8]
                            popping 8 from Dict(8=>8)
                            left with Dict{Int64,Int64}(), keys Int64[]
                    popping 8 from Dict(5=>5,8=>8,6=>6)
                    left with Dict(5=>5,6=>6), keys [5, 6]
                        popping 5 from Dict(5=>5,6=>6)
                        left with Dict(6=>6), keys [6]
                            popping 6 from Dict(6=>6)
                            left with Dict{Int64,Int64}(), keys Int64[]
                        popping 6 from Dict(5=>5,6=>6)
                        left with Dict(5=>5), keys [5]
                            popping 5 from Dict(5=>5)
                            left with Dict{Int64,Int64}(), keys Int64[]
                    popping 6 from Dict(5=>5,8=>8,6=>6)
                    left with Dict(5=>5,8=>8), keys [5, 8]
                        popping 5 from Dict(5=>5,8=>8)
                        left with Dict(8=>8), keys [8]
                            popping 8 from Dict(8=>8)
                            left with Dict{Int64,Int64}(), keys Int64[]
                        popping 8 from Dict(5=>5,8=>8)
                        left with Dict(5=>5), keys [5]
                            popping 5 from Dict(5=>5)
                            left with Dict{Int64,Int64}(), keys Int64[]
                popping 5 from Dict(3=>3,5=>5,8=>8,6=>6)
                left with Dict(3=>3,8=>8,6=>6), keys [3, 8, 6]
                    popping 3 from Dict(3=>3,8=>8,6=>6)
                    left with Dict(8=>8,6=>6), keys [8, 6]
                        popping 8 from Dict(8=>8,6=>6)
                        left with Dict(6=>6), keys [6]
                            popping 6 from Dict(6=>6)
                            left with Dict{Int64,Int64}(), keys Int64[]
                        popping 6 from Dict(8=>8,6=>6)
                        left with Dict(8=>8), keys [8]
                            popping 8 from Dict(8=>8)
                            left with Dict{Int64,Int64}(), keys Int64[]
                    popping 139931758451088 from Dict(3=>3,8=>8,6=>6)
ERROR: KeyError: key 139931758451088 not found
Stacktrace:
 [1] pop!(::Dict{Int64,Int64}, ::Int64) at ./dict.jl:539
 [2] mytest_inner(::Dict{Int64,Int64}, ::Int64, ::Int64) at ./REPL[35]:3
 [3] mytest_inner(::Dict{Int64,Int64}, ::Int64, ::Int64) at ./REPL[35]:6 (repeats 5
 times)                                                                           
 [4] mytest_inner(::Dict{Int64,Int64}, ::Int64) at ./REPL[35]:2

#5

@Tamas_Papp Thank you for the simplification of the code and lisibility of the output.

FYI, when I use this code in atom the KeyError corresponds to 0 in Atom REPL and to a big number (such as 140437488551752) in the Julia REPL.
It looks like an access to wrong register (or the register has been erased).


#6

There is some problem with deepcopy in loop scope. Below f1 and f2 work OK, and f3 generates an error:

function f1(d::Dict{Int, Int}, i::Int, indent = 0)
    pop!(d, i)
    z = deepcopy(d)
    for j in keys(d) # Problem is coming from here
        f1(z, j, indent + 1)
    end
end

function f2(d::Dict{Int, Int}, i::Int, indent = 0)
    pop!(d, i)
    for j in keys(d) # Problem is coming from here
        f2(merge(Dict{Int,Int}(), d), j, indent + 1)
    end
end

function f3(d::Dict{Int, Int}, i::Int, indent = 0)
    pop!(d, i)
    for j in keys(d) # Problem is coming from here
        z = deepcopy(d)
        f3(z, j, indent + 1)
    end
end

#7

@Azzaare: This looks like a bug. Perhaps you could open an issue.


#8

I believe that the problem is in dict.jl:

    function Dict{K,V}(d::Dict{K,V}) where V where K
        if d.ndel > 0
            rehash!(d)
        end
        @assert d.ndel == 0
        new(copy(d.slots), copy(d.keys), copy(d.vals), 0, d.count, d.age, d.idxfloor,
            d.maxprobe)
    end

which modifies d when d.ndel>0, and it should not. E.g. this fixes the problem:

function f3(d::Dict{Int, Int}, i::Int, indent = 0)
    pop!(d, i)
    Base.rehash!(d)
    for j in keys(d) # Problem is coming from here
        f3(deepcopy(d), j, indent + 1)
    end
end

#9

of course this is a hack and I believe a bug in dict.jl should be fixed.


#10

A perfect excuse for a PR in the spirit of


#11

I have already redeemed my Hacktoberfest tee. So @Azzaare, if you want go ahead and make a PR. If you do not have time I will do it - just let me know.


#12

@bkamins If you have time to do it, I would be glad. I will be quite busy for a couple of days. And I don’t really want to do it through my phone.


#13

PR https://github.com/JuliaLang/julia/pull/24345


#14

Thank you so much @bkamins


#15

I don’t know if this is the same problem, but at least i get the same error message.

I am trying to create a type which can be used with sets. While creating and pushing to the Set works, pop! returns a KeyError. Set internally uses a Dict.

struct Test
       field1::Vector{Int}
end
Base.start(::Test) = false
Base.next(A::Test, state) = (A, true)
Base.done(::Test, state) = state
Base.eltype(::Test) = Test
julia> s = Set(Test([1,2]))
Set(Test[Test([1, 2])])

julia> push!(s, Test([3,4]))
Set(Test[Test([1, 2]), Test([3, 4])])

julia> s
Set(Test[Test([1, 2]), Test([3, 4])])

julia> pop!(s, Test([1, 2]))
ERROR: KeyError: key Test([1, 2]) not found
Stacktrace:
 [1] pop!(::Dict{Test,Void}, ::Test) at .\dict.jl:539
 [2] pop!(::Set{Test}, ::Test) at .\set.jl:36
 [3] eval(::Module, ::Any) at .\boot.jl:235

#16

It’s not a bug and not the same problem. You need to define hash and isequal/== if you don’t want the default hashing, which uses object_id and ===.


#17

yuyichao is correct, the information you need is in the docs, but I find it easy to miss, here are a few sections that discuss it

https://docs.julialang.org/en/stable/stdlib/base/#Base.hash

https://docs.julialang.org/en/stable/stdlib/collections/#Associative-Collections-1

https://docs.julialang.org/en/stable/stdlib/base/#Base.isequal-Tuple{Any,Any}

The package AutoHashEquals has a macro do do this for you, and critically, it’s readme has an example of how to do it yourself.


#18

This part of the thread should be split off as it is offtopic here.

Thanks a lot for the solution to my problem.
But how was i supposed to find that solution in the docs? I didn’t even know, that hash and isequal is needed to find the key of a dict (makes sense, of course).

I think interfaces need a lot better documentation. Currently, only Iterators and Arrays have documented Interfaces. Sets are missing, and the same goes for Number, as far as i know.

I think it would be nice to have a macro @interface to get the signatures of all functions which belong to the required interface of a certain type - similar to @which or apropos().

julia> @interface AbstractArray
size(A)
getindex(A, i:Int)
getindex(A, I::Vararg{Int, N})
setindex!(A, v, i::Int)
setindex!(A, v, I::Vararg{Int, N})