Failure to match a set element

Hi,

I found some unexpected behavour when checking set membership.
Specifically, I get a key error when I try to pop! a known member of a set.

The problem does not occur in a pattern that I could predict, but I prepared a script that reproduces it consistently in my environment.

This is probably a bug, but I would like to know whether this is a known issue and whether I should log it.

  1. versioninfo():
    Julia Version 1.9.4
    Commit 8e5136fa29 (2023-11-14 08:46 UTC)
    Build Info:
    Official https://julialang.org/ release
    Platform Info:
    OS: Windows (x86_64-w64-mingw32)
    CPU: 8 Γ— Intel(R) Coreβ„’ i7-4790 CPU @ 3.60GHz
    WORD_SIZE: 64
    LIBM: libopenlibm
    LLVM: libLLVM-14.0.6 (ORCJIT, haswell)
    Threads: 8 on 8 virtual cores
    Environment:
    JULIA_NUM_THREADS = 8
    JULIA_EDITOR = code

  2. Julia was installed using the windows installer.

  3. Code and error message:

int_vec = [[9, 8, 2, 9, 3, 4, 1, 4, 6],
            [2, 5],
            [1, 4, 9, 8, 2, 3, 6, 3, 7],
            [3, 2, 10, 9, 9, 3, 2, 1, 2, 5],
            [8, 6, 1],
            [7, 3, 10, 7, 8, 5, 3, 10],
            [5, 6, 6, 8],
            [3],
            [6, 10, 2, 2, 7],
            [2, 2, 7],
            [7, 1],
            [7],
            [10, 6, 3, 1, 1, 2, 1],
            [5, 6, 2, 10, 4, 1],
            [6, 7],
            [5, 8],
            [5, 7, 7, 5],
            [6, 6, 9, 6, 9, 1, 1, 5],
            [2, 5, 10, 10]]

function check_length(subset,element)
    return length(subset) == length(element)
end

function test(input_vector)
    input_set = Set(input_vector)
    s = Set()
    push!(s,Set((pop!(input_set),)))
    while !isempty(input_set)
        new_element = pop!(input_set)
        ss = Set(x for x in s if check_length(x,new_element))
        if isempty(ss)
            push!(s, Set((new_element,)))
        elseif length(ss) == 1
            myst = pop!(ss)
            push!(myst,new_element)
        else
            for x in ss
                _ = pop!(s,x)
            end
            su = reduce(union, ss)
            push!(su,new_element)
            push!(s,su)
        end
    end
    return s
end

The error eventually occurs in line 39:
_ = pop!(s,x)
If I use setdiff, or setdiff! here, it also fails to remove the element.

The error message is:

ERROR: KeyError: key Set([[1, 4, 9, 8, 2, 3, 6, 3, 7], [10, 6, 3, 1, 1, 2, 1], [3], [8, 6, 1], [5, 7, 7, 5]]) not found
Stacktrace:
 [1] pop!(h::Dict{Any, Nothing}, key::Set{Vector{Int64}})
   @ Base .\dict.jl:590
 [2] pop!(s::Set{Any}, x::Set{Vector{Int64}})
   @ Base .\set.jl:104
 [3] test(input_vector::Vector{Vector{Int64}})
   @ Main c:\Users\Use\OneDrive\Development\Julia\Projects\AoC2023\test.jl:39
 [4] top-level scope
   @ REPL[4]:1

I made some modifications to your function to capture the offending variables.

julia> function test(input_vector)
           input_set = Set(input_vector)
           s = Set()
           push!(s,Set((pop!(input_set),)))
           while !isempty(input_set)
               new_element = pop!(input_set)
               ss = Set(x for x in s if check_length(x,new_element))
               if isempty(ss)
                   push!(s, Set((new_element,)))
               elseif length(ss) == 1
                   myst = pop!(ss)
                   push!(myst,new_element)
               else
                   for x in ss
                       try
                           _ = pop!(s,x)
                       catch
                           @error "Failed to pop!" s x
                           return s,x
                       end
                   end
                   su = reduce(union, ss)
                   push!(su,new_element)
                   push!(s,su)
               end
           end
           return s
       end
test (generic function with 1 method)

julia> s,x = test(int_vec)
β”Œ Error: Failed to pop!
β”‚   s =
β”‚    Set{Any} with 8 elements:
β”‚      Set([[5, 6, 6, 8]])
β”‚      Set([[5, 8]])
β”‚      Set([[1, 4, 9, 8, 2, 3, 6, 3, 7], [10, 6, 3, 1, 1, 2, 1], [3], [8, 6, 1], [5, 7, 7, 5]])
β”‚      Set([[6, 6, 9, 6, 9, 1, 1, 5]])
β”‚      Set([[6, 7]])
β”‚      Set([[2, 5, 10, 10]])
β”‚      Set([[5, 6, 2, 10, 4, 1]])
β”‚      Set([[2, 5], [7, 1], [2, 2, 7], [9, 8, 2, 9, 3, 4, 1, 4, 6], [7]])
β”‚   x =
β”‚    Set{Vector{Int64}} with 5 elements:
β”‚      [1, 4, 9, 8, 2, 3, 6, 3, 7]
β”‚      [10, 6, 3, 1, 1, 2, 1]
β”‚      [3]
β”‚      [8, 6, 1]
β”‚      [5, 7, 7, 5]
β”” @ Main REPL[114]:18
(Set(Any[Set([[5, 6, 6, 8]]), Set([[5, 8]]), Set([[1, 4, 9, 8, 2, 3, 6, 3, 7], [10, 6, 3, 1, 1, 2, 1], [3], [8, 6, 1], [5, 7, 7, 5]]), Set([[6, 6, 9, 6, 9, 1, 1, 5]]), Set([[6, 7]]), Set([[2, 5, 10, 10]]), Set([[5, 6, 2, 10, 4, 1]]), Set([[2, 5], [7, 1], [2, 2, 7], [9, 8, 2, 9, 3, 4, 1, 4, 6], [7]])]), Set([[1, 4, 9, 8, 2, 3, 6, 3, 7], [10, 6, 3, 1, 1, 2, 1], [3], [8, 6, 1], [5, 7, 7, 5]]))

Something does seem amiss.

julia> x in s
false

julia> x in deepcopy(s)
true

julia> x in collect(s)
true

Following the call path leads to Base.ht_keyindex.

julia> import Base: hashindex, isslotempty

julia> function ht_keyindex(h::Dict{K,V}, key) where V where K
           isempty(h) && return -1
           sz = length(h.keys)
           iter = 0
           maxprobe = h.maxprobe
           maxprobe < sz || throw(AssertionError()) # This error will never trigger, but is needed for terminates_locally to be valid
           index, sh = hashindex(key, sz)
           keys = h.keys

           while true
               println(index)
               isslotempty(h,index) && return -1
               if sh == h.slots[index]
                   k = keys[index]
                   if (key ===  k || isequal(key, k))
                       return index
                   end
               end

               index = (index & (sz-1)) + 1
               (iter += 1) > maxprobe && return -1
           end
           # This line is unreachable
       end
ht_keyindex (generic function with 1 method)

julia> ht_keyindex(s.dict, x)
8
9
10
-1

julia> ht_keyindex(deepcopy(s).dict, x)
8
8

Ok. I understand what the issue is.
Mutating set elements (or dictionary keys) screws up the indexing mechanics and is very likely to cause the membership check on that element to fail.
Rehashing the set solves the problem, but there is no implicit mechanism to trigger a rehash if an element (or key) is mutated.
I know in Python immutability is enforced for key and set elements.
In Julia this is not enforced, but mutating set elements breaks things. If you want to do something like I did in the elseif statement in the example, you have to trigger a rehash explicitly.
A better practice is probably to pop the element out of the set, mutate it and push it back.

1 Like

Right. See also

An alternative is to use IdDict, which hashes based on object identity. A set is simply based on a dictionary whose values are nothing. (In fact, there is an undocumented Base.IdSet that implements this, but doesn’t cover the whole Set API.)

4 Likes

If I define a couple of missing methods:

Base.IdSet(itr) = let s = Base.IdSet(); for x in itr; push!(s, x); end; s; end
Base.pop!(s::Base.IdSet) = pop!(s, first(s))

and change your test function to replace Set with Base.IdSet, I get:

julia> test(int_vec)
Base.IdSet{Any} with 5 elements:
  Base.IdSet{Any}(IdDict{Any, Nothing}([5, 8] => nothing, [5, 7, 7, 5] => nothing, [5, 6, 6, 8] => nothing, [3, 2, 1…
  Base.IdSet{Any}(IdDict{Any, Nothing}([8, 6, 1] => nothing))
  Base.IdSet{Any}(IdDict{Any, Nothing}([7, 1] => nothing))
  Base.IdSet{Any}(IdDict{Any, Nothing}([2, 5] => nothing, [3] => nothing, [2, 2, 7] => nothing, [6, 10, 2, 2, 7] => …
  Base.IdSet{Any}(IdDict{Any, Nothing}([6, 6, 9, 6, 9, 1, 1, 5] => nothing))
2 Likes