Retrieve element from a set


I want to do something that may be impossible to do. It is not very important (I have possible workarounds), but I am curious…

I have a struct that represents an edge, defined as:

struct Edge{I<Integer}

The idea is that the actual edge is represented by the tuple e, whereas p and r are some attributes of the edge that are only relevant in some particular situations. I want to store Edges in a Set, but I want to be able to check set membership using only the tuple e. For this I redefine Base.hash, Base.isequal and Base.==:

    import Base: hash, isequal, ==
    hash(edge::Edge) = hash(edge.e)
    isequal(e1::Edge,e2::Edge) = isequal(e1.e,e2.e)
    ==(e1::Edge,e2::Edge) = ==(e1.e,e2.e)

With this, I can do something like:

    a = Edge((1,2),2,true); b = Edge((1,2),3,false); c = Edge((1,3),1,true);
    S = Set([a,b,c]) 
    (1,3) in S

and I get what I want. Namely: when constructing S, b replaces a since they are the same edge, and the answer to (1,3) in S is true. So far, so good.

However, in this way the attributes of the edges are inaccesible: I can check if the edge belongs to S but not what attributes are attached to it. Essentially, I cannot extract and Edge as it is stored in S. I can, for example, use pop!(S,m) where m is either a tuple or an Edge build from the tuple and default values for p and r. This removes the corresponding Edge from S, but the output is the same m that I used as key for popping, so the attributes of the previously stored edge are lost.

A simple solution is to use a dictionary instead of a set. In particular, a Dict{Tuple{I,I},Tuple{I,Bool}}, with the tuples as keys and the attributes as values. But in this way I essentially drop the data type Edge that is useful in other parts of the code.
It is not a great deal: I can keep Edge for the things it is useful, and just store my set of edges as a Dict instead of a real Set of Edges. But I wonder if there is some trick that I can apply for having my cake and eating it. In other words: Given a set where the hash applies not the whole element but only to a part, is it possible to retrieve the complete element?


1 Like

This is not possible using only stable API and with the existing Set type. I would advice you use the Dict approach.
For fun, it’s possible to do this by relying on the internal implementation of Set:

julia> s = Set([1,2,3]);

julia> function get_element(s::Set, v)
           i = Base.ht_keyindex2!(s.dict, v)
           i < 1 ? nothing : s.dict.keys[i]
get_element (generic function with 1 method)

julia> get_element(s, 0x02)

Note that this implementation may break in any future Julia release.

Hunh. Is this a bug? What why wouldn’t we return the value as it appears in the set?

pop!(collection, key[, default])

Delete and return the mapping for key if it exists in collection

I suppose it’s not quite clear what “the mapping for key” would be in a Set, but it seems defensible that it should be the value in the set itself.

julia> s = Set((1,2,3))
Set{Int64} with 3 elements:

julia> pop!(s, 1.0)

the expression

julia>     m=(1,3)
(1, 3)

julia>     pop!(S,m)
ERROR: KeyError: key (1, 3) not found

gives me an error

Maybe this might be for you

[e for e in S if e.e==(1,3)]
[e for e in S if e==Edge{Int64}((1, 3), 991, true)]

Nice trick. I had not noticed that a Set is actually a Dict with values set to nothing. Following your approach, it is possible to use getkey:

julia> S = Set([1,2,3]);

julia> getkey(S.dict,0x02,nothing)


Indeed! My reasoning was that pop! should return the value as stored in the Set. That’s why I though that my approach was more or less acceptable.

You are right. In order to use pop!(S,(1,3)) it is necessary to define a method for comparing tuples and edges:

import Base.==
==(t::Tuple{I,I},e::Edge) where I<:Integer = ==(t,e.e)

The problem with [e for e in S if e.e==(1,3)] is that you actually check the whole set, which is something I want to avoid.

You could also define a personal ‘Set’ with values equal to the keys instead of nothing as follows

julia> D=Dict([(k,k) for k in (a,b,c)])
Dict{Edge{Int64}, Edge{Int64}} with 2 entries:
  Edge{Int64}((1, 2), … => Edge{Int64}((1, 2), 3, false)
  Edge{Int64}((1, 3), … => Edge{Int64}((1, 3), 1, true) 

julia> D[a]
Edge{Int64}((1, 2), 3, false)

julia> D[b]
Edge{Int64}((1, 2), 3, false)

julia> D[c]
Edge{Int64}((1, 3), 1, true)

julia> D[(1,2)]
Edge{Int64}((1, 2), 3, false)

Should it be nothing!?

Or is nothing the S.dict mapping!?

Thanks to @jakobnissen, pop! on a Set will begin returning the element removed in v1.11!