# Set with type with custom equality

I have a type whose instance is equivalent to instances that are in some distance away:

``````struct VectorWithEpsRadius{T}
value::T
eps::Float64
end

return norm(a.value - b.value) <= a.eps
end
``````

I would like to use this inside a Set. However the behavior is not what I expect:

``````f(x) = VectorWithEpsRadius{Int64}(x, 10)
Base.hash(m::VectorWithEpsRadius{T}, h::UInt) where T = Base.hash(m.value, h)
println(f(10) == f(11)) # true
println(hash(f(10)) == hash(f(11))) #  false
``````

and

``````julia> s = Set{ChaosTools.VectorWithEpsRadius{Int64}}()

julia> push!(s, f(21))
Set{ChaosTools.VectorWithEpsRadius{Int64}} with 1 element:

julia> push!(s, f(22))
Set{ChaosTools.VectorWithEpsRadius{Int64}} with 2 elements:
``````

I want the set to contain only the first element and not the second. I want those two elements to be considered equal.

How to solve this?

Thanks

This is not an equivalence relation because it is not transitive, so that may cause problems in code that expects `==` to behave normally. (See also the discussion in `unique()` with `isapprox()` instead of `isequal()` Â· Issue #19147 Â· JuliaLang/julia Â· GitHub)

The `hash` documentation says that it should obey:

Compute an integer hash code such that `isequal(x,y)` implies `hash(x)==hash(y)`.

Your `hash` and `==` (called by `isequal`) definitions are inconsistent with this requirement, since two values that compare `==` can have different hashes. So, you should expect any algorithm that uses hashes (sets, dictionaries, â€¦) to break for your type.

Because your `==` is not an equivalence relation, itâ€™s not clear what a â€śsetâ€ť even means. For example, suppose you specify `eps=0.15` and take the numbers

``````[0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0]
``````

Each one of these numbers is â€ś`==`â€ť to the next number with `eps=0.15`, so does a "setâ€™ contain only one element? But because your `==` is not transitive, most of these numbers are not â€ś`==`â€ť to one another, so should the â€śsetâ€ť contain many elements? Which ones?

Basically, you need to decide more carefully what you want to do here. For example, one â€śsolutionâ€ť would be to round all of your values to a certain number of digits, and then use the rounded values for comparisons and hashing. This would be an equivalence relation, but may or may not be what you want.

2 Likes

I donâ€™t intend to form a set out of an array. I only want limited functionality of the set and I understand that other things could break down. Basically I need a datastructure with fast insertion and lookup (no ordering needed). My hopes were to insert element into an empty set and then the second insertion would only be allowed if the second element is in the correct distance.

The datastructure behavior that I want currently works with a `RBTree{VectorWithEpsRadius}` however I was facing some bugs with `==` which I canâ€™t easily reproduce here. Thatâ€™s why I am looking for an alternative.

Rounding numbers is not an option, this is exactly what I want to avoid.

My solution is as follows. Itâ€™s not optimal but benchmarks better than my previous approach with binary tree.

``````function mypush!(set, x, abs)
previously_detected(set, x, abs) ? nothing : push!(set, x)
end

function previously_detected(set, x, abs)
for i in set
if norm(x - i) < abs
return true
end
end
return false
end
``````

I insert into the set with `mypush!` only if I am sure that I want the element there. I need to do the linear search which is not the best. However insertion into the set doesnâ€™t allocate and should happen in constant time.