# How to make a Set of real values based on rtol?

Let's say you wanted to make a `Set` of real numbers:

``````julia
Set([
    1.0,
    1 + eps(),
    2
])
``````

For all practical purposes, this should reduce to `Set([1.0, 2.0])`.

Is there an existing data structure that allows something like: `Set(..., rtol=1e-8)` to provide this functionality?

(as well as maintain all the features of a `Set`)

edit: I guess what I’m asking is:

• is it possible to change the equality condition for `Set` checking?

`Set` is based on hashing, which requires equal elements to have equal hashes. There won't be a useful hash function for your notion of equality, because approximate equality is not transitive:

``````julia
julia> a, b, c = [1.0 + i * 5e7 * eps() for i in 0:2];

julia> isapprox(a, b)
true

julia> isapprox(b, c)
true

julia> isapprox(a, c)
false
``````

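Because you want to do this for real numbers, which are ordered, you should be able to keep a sorted list of members. That way, deciding whether a value is already present takes O(log n) comparisons by binary search instead of O(n), although inserting into a `Vector` still has to shift the elements after the insertion point. You should be able to use `searchsorted` (or the related `searchsortedfirst`) for this.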
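A minimal sketch of the sorted-list idea, not a full `Set` replacement. The name `insert_approx!` and the `atol` keyword are hypothetical, and an absolute tolerance is assumed for simplicity; a relative tolerance could be substituted by scaling the window with `abs(x)`.

```julia
# Hypothetical helper: insert `x` into the sorted vector `v` unless an
# element within `atol` of `x` is already stored.
function insert_approx!(v::Vector{Float64}, x::Real; atol::Real=1e-8)
    x = Float64(x)
    # Binary search: index of the first element >= x - atol
    # (length(v) + 1 if there is none).
    i = searchsortedfirst(v, x - atol)
    # v is sorted, so if any element lies in [x - atol, x + atol], v[i] does.
    if i <= length(v) && v[i] <= x + atol
        return v    # an approximately equal value is already present
    end
    # Everything before i is < x - atol and v[i] (if any) is > x + atol,
    # so inserting at i keeps v sorted.
    insert!(v, i, x)
    return v
end

v = Float64[]
for x in (2, 1.0, 1 + eps(), 2.0)
    insert_approx!(v, x)
end
println(v)
```

As with any tolerance-based comparison, which values are kept can depend on the order in which they are inserted, since approximate equality is not transitive.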


Or, for a simple approach:

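``````julia
# Remove, in place, every element that is within `atol` of an earlier element.
function filter_approx!(cur_vector; atol=5e-2)
    delete_indices = Int[]

    for (cur_index, cur_value) in enumerate(cur_vector)
        # Compare against all earlier elements (a view avoids copying the slice).
        is_duplicate = any(
            tmp_value -> isapprox(tmp_value, cur_value, atol=atol),
            view(cur_vector, 1:cur_index-1)
        )

        is_duplicate || continue
        push!(delete_indices, cur_index)
    end

    # Indices were collected in increasing order, as `deleteat!` requires.
    deleteat!(cur_vector, delete_indices)

    cur_vector
end
``````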

For example:

``````julia
tmp_vector = [1.0 + i * 5e7 * eps() for i in 0:2]
filter_approx!(tmp_vector)

println(tmp_vector)
``````

`>> [1.0]`

Further convenience functions:

``````julia
function approx_push!(cur_vector, cur_value; atol=5e-2)
    is_duplicate = any(
        tmp_value -> isapprox(tmp_value, cur_value, atol=atol),
        cur_vector
    )

    is_duplicate && return cur_vector
    push!(cur_vector, cur_value)
end
``````

``````julia
function approx_append!(cur_vector, other_vector; atol=5e-2)
    append!(cur_vector, other_vector)
    filter_approx!(cur_vector; atol=atol)

    cur_vector
end
``````