I have 2 arrays of Dict containing from and to items, both of type ZonedDateTime (although for our example they could be integers, doesn’t matter). They are basically 2 arrays of ranges.
What would be the simplest/the most efficient way to determine whether they intersect?
Thank you,
Intersect in which sense? Have the same keys? Have the same values? Have the same pairs of key and value? The intersection is about the Dicts having the same keys/values, or the keys/values itself intersecting with the keys/values of another Dict?
1 Like
Can you provide an MWE, please? It would be much easier if you give a small example of what you have and what you are trying to achieve.
1 Like
Let’s assume “from” and “to” are integers (so that’s easier for me to write an example).
A = [Dict(“from”=>1, “to”=>20), Dict(“from”=>51, “to”=>100)]
B = [Dict(“from”=>21, “to”=>50), Dict(“from”=>99, “to”=>105)]
In this case, they do intersect with 2 values, 99 and 100.
In another example:
C = [Dict(“from”=>1, “to”=>20), Dict(“from”=>51, “to”=>100)]
D = [Dict(“from”=>21, “to”=>50), Dict(“from”=>101, “to”=>125)]
They don’t intersect.
Thank you.
Why are you using Dict
s for the ranges? Why not use UnitRange{Int64}
(or in your real case UnitRange{ZonedDateTime}
).
The ranges inside the same array are mutually exclusive (i.e., do not share any numbers?) the arrays are sorted? All of this is relevant for knowing what is the most efficient algorithm. If the answer to both is “yes”, then the code below should suffice.
A = [1:20, 51:100]
B = [21:50, 99:105]
C = [1:20, 51:100]
D = [21:50, 101:125]
function list_intersections(A, B)
idx_a, idx_b = 1, 1
len_a, len_b = length.((a, b))
intersects = UnitRange{Int64}[]
while idx_a <= len_a && idx_b <= b
if last(A[idx_a]) < first(B[idx_b])
idx_a += 1
elseif last(B[idx_b]) < first(A[idx_a])
idx_b += 1
else
r = intersect(A[idx_a], B[idx_b])
!isempty(r) && push!(intersects, r)
if last(A[idx_a]) < last(B[idx_b])
idx_a += 1
else
idx_b += 1
end
end
end
return intersects
end
list_intersections(A, B)
list_intersections(C, D)
Gives the expected results.
2 Likes
If you can do not mind use other structures than Dict
(which is rather inefficient in this case), you can use nice Intervals.jl package. Something like this
function combineintervals(A, B)
res = Vector{eltype(A)}()
for a in A, b in B
c = intersect(a, b)
isempty(c) || push!(res, c)
end
return res
end
A = [1..20, 51..100]
B = [21..50, 101..125]
julia> combineintervals(A, B)
Interval{Int64, Closed, Closed}[]
julia> combineintervals(A, [21..50, 99..125])
1-element Vector{Interval{Int64, Closed, Closed}}:
Interval{Int64, Closed, Closed}(99, 100)
Using this package makes sense also if you are working with seconds resolution, since you can use for example close, open
intervals, for example
using Dates
julia> Interval{Closed, Open}(DateTime("2021-01-01"), DateTime("2021-02-01"))
Interval{DateTime, Closed, Open}(DateTime("2021-01-01T00:00:00"), DateTime("2021-02-01T00:00:00"
))
and still be able to get correct results without worrying of edge cases.
4 Likes
I had some time at lunch; and always nice to learn never-heard packages like Intervals.
struct FromTo
from
to
function FromTo(f,t)
@assert t>=f "`To` has to be >= `From`"
new(f,t)
end
end
struct FromToCollections
l::Array{FromTo}
end
A = FromToCollections([FromTo(1,20),FromTo(51,100)])
B = FromToCollections([FromTo(21,50),FromTo(99,105)])
C = FromToCollections([FromTo(1,20), FromTo(51, 100)])
D = FromToCollections([FromTo(21, 50), FromTo(101,125)])
function intersect(A::FromTo, B::FromTo)
# The complement case seems to be easier to define
return !((A.to < B.from) ||(B.to < A.from))
end
function intersect(As::FromToCollections, Bs::FromToCollections)
for a in As.l, b in Bs.l
intersect(a,b) && return true
end
return false
end
julia> intersect(A,B)
true
julia> intersect(C,D)
false
Generally I like to define my own type, then I can defined what I meant intersect
for my own type at a very unclutter way, and build my way up.
Edit:
on a second thought, probably should use a different name for intersect since it return “are they intersect” and doesn’t return “the intersect”.
Also, intersect
is already defined by Base
because Base
already has UnitRange
s.
Sincerely, unless there is a good reason, this seems to be reinventing the wheel. Often an interval is just two values that have ordering functions defined for them, and there is a lot of code that already works over both UnitRange
and the types of Intervals.jl
(that also support the concept of closed and open ranges).
1 Like