# Intersect of 2 arrays

I have 2 arrays of Dict containing from and to items, both of type ZonedDateTime (although for our example they could be integers, doesn’t matter). They are basically 2 arrays of ranges.

What would be the simplest/the most efficient way to determine whether they intersect?

Thank you,

Intersect in which sense? Have the same keys? Have the same values? Have the same pairs of key and value? The intersection is about the Dicts having the same keys/values, or the keys/values itself intersecting with the keys/values of another Dict?

1 Like

Can you provide an MWE, please? It would be much easier if you give a small example of what you have and what you are trying to achieve.

1 Like

Let’s assume “from” and “to” are integers (so that’s easier for me to write an example).

A = [Dict(“from”=>1, “to”=>20), Dict(“from”=>51, “to”=>100)]
B = [Dict(“from”=>21, “to”=>50), Dict(“from”=>99, “to”=>105)]

In this case, they do intersect with 2 values, 99 and 100.

In another example:

C = [Dict(“from”=>1, “to”=>20), Dict(“from”=>51, “to”=>100)]
D = [Dict(“from”=>21, “to”=>50), Dict(“from”=>101, “to”=>125)]

They don’t intersect.

Thank you.

Why are you using `Dict`s for the ranges? Why not use `UnitRange{Int64}` (or in your real case `UnitRange{ZonedDateTime}`).

The ranges inside the same array are mutually exclusive (i.e., do not share any numbers?) the arrays are sorted? All of this is relevant for knowing what is the most efficient algorithm. If the answer to both is “yes”, then the code below should suffice.

``````A = [1:20, 51:100]
B = [21:50, 99:105]
C = [1:20, 51:100]
D = [21:50, 101:125]

function list_intersections(A, B)
idx_a, idx_b = 1, 1
len_a, len_b = length.((a, b))
intersects = UnitRange{Int64}[]
while idx_a <= len_a && idx_b <= b
if last(A[idx_a]) < first(B[idx_b])
idx_a += 1
elseif last(B[idx_b]) < first(A[idx_a])
idx_b += 1
else
r = intersect(A[idx_a], B[idx_b])
!isempty(r) && push!(intersects, r)
if last(A[idx_a]) < last(B[idx_b])
idx_a += 1
else
idx_b += 1
end
end
end
return intersects
end

list_intersections(A, B)
list_intersections(C, D)
``````

Gives the expected results.

2 Likes

If you can do not mind use other structures than `Dict` (which is rather inefficient in this case), you can use nice Intervals.jl package. Something like this

``````function combineintervals(A, B)
res = Vector{eltype(A)}()
for a in A, b in B
c = intersect(a, b)
isempty(c) || push!(res, c)
end
return res
end

A = [1..20, 51..100]
B = [21..50, 101..125]

julia> combineintervals(A, B)
Interval{Int64, Closed, Closed}[]

julia> combineintervals(A, [21..50, 99..125])
1-element Vector{Interval{Int64, Closed, Closed}}:
Interval{Int64, Closed, Closed}(99, 100)
``````

Using this package makes sense also if you are working with seconds resolution, since you can use for example `close, open` intervals, for example

``````using Dates

julia> Interval{Closed, Open}(DateTime("2021-01-01"), DateTime("2021-02-01"))
Interval{DateTime, Closed, Open}(DateTime("2021-01-01T00:00:00"), DateTime("2021-02-01T00:00:00"
))
``````

and still be able to get correct results without worrying of edge cases.

4 Likes

I had some time at lunch; and always nice to learn never-heard packages like Intervals.

``````struct FromTo
from
to
function FromTo(f,t)
@assert t>=f "`To` has to be >= `From`"
new(f,t)
end
end

struct FromToCollections
l::Array{FromTo}
end

A = FromToCollections([FromTo(1,20),FromTo(51,100)])
B = FromToCollections([FromTo(21,50),FromTo(99,105)])
C = FromToCollections([FromTo(1,20), FromTo(51, 100)])
D = FromToCollections([FromTo(21, 50), FromTo(101,125)])

function intersect(A::FromTo, B::FromTo)
# The complement case seems to be easier to define
return !((A.to < B.from) ||(B.to < A.from))
end

function intersect(As::FromToCollections, Bs::FromToCollections)
for a in As.l, b in Bs.l
intersect(a,b) && return true
end
return false
end
``````
``````julia> intersect(A,B)
true

julia> intersect(C,D)
false
``````

Generally I like to define my own type, then I can defined what I meant `intersect` for my own type at a very unclutter way, and build my way up.

Edit:
on a second thought, probably should use a different name for intersect since it return “are they intersect” and doesn’t return “the intersect”.

Also, `intersect` is already defined by `Base` because `Base` already has `UnitRange`s.

Sincerely, unless there is a good reason, this seems to be reinventing the wheel. Often an interval is just two values that have ordering functions defined for them, and there is a lot of code that already works over both `UnitRange` and the types of `Intervals.jl` (that also support the concept of closed and open ranges).

1 Like