Find indices meeting multiple criteria searching from multiple arrays

How do I find the indices that meet both A > 0 and B > 0?

In this MWE, I want the answer CartesianIndex(3,3), since that’s the only location that satisfies both conditions.

julia> A = repeat(-1:1',1,3)
3×3 Array{Int64,2}:
 -1  -1  -1
  0   0   0
  1   1   1

julia> B = A'
3×3 LinearAlgebra.Adjoint{Int64,Array{Int64,2}}:
 -1  0  1
 -1  0  1
 -1  0  1

Maybe there is some way to do this with findall but the following was not working for me:

ind = findall(x -> x>0,A) && findall(x -> x>0,B)

I make no claim that this is fastest, but intersect instead of && seems to fit the bill: ind = intersect(findall(x -> x>0,A), findall(x -> x>0,B).

Nice, was not aware of intersect. Thanks!

I would probably write it as

julia> findall((A .> 0) .& (B .> 0))
1-element Vector{CartesianIndex{2}}:
 CartesianIndex(3, 3)

(or forego the findall if you can work with the Boolean mask)

4 Likes

This creates unnecessary extra closures - >(0) already creates a curried version :slight_smile:

This has to allocate the resulting mask, which is potentially slower (since it has to iterate twice, once to create the mask and once to find the indices).

julia> using BenchmarkTools

julia> @btime intersect(findall(x -> x>0,$A), findall(x -> x>0,$B))
  885.417 ns (19 allocations: 1.84 KiB)
1-element Array{CartesianIndex{2},1}:
 CartesianIndex(3, 3)

julia> @btime intersect(findall(>(0),$A), findall(>(0),$B))
  868.421 ns (18 allocations: 1.83 KiB)
1-element Array{CartesianIndex{2},1}:
 CartesianIndex(3, 3)

julia> greater0(A,B) = findall((A .> 0) .& (B .> 0))
greater0 (generic function with 1 method)

julia> @btime greater0($A,$B)
  112.854 ns (3 allocations: 240 bytes)
1-element Array{CartesianIndex{2},1}:
 CartesianIndex(3, 3)

If you want to go for speed, something like this might be best, which stays fast even with huge arrays:

julia> findGreater0(A,B) = [ idx for idx in CartesianIndices(A) if @inbounds (A[idx] > 0 && B[idx] > 0) ]
findGreater0 (generic function with 1 method)

julia> A = repeat(-1:1',1,3)
3×3 Array{Int64,2}:
 -1  -1  -1
  0   0   0
  1   1   1

julia> B = A';

julia> @btime findGreater0($A,$B)
  93.085 ns (3 allocations: 240 bytes)
1-element Array{CartesianIndex{2},1}:
 CartesianIndex(3, 3)

julia> B = copy(A'); # manifest the adjoint

julia> @btime findGreater0($A,$B)
  95.364 ns (3 allocations: 240 bytes) # staying fast!
1-element Array{CartesianIndex{2},1}:
 CartesianIndex(3, 3)

julia> A = zeros(Int,10000,10000);

julia> A[1,:] .= -1;

julia> A[end,:] .= 1;

julia> B = A';

julia> @btime greater0($A,$B)          
  1.230 s (5 allocations: 11.93 MiB)
1-element Array{CartesianIndex{2},1}:
 CartesianIndex(10000, 10000)

julia> @btime findGreater0($A,$B)
  116.800 ms (3 allocations: 240 bytes)
1-element Array{CartesianIndex{2},1}:
 CartesianIndex(10000, 10000)

Julias Adjoint is lazy, so that’s why creating and iterating the bitmask becomes very sluggish - the memory is not accessed in an optimal manner. Iterating with CartesianIndices skips that problem entirely.

Manifesting the adjoint makes it almost as fast as the generator version - the difference being the extra iteration that has to be done as well as the big allocation (that might not be possible if we run out of memory):

julia> B = copy(A'); # manifest the adjoint (10000x10000 array)

julia> @btime greater0($A,$B)
  128.834 ms (5 allocations: 11.93 MiB)
1-element Array{CartesianIndex{2},1}:
 CartesianIndex(10000, 10000)

Note that there’s not really much more that could be done here - since we don’t know when the condition will be true, SIMD will be a little hard to do since it’s effectively a coin flip whether or not a given index will be included or not.

4 Likes

Thanks for this. I’ll be using this function since my arrays won’t be large.

Best solution for speed for sure! Many thanks for sharing.