Find indices meeting multiple criteria searching from multiple arrays

dalarev · December 3, 2020, 11:36pm

How do I find the indices that meet both A > 0 and B > 0?

In this MWE, I want the answer CartesianIndex(3,3), since that’s the only location that satisfies both conditions.

julia> A = repeat(-1:1',1,3)
3×3 Array{Int64,2}:
 -1  -1  -1
  0   0   0
  1   1   1

julia> B = A'
3×3 LinearAlgebra.Adjoint{Int64,Array{Int64,2}}:
 -1  0  1
 -1  0  1
 -1  0  1

Maybe there is some way to do this with findall but the following was not working for me:

ind = findall(x -> x>0,A) && findall(x -> x>0,B)

cgeoga · December 3, 2020, 11:51pm

I make no claim that this is fastest, but intersect instead of && seems to fit the bill: ind = intersect(findall(x -> x>0,A), findall(x -> x>0,B).

dalarev · December 4, 2020, 12:38am

Nice, was not aware of intersect. Thanks!

nilshg · December 4, 2020, 7:33am

I would probably write it as

julia> findall((A .> 0) .& (B .> 0))
1-element Vector{CartesianIndex{2}}:
 CartesianIndex(3, 3)

(or forego the findall if you can work with the Boolean mask)

Sukera · December 4, 2020, 2:15pm

This creates unnecessary extra closures - >(0) already creates a curried version

nilshg:

I would probably write it as
julia> findall((A .> 0) .& (B .> 0))
1-element Vector{CartesianIndex{2}}:
 CartesianIndex(3, 3)
(or forego the findall if you can work with the Boolean mask)

This has to allocate the resulting mask, which is potentially slower (since it has to iterate twice, once to create the mask and once to find the indices).

julia> using BenchmarkTools

julia> @btime intersect(findall(x -> x>0,$A), findall(x -> x>0,$B))
  885.417 ns (19 allocations: 1.84 KiB)
1-element Array{CartesianIndex{2},1}:
 CartesianIndex(3, 3)

julia> @btime intersect(findall(>(0),$A), findall(>(0),$B))
  868.421 ns (18 allocations: 1.83 KiB)
1-element Array{CartesianIndex{2},1}:
 CartesianIndex(3, 3)

julia> greater0(A,B) = findall((A .> 0) .& (B .> 0))
greater0 (generic function with 1 method)

julia> @btime greater0($A,$B)
  112.854 ns (3 allocations: 240 bytes)
1-element Array{CartesianIndex{2},1}:
 CartesianIndex(3, 3)

If you want to go for speed, something like this might be best, which stays fast even with huge arrays:

julia> findGreater0(A,B) = [ idx for idx in CartesianIndices(A) if @inbounds (A[idx] > 0 && B[idx] > 0) ]
findGreater0 (generic function with 1 method)

julia> A = repeat(-1:1',1,3)
3×3 Array{Int64,2}:
 -1  -1  -1
  0   0   0
  1   1   1

julia> B = A';

julia> @btime findGreater0($A,$B)
  93.085 ns (3 allocations: 240 bytes)
1-element Array{CartesianIndex{2},1}:
 CartesianIndex(3, 3)

julia> B = copy(A'); # manifest the adjoint

julia> @btime findGreater0($A,$B)
  95.364 ns (3 allocations: 240 bytes) # staying fast!
1-element Array{CartesianIndex{2},1}:
 CartesianIndex(3, 3)

julia> A = zeros(Int,10000,10000);

julia> A[1,:] .= -1;

julia> A[end,:] .= 1;

julia> B = A';

julia> @btime greater0($A,$B)          
  1.230 s (5 allocations: 11.93 MiB)
1-element Array{CartesianIndex{2},1}:
 CartesianIndex(10000, 10000)

julia> @btime findGreater0($A,$B)
  116.800 ms (3 allocations: 240 bytes)
1-element Array{CartesianIndex{2},1}:
 CartesianIndex(10000, 10000)

Julias Adjoint is lazy, so that’s why creating and iterating the bitmask becomes very sluggish - the memory is not accessed in an optimal manner. Iterating with CartesianIndices skips that problem entirely.

Manifesting the adjoint makes it almost as fast as the generator version - the difference being the extra iteration that has to be done as well as the big allocation (that might not be possible if we run out of memory):

julia> B = copy(A'); # manifest the adjoint (10000x10000 array)

julia> @btime greater0($A,$B)
  128.834 ms (5 allocations: 11.93 MiB)
1-element Array{CartesianIndex{2},1}:
 CartesianIndex(10000, 10000)

Note that there’s not really much more that could be done here - since we don’t know when the condition will be true, SIMD will be a little hard to do since it’s effectively a coin flip whether or not a given index will be included or not.

dalarev · December 4, 2020, 3:38pm

Thanks for this. I’ll be using this function since my arrays won’t be large.

leon · October 23, 2021, 3:49pm

Best solution for speed for sure! Many thanks for sharing.

Topic		Replies	Views
How to obtain indices of an array satisfying boolean condition New to Julia	20	21441	December 1, 2022
Finding the index where the values in one array are greater than the values in another array New to Julia	2	3034	July 3, 2020
Array comparison New to Julia	4	3060	November 8, 2020
Indices of intersection of two arrays General Usage arrays	25	10740	April 19, 2024
Findall() but returning a bool array General Usage question	3	573	August 6, 2021

Find indices meeting multiple criteria searching from multiple arrays

Related topics