# How to obtain indices of an array satisfying boolean condition

I am trying to find the optimum way of obtaining the indices of an array which satisfy a certain boolean condition. For example, if `a` is of type `Array{Float64,1}`, i wish to obtain an array of indices i for which `a[i]>0`. This task can be easily accomplished by using a loop, but that would not be optimal.

I have looked extensively in the documentation for the Base class, but didnt find something that achieves what I am aiming for. There are some nice functions such as `filter(f, a::AbstractArray)` which does not return the indices but the elements at the indices.

1 Like

`a[a .> 0]` ?

1 Like
``````julia> A = randn(10)
10-element Array{Float64,1}:
-0.10597385263640592
0.6804124752504479
-0.745552871357189
-0.8316766727532244
-0.4136441092819935
-1.7795858830337792
-0.7049887890183149
1.495347684116737
-0.05797357451997173
0.1689282220877404

julia> findall(>(0), A)
3-element Array{Int64,1}:
2
8
10
``````
5 Likes

A loop would (presumably) be optimal for the computer, but I agree that less typing should be required, and hence itās not necessarily optimal for the human.

Although you probably would have spent less time writing the loop than looking in the docs and writing this post!

3 Likes

I have read that Juliaās inline commands such as `findall` are highly optimized and always recommended over explicit loops.

1 Like

This still does not give the indices. But it was useful knowing about this notation. Thank you.

Thank you for suggesting `findall`. It worked very well. However, I think there is a slight mistake in your notation, since I get

`````` findall(>(0), a)
ERROR: MethodError: no method matching >(::Int64)
Closest candidates are:
>(::Any, ::Any) at operators.jl:286
>(::BigFloat, ::BigFloat) at mpfr.jl:756
Stacktrace:
[1] top-level scope at none:0
``````

However, the following syntax works.

`````` findall(i->(i>0.5), a)
````````

Single-argument comparisons were added in v1.2, so you must be using an older release. See https://github.com/JuliaLang/julia/blob/master/HISTORY.md#new-library-functions-2

1 Like

I see ! Thanks

I am curious where you read this. Loops are very efficient in Julia when written right (eg using `eachindex` etc).

4 Likes
``````julia> v = randn(1000);

julia> function myfindall(condition, x)
results = Int[]

for i in 1:length(x)
if condition(x[i])
push!(results, i)
end
end
return results
end
myfindall (generic function with 1 method)

julia> @btime findall(x -> x > 0, \$v);
3.438 Ī¼s (13 allocations: 8.47 KiB)

julia> @btime myfindall(x -> x > 0, \$v);
3.707 Ī¼s (9 allocations: 8.33 KiB)
``````
3 Likes

While true, the Base implementations might use tricks and optimizations that are not always obvious which often make them faster than the naive for loop.

6 Likes

But the Base implementations arenāt āprivilegedā. With some experience anyone could implement their own version that matches Base, or does even better if they have special knowledge of the specific problem.

So, @iamsuddhasattwa, you shouldnāt dismiss the possibility of writing your own implementation, though `findall` may be what you were looking for this time.

2 Likes

Hereās an example of how you can beat the Base implementation of `findall` for a special case. Letās say, for example, that you know that most of the elements in your array satisfy your condition. Then it can be more efficient to allocate a large output array in the first place rather than growing it dynamically, like `findall` does:

``````function myfindall(p, X)
out = Vector{Int}(undef, length(X))
ind = 0
@inbounds for (i, x) in pairs(X)
if p(x)
out[ind+=1] = i
end
end
resize!(out, ind)
return out
end
``````

Thereās nothing magical going on, just ordinary code you can write yourself. Now letās benchmark:

``````julia> @benchmark findall(>(0.1), r) setup=(r=rand(1000))
BenchmarkTools.Trial:
memory estimate:  16.55 KiB
allocs estimate:  14
--------------
minimum time:     8.982 Ī¼s (0.00% GC)
median time:      10.557 Ī¼s (0.00% GC)
mean time:        13.389 Ī¼s (17.23% GC)
maximum time:     8.171 ms (99.39% GC)
--------------
samples:          10000
evals/sample:     1

julia> @benchmark myfindall(>(0.1), r) setup=(r=rand(1000))
BenchmarkTools.Trial:
memory estimate:  7.94 KiB
allocs estimate:  1
--------------
minimum time:     746.443 ns (0.00% GC)
median time:      1.241 Ī¼s (0.00% GC)
mean time:        2.408 Ī¼s (48.33% GC)
maximum time:     73.695 Ī¼s (98.12% GC)
--------------
samples:          10000
evals/sample:     122
``````

So, my homemade solution beats the builtin `findall` by a factor of 10 in this special case. (Edit: I moved the `@inbounds` for an even bigger speedup.)

For the record, here is the Base implementation of `findall`:

``````findall(testf::Function, A) = collect(first(p) for p in pairs(A) if testf(last(p)))
``````

See how simple it is, almost trivial. Itās actually amazing.

5 Likes

I am not sure what you consider a ānaiveā loop, but most loops which just traverse something in a memory-friendly order are usually on par with what `Base` does when the performance tips are kept in mind. For the rest, yes, some tricks can be necessary. The majority of the constructs that make `Base` implementations fast are exposed.

I would say that loops are usually one of the fastest solution in Julia, and the only reason not to apply them is code clarity: more abstract code is usually easier to understand.

5 Likes

Thanks! That was very eye opening.

Just to give some examples of what I mean:

`minimum`:

`findall` on a `BitArray`:

I would disagree with that. There are many optimization opportunities over looping in memory order and in quite a few cases Base code goes that extra step to squeeze out a bit more performance.

4 Likes

The `findall(::BitArray)` optimization really shines here compared to the iterator-based `findall`, despite the need to allocate an intermediate `BitArray`:

``````julia> @btime findall(A .> 0.1) setup=(A=rand(1000))
1.833 Ī¼s (4 allocations: 11.30 KiB)

julia> @btime findall(>(0.1), A) setup=(A=rand(1000))
8.533 Ī¼s (14 allocations: 16.55 KiB)
``````

Itās faster for almost every case I tested (the only exception being small arrays where few elements satisfy the condition). The advantage likely boils down to SIMD execution of the comparison, and avoiding the need to repeatedly grow/copy the output array. Itād be great for the iterator-based implementation to specialize on strided numeric arrays to take advantage of SIMD.

1 Like

The iterator-based approach would be sped up significantly by resolving issues #32035 and #32320, but those havenāt been touched since last spring.

1 Like