I would like to find indices of a 1D array wich verify a given condition. I think that the way to go is to use the function findall. However I have found this tricky way to be actually faster and has way less memory allocations. Any thought on that?
My guess is that findall is optimized under the assumption that few entries match. As a result, it appends to create a list of solutions, which is due and allocates more if everything matches. In that case, yours will allocate an array of size le, and will be much slower.
would increase performance in your case quite a bit, although one would still have to benchmark, how this performance differs for different array sizes and percentages of matched keys. Definitely worth investigating further.
Hereās an implementation (based on filter) thatās faster for your example:
function findall2(f, a::Array{T, N}) where {T, N}
j = 1
b = Vector{Int}(undef, length(a))
@inbounds for i in eachindex(a)
b[j] = i
j = ifelse(f(a[i]), j+1, j)
end
resize!(b, j-1)
sizehint!(b, length(b))
return b
end
And hereās the benchmark timings. findall2 is about 8x faster than findall for this example.
function findall3(f, a::Array{T, N}) where {T, N}
j = 1
b = Vector{Int}(undef, length(a))
@inbounds for i in eachindex(a)
@inbounds if f(a[i])
b[j] = i
j += 1
end
end
resize!(b, j-1)
sizehint!(b, length(b))
return b
end
findall3 is consistenly faster than findall and findall2 except for i = 0 at l = 10000 and l = 100000, where findall2 is faster (I donāt understand why).
I believe that findall & friends are best for use cases which are relatively āsparseā, ie the number of things to be found is small relative to the total. This is of course a very vague definition, but in practice the issue is the same as for sparse matrices (ie when it becomes better to treat it as a dense matrix).
I find it interesting to read the elegant solutions above, but IMO using and optimizing findall for this use case is not the right solution. As @mleprovost observed in the original post, for a ādenseā problem broadcasting may be better, and this is fine.
Note that findall3 is consistently faster than findall both for sparse problems (approx 0.0005 hits for finding normally distributed values above 3.3) and dense problems (approx 0.9995 hits for normally distributed values above -3.3).
Indeed findall3 is a very nice implementation (and should work for all iterables, so I guess you could remove the type restriction). This is at the cost of allocating more (much more for long sequences), but that is not so costly to work against it.