Determine whether an element is in an array

jisutich · February 3, 2024, 4:44am

Hi

I am wondering how to achieve something similar to numpy’s where function in Julia. Suppose I have an array A and want to determine whether any element of A lies in the interval [a-1e-12, a+1e-12]. If so, I also want to get its position. Thanks

mkitti · February 3, 2024, 4:55am

numpy.where

Return elements chosen from x or y depending on condition.

I am unclear whether you want to get all elements in the array meeting the condition or not. filter will do that.

If you just want to know if there is any element of A that meets the condition, then any is what you want.

If you want to get the position, then you want findfirst or findall.

julia> A = rand(100);

julia> a = A[50]
0.1703964738775896

julia> filter(x -> a-1e-12 <= x <= a+1e-12, A)
1-element Vector{Float64}:
 0.1703964738775896

julia> any(x -> a-1e-12 <= x <= a+1e-12, A)
true

julia> findfirst(x -> a-1e-12 <= x <= a+1e-12, A) # thanks abraemer
50

Also, you may want to consider isapprox or ≈.

julia> findall(≈(A[50]), A)
1-element Vector{Int64}:
 50
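
One thing to keep in mind: with no keyword arguments, isapprox (and ≈) uses a relative tolerance of about sqrt(eps) for Float64, which is looser than the absolute 1e-12 window in the question. A minimal sketch of the difference, using made-up data rather than the random array above:

```julia
# Illustrative values only; the third and fourth entries are near a = 0.25.
A = [0.1, 0.25, 0.25 + 5e-13, 0.25 + 1e-10, 0.9]
a = 0.25

# Default ≈ uses a relative tolerance, so the entry that is 1e-10 away also matches.
loose = findall(≈(a), A)

# An explicit absolute tolerance reproduces the [a-1e-12, a+1e-12] window.
idx = findall(x -> isapprox(x, a; atol=1e-12, rtol=0), A)
```

Here loose contains positions 2, 3 and 4, while idx contains only 2 and 3.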

abraemer · February 3, 2024, 4:56am

I am not totally sure what the a is in your condition, but you should be able to use something like findfirst. That function takes a function as its first argument and an array/iterable as its second, and gives you the first index at which the function returns true for the value in that position. E.g.

A = rand(100)
findfirst(x -> x>0.9, A)

Gives the index of the first element that is larger than 0.9.

There’s also findall if you want the locations of all elements fulfilling the condition.
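
A small sketch with fixed values (chosen here only for illustration) shows both functions, including the case where nothing matches: findfirst then returns nothing and findall returns an empty vector, so it is worth checking before indexing.

```julia
A = [0.2, 0.95, 0.4, 0.99]

first_hit = findfirst(x -> x > 0.9, A)   # index of the first match
all_hits  = findall(x -> x > 0.9, A)     # indices of every match

# No element exceeds 2, so there is no index to return.
no_hit  = findfirst(x -> x > 2, A)
no_hits = findall(x -> x > 2, A)

# Guard against `nothing` before using the index.
value = first_hit === nothing ? missing : A[first_hit]
```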

bertschi · February 3, 2024, 9:11am

It looks like numpy.where is actually two functions, based on whether you use it with three arguments or a single one. From the docs:

numpy.where(condition, [x, y, ]/ )

Return elements chosen from x or y depending on condition.

Note: When only condition is provided, this function is a shorthand for np.asarray(condition).nonzero(). …

In Julia the analog to the three argument version would be (broadcasted) ifelse, i.e.:

x = rand(10)
ifelse.(x .> 0.5, 2 .* x, - 0.5)  # np.where(x > 0.5, 2 * x, - 0.5)

The one-argument version behaves like findall as explained already.
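
To make the two correspondences concrete, here is a sketch on fixed data (values are illustrative). Two differences from NumPy are worth noting: Julia indices are 1-based, and for a matrix findall returns a vector of CartesianIndex values in column-major order rather than a tuple of index arrays.

```julia
x = [0.1, 0.7, 0.5, 0.9]

# Three-argument np.where(x > 0.5, 2 * x, -0.5)
y = ifelse.(x .> 0.5, 2 .* x, -0.5)

# One-argument np.where(x > 0.5)
idx = findall(x .> 0.5)

# On a matrix, findall yields CartesianIndex values.
M = [0.1 0.7;
     0.9 0.2]
cidx = findall(M .> 0.5)
rows = getindex.(cidx, 1)   # row numbers of the matches
cols = getindex.(cidx, 2)   # column numbers of the matches
```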

Benny · February 3, 2024, 5:23pm

Your description actually covers multiple NumPy functions. Given scalar a and array A:

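You would do 2 elementwise comparisons and combine them: b = ((a-1e-12) <= A) & (A <= (a+1e-12)). (Python's chained form (a-1e-12) <= A <= (a+1e-12) raises a ValueError on arrays with more than one element, because chaining implicitly converts the first comparison to a single bool. If A is not a NumPy array, you would also need numpy.less_equal for automatic array conversion.) The Julia equivalent is to broadcast the 2 scalar comparisons, and here chaining does work: b = (a-1e-12) .<= A .<= (a+1e-12).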
To find if any of the comparisons computed true, you do numpy.any(b) to reduce logical-or over all the elements. The Julia equivalent is any(b), though it instead stops on the first true. (If you want to replicate numpy.any along dimensions specified in the axis argument, any(b; dims=...) does that, but it keeps the reduced dimensions with length 1. I think mapslices(any, ...) or any.(eachslice(...)) would also work, though I haven’t tested examples.)
To get the indices where comparisons computed true, you use the 1-argument numpy.where(b). The Julia equivalent is the 1-argument findall(b).
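
The three steps above can be sketched on fixed data as follows. The values, and the small matrix M used for the axis-wise reduction, are made up for illustration.

```julia
a = 0.5
A = [0.1, 0.5, 0.5 + 3e-13, 0.7]

# Step 1: elementwise mask (chained dotted comparisons broadcast together).
b = (a - 1e-12) .<= A .<= (a + 1e-12)

# Step 2: is any element inside the window?
found = any(b)

# Step 3: positions (1-based) of the elements inside the window.
positions = findall(b)

# Reduction along one dimension, analogous to numpy.any(..., axis=0).
M = [0.1 0.5;
     0.2 0.3]
bm = (a - 1e-12) .<= M .<= (a + 1e-12)
per_column = vec(any(bm; dims=1))   # vec drops the length-1 dimension
```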

The 2-argument findall is for lazily testing elements of an array that doesn’t necessarily contain Bool; this saves allocating a Bool container. The 3-argument numpy.where does something much different from the 1-argument version, and bertschi provided the right equivalent: broadcasted ifelse. The advantage of Julia’s broadcasting is that when you have a tree of dotted function calls like bertschi’s example, they fuse into one kernel function that is broadcast once over the input containers, with no intermediate allocating broadcasts. Compare that with NumPy, where you need to allocate x > 0.5 and then 2*x before passing them to numpy.where. You can check Meta.@lower <insert broadcasting code here> to see this fusion; there should be a series of broadcasted calls setting up the tree followed by only 1 materialize call performing the broadcast loop.
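
As a rough sketch of both points (fixed data, names chosen here for illustration): the 2-argument findall gives the same indices as building the mask first, and a fused dotted expression can write straight into a preallocated array with .= so that the result itself is not newly allocated either.

```julia
x = [0.2, 0.6, 0.5, 0.9]

# 2-argument form tests each element lazily; 1-argument form needs the mask first.
lazy_idx = findall(>(0.5), x)
mask_idx = findall(x .> 0.5)

# Fused broadcast written in place into an existing output array.
out = similar(x)
out .= ifelse.(x .> 0.5, 2 .* x, -0.5)

# Lowered form of the same expression, for inspecting the broadcasted/materialize
# calls; the printed layout differs between Julia versions.
lowered = Meta.@lower ifelse.(x .> 0.5, 2 .* x, -0.5)
```

Printing lowered in the REPL is the easiest way to look at the call tree.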

Bear in mind that NumPy’s automatic conversion of array-like inputs to NumPy arrays prior to the primary computation means there’s no full Julia equivalent because Julia instead is implemented to work on many input types and return appropriate output types. If you need to stick to a particular type like Array, then you have to be more careful about your input types, possibly even manually convert them; Julia’s collect would be the equivalent of NumPy’s numpy.array. I can’t recommend an equivalent function to numpy.asarray that avoids a copy when no conversion is needed for an input NumPy array; convert(Array, x) does not work on as many input types as collect does, so I’d probably go with if x isa Array x else collect(x) end.
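
A minimal sketch of that last suggestion, wrapped in a helper. The name as_array is hypothetical (it is not a Base function); it returns an existing Array untouched and collects anything else.

```julia
# Hypothetical helper, not part of Base: closest analogue to numpy.asarray
# under the assumption that "no copy for Array inputs" is the goal.
as_array(x) = x isa Array ? x : collect(x)

A = [1, 2, 3]
same = as_array(A)                          # already an Array: returned as-is
from_range = as_array(1:3)                  # a range is materialized
from_gen = as_array(i^2 for i in 1:3)       # a generator works with collect too
```

The identity check as_array(A) === A confirms that no copy is made for an Array input.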