How to find the index of the maximum value of an array that meets certain conditions

I’ve got a question involving two parts. I can do each part on its own, but I don’t know the best way to put the two parts together.

I want to find the index of the maximum value in an array, limited to elements of the array that meet certain conditions.

I can find all the indices that meet two conditions:

condition1 = Bool.(round.(rand(100)));
condition2 = Bool.(round.(rand(100)));
conditional_indices = findall(condition1 .&& condition2)

or I can the index of the maximum value in an array:

y = randn(100); 
ymax, ymax_ind = findmax(y)

but how do I find the array index corresponding to the maximum y value that meets condition1 and condition2?

I think you can just do the following:

ymax_ind = conditional_indices[argmax(y[conditional_indices])]
ymax = y[ymax_ind]

(edited to add… check that this works out)

@assert condition1[ymax_ind] && condition2[ymax_ind]
@assert all(ymax .>= y[conditional_indices])

This is a typical example of what I think you want, if not then simply discard it.

julia> Random.seed!(1234);

julia> arr = rand(Int, 100);

julia> cond1 = iseven.(arr);

julia> cond2 = signbit.(arr);

julia> m = maximum(arr[cond1 .&& cond2])
-15074387398957680

julia> findmax(x -> isequal(x, m), arr)[2]
60

julia> arr[60]
-15074387398957680

If you would probably do a lot of these computations it would be best to wrap it inside a function:

julia> function compute(array, cond1::Function, cond2::Function)
           m = maximum(array[cond1.(array) .&& cond2.(array)])
           f = findmax(elem -> isequal(elem, m), array)[2]
           return f
       end
compute (generic function with 1 method)

julia> Random.seed!(1234);

julia> compute(rand(Int, 100), iseven, signbit)
60

Your condition1 and condition2 are arrays, but presumably conditions are originally predicates (functions) of array elements. Then:

julia> using Skipper

# define conditions and array:
julia> cond1 = x -> x > 0
julia> cond2 = isodd
julia> y = rand(Int, 100);

# find ymax, ymaxind:
julia> findmax(skip(x -> !(cond1(x) && cond2(x)), y))
(8915199774974159241, 22)

2 Likes

Why does composing skips take nearly twice as long?

findmax(skip(!cond1, skip(!cond2, y)))

test:

julia> @btime findmax(skip(!x->$cond1(x)&&$cond2(x), $y));
  167.720 ns (0 allocations: 0 bytes)

julia> @btime findmax(skip(!$cond1, skip(!$cond2, $y)));
  327.753 ns (0 allocations: 0 bytes)

Maybe because both do short-circuiting. But they do it in flipped order, and it is order-sensitive. Try:

findmax(skip(!cond2, skip(!cond1, y)))

They’re different, but not earth-shatteringly so:

julia> @btime findmax(skip(!$cond2, skip(!$cond1, $y)));
  293.233 ns (0 allocations: 0 bytes)

No specific reason, just small overheads here and there that weren’t removed by the compiler.
Sprinkled a few inlines and boundschecks — now both variants are faster, and times are close to each other:

# Skipper@0.1.6:
julia> @btime findmax(skip(!$cond1, skip(!$cond2, $y)));
  363.904 ns (0 allocations: 0 bytes)
julia> @btime findmax(skip(!x->$cond1(x)&&$cond2(x), $y));
  181.716 ns (0 allocations: 0 bytes)

# Skipper@0.1.7:
julia> @profview @btime findmax(skip(!$cond1, skip(!$cond2, $y)));
  128.049 ns (0 allocations: 0 bytes)
julia> @btime findmax(skip(!x->$cond1(x)&&$cond2(x), $y));
  118.835 ns (0 allocations: 0 bytes)

Btw, a “positive” version of skip would be cleaner in these cases, but it’s not clear what name the function should have. It’s not filter because the semantics are different…

1 Like

In this comment I probe the idea that maybe this is a mistake.

Namely, the semantics of lazy Iterators.filter, which ignores the original collection’s indices, are based on an understanding of what “filtering” means informed by eager Base.filter, which conflates the idea of filtering with the act of collecting. However, if our starting point had instead been lazy filter, maybe this decision would have gone the other way—maybe the lazy filter would preserve the original collection’s indices.

Although, maybe the name filter is too tainted to go back now. That said, it’s not an especially informative name anyway—it doesn’t specify whether to keep or to discard the values for which the predicate returns true. In electrical engineering, we use the word pass to describe filters (e.g. band-pass filter, low-pass filter, etc.), but I’m finding myself enamored by @jar1’s suggestion of keep.