`findmax` and friends: confusing behaviour to be introduced in 1.7

It seems like a lot of standard libraries functions and Base functions could be moved into packages as well. That would be a nice outcome for 2.0 imho.

2 Likes

A possible solution is to lower

f(g.(x))

to something like

applybroadcasted(f, broadcasted(g, x))

with the default implementation

applybroadcasted(f, args...; kwargs...) = f(map(materialize, args)...; kwargs...)

Things like sum(f.(xs)) will be efficient if it implements applybroadcasted(::typeof(sum), ..).

The downside of this approach is that

sum(f.(xs))

and

mysum(x) = sum(x)
mysum(f.(xs))

are not equivalent, performance-wise.

So, it’s probably better to just have a syntax for the non-materializing broadcast broadcasted(f, x) (which is already discussed in GitHub) to avoid such a performance cliff.

1 Like

While I also find the current behavior in 1.7 strange, and tend to think that the equivalence between findmax(identity, itr) and findmax(itr) is indeed desirable, it’s also true that the behavior expected in the OP can be obtained via

findmax(x -> f(x[2]), enumerate(itr))

It’s not apparent how to do the inverse (express the behavior currently in 1.7 in terms of the one expected in the OP). So it seems at least that the behavior in 1.7 is not less powerful than the one desired.

But then again, I agree the API would be much more consistent with the proposal in the OP.

So overall :man_shrugging:t2:

Edit:

@aplavin that’s not really true in the context of finding values, is it? If you can find a value satisfying certain criteria, you can also obtain its index at the same time for arbitrary iterable collections, via enumerate (see my snippet above).

Conversely, knowing the index of something in a general iterable collection doesn’t mean that you can get to it efficiently (in case iterating the sequence is expensive)

Sure, my phrase

is not true in the strict sense. Julia is a language general enough to implement any findmax-like function, so it is possible even without stdlib.

However, if one follows the logic in your comment, then findmax and argmax functions are not needed at all. They can trivially be replicated with findmax(A) == maximum(reverse, pairs(A)).

Btw, you used enumerate instead of pairs in your example, which is an additional confirmation for this being not really intuitive. It’s easy to make such mistake and get wrong results with arrays and other collections.

1 Like

I actually intended to use enumerate :smile:

enumerate always starts at 1, regardless of the collection, so it’ll give wrong answers here sometimes. E.g.

julia> using OffsetArrays

julia> v = OffsetArray([:a, :b, :c, :d, :e], -2:2)
5-element OffsetArray(::Vector{Symbol}, -2:2) with eltype Symbol with indices -2:2:
 :a
 :b
 :c
 :d
 :e

julia> collect(enumerate(v))
5-element Vector{Tuple{Int64, Symbol}}:
 (1, :a)
 (2, :b)
 (3, :c)
 (4, :d)
 (5, :e)

julia> collect(pairs(v))
5-element OffsetArray(::Vector{Pair{Int64, Symbol}}, -2:2) with eltype Pair{Int64, Symbol} with indices -2:2:
 -2 => :a
 -1 => :b
  0 => :c
  1 => :d
  2 => :e
7 Likes

One thing is that enumerate starts at 1, but also it does not preserve CartesianIndices for higher-dimensional arrays. So you should definitely use pairs, not enumerate.

4 Likes

Thinking about this a bit more: I think that keeping the function in the front is beneficial. I this API is ever redesigned, I would just have

findmax(f, itr)

which would return the element x of itr that maximizes f(x). This would allow things like

i, x = findmax(last, pairs(vec))

and similar. Don’t need the index? Just use vec directly, etc.

findmax(f, itr)

which would return the element x of itr that maximizes f(x) .

Is this not already the semantics of maximum(f,itr)?

Nope, maximum returns f(x), not x.

But when the function itself is expensive, the value could also be useful, so

(i, x), gx = findmax(g ∘ last, pairs(xs)) 

may just cover all use cases (except for dims, which is orthogonal).

I any case, I would have to spend the findmax(f, x) syntax on what is currently in master, it is much too valuable for other stuff (that needs to be figured out). So, in the meantime, I think that PR should be reverted.

2 Likes

Increasing verbosity like the above makes it closer and closer to just using maximum:
gx, x, ix = maximum((ix, x) -> (g(x), x, ix), pairs(xs)).

Why not add a keyword argument by=f to such functions (if redesigning from scratch)? It becomes crystal clear that maximum(A, by=f) returns the same “kind” of value as maximum(A) does - that is, an element of A - but orders values by f(a) instead of a.
Same for idxmax(A) - imaginary function to return index of maximum, idxmax(A, by=f) - index of element in A with maximal f(a).
I think this covers common cases and reads unambiguously.

5 Likes

An important advantage of a positional argument is that you can dispatch on it, so you can eg make certain special cases fast.

That said, I am not arguing for this design here, only for keeping the options open by reverting #35316 as you suggested. Then, ideally, I think that these design issues should be hammered out in a package.

4 Likes

I’m not that familiar with specifics of how dispatch works internally. I thought there are no fundamental issues with dispatching on a by-like argument:

maximum(A, by=identity) = _maximum_impl(A, by)

_maximum_impl(A, ::typeof(identity)) = ...
_maximum_impl(A, ::typeof(...)) = ...

What are issues with such an approach?

Sure, that’s what I also agree with (:

Having to expose the internal _maximum_impl as part of your API. Feasible, but icky.

I’ve just noticed that the original GH issue Unintuitive findmin and findmax · Issue #39203 · JuliaLang/julia · GitHub is closed.
The current fix makes findmax(f, A) == findmax(map(f, A)) (as expected), but argmax(f, A) == A[argmax(map(f, A))] (i.e. no change from previous 1.7 nightlies).
That’s especially weird considering that the GH issue is primarily about argmax (not findmax) being unintuitive. The issue was renamed to …findmax… just before this fix, basically.
As I said already in this thread, the triage comment

triage thinks 1 arg argmax is correct, but findmin with a function is broken and needs to be fixed before 1.7.

also looks strange in the context. Of course, 1-arg argmax/findmax/… are to stay for all julia 1.x versions, and they were not the object of discussion here.

Basically all consistency arguments here and in that GH discussion apply the same to both findmax(f, A) and argmax(f, A), there is no difference. So, the mathematical-optimization argmax notation clearly conflicts with the current argmax meaning in julia.
Having that in mind, why not just keep argmax(f, A) undefined as it always was? I argue this would be strictly better than the very confusing behaviour:

val, ix = findmax(A)
ix = argmax(A)

val, ix = findmax(f, A)
ix = argmax(f, A)  # No! Can easily lead to errors that are hard to find.

This confusion results from conflating two independent and different meanings into the same function.

3 Likes

I posted a comment on github Unintuitive findmin and findmax · Issue #39203 · JuliaLang/julia · GitHub as another attempt to raise this issue and actually get it fixed.
Quoting myself from there (partly overlaps with previous comment here):

This issue is marked “closed”, but is fixed only partially (by @jeff.bezanson’s commit): for findmax + findmin , not for argmax + argmin .
The triage comment by @Oscar_Smith does not address the argmax(f, A) issue at all.
It would be really unfortunate if Julia is stuck with this error-prone behaviour “forever”:

val, ix = findmax(A)
ix = argmax(A)

val, ix = findmax(f, A)
ix = argmax(f, A)  # No! Can easily lead to errors that are hard to find.

Why introduce argmax(f, A) at all if there is a conflict between two potential meanings? Currently this is a simple function argmax(A) to find index of the maximum. There are lots of users already surprised and confused by argmax(f, A) being defined but not equivalent to argmax(f.(A)) : see this thread, https://discourse.julialang.org/t/findmax-and-friends-confusing-behaviour-to-be-introduced-in-1-7, and there were multiple questions on slack.

But looks like Julia will indeed be stuck with this…

4 Likes

I think findmax(f, A) and argmax(f, A) are now fine and reasonable. The problem is argmax(A), which, imho, should not exist. It doesn’t make any sense, as argmax is (well, should be) an operation on f, not on A, and as such, a default value for f such as identity or anything else seems wrong.

argmax(A) should be removed and replaced with something with a better name (in the long term, of course.)

I generally agree with you!
But argmax(A) and this definition of argmax(f, A) should never coexist in the language together. So, either argmax(A) should be deprecated at the very same time argmax(f, A) is introduced (+a prominent notice added for those who are already used to current argmax), or argmax(f, A) shouldn’t be added at all.

3 Likes

Thanks to JuliaHub, it turned out to be easy to search for argmax definitions across the whole ecosystem. And there are already two packages with argmax(f, A) == findmax(f, A)[2]! This corresponds to argmax(f.(A)) and contradicts the planned 1.7 change.
These packages are the widely-used SentinelArrays (SentinelArrays.jl/chainedvector.jl at v1.3.3 · JuliaData/SentinelArrays.jl · GitHub), and less-widely-used PolynomialRings (PolynomialRings.jl/MonomialOrderings.jl at v0.7.3 · tkluck/PolynomialRings.jl · GitHub).
Meanwhile, there are literally no packages (excluding Compat.jl of course) that make argmax(f, A) to mean what it is proposed to mean in 1.7.

Also, there are no usages of argmax in base outside of tests.

I believe these are really strong arguments to keep the planned argmax(f, A) definition out of Base.

7 Likes

Wow, there is even more “prior art” on this topic! Other julia code examples that are not registered, from first few pages of github search:

Here I only show packages, didn’t really look into less-organized code. All these references assume the argmax(f, A) == argmax(f.(A)) meaning.

2 Likes