`findmax` and friends: confusing behaviour to be introduced in 1.7

Several people, including myself, have tried to raise this issue during last months. There were multiple posts on slack, and even a GH issue (https://github.com/JuliaLang/julia/issues/39203). I’ll try to raise it once more, here on discourse, hoping that it gets more traction.

The findmax(A) function, which everyone is used to, returns the maximum value in the array A and its index. However, Julia 1.7 will introduce a findmax(f, A) version, that doesn’t return any index at all:

julia> A = [5, 1, 3, 8, 0];

julia> maxval, maxix = findmax(A)
(8, 4)
julia> A[maxix]
8

# but:
julia> findmax(identity, A)
(8, 8)

There are at least two independent reasons why this feels really weird, confusing, and even surprising (as evidenced by multiple questions on slack).

First: we still have no effecient way to find array index of the maximal value wrt some function! That is, something like (imaginary) findmax(["a", "bc", "d"], by=length). For arrays and many other collections, obtaining index is a more fundamental operation: easy to go from index to value, hard to go the other way.

Second: the findmax(f, A) form, to be introduced in 1.7, is completely inconsistent with both findmax(A) and with other reductions that take a function and an array:

julia> findfirst([false, false, true])
3
julia> findfirst(identity, [false, false, true])
3  # same as above

# but:
julia> findmax([false, false, true])
(true, 3)
julia> findmax(identity, [false, false, true])
(true, true)  # definitely not the same

Even the docstrings for these two forms have literally nothing in common:

findmax(A): Return the maximal element of the collection itr and its index or key.
findmax(f, A): Returns a pair of a value in the codomain (outputs of f) and the corresponding value in the domain (inputs to f) such that f(x) is maximised.

Currently, any (?) reduction of the form r(f, A) in base julia is just a more efficient form of r(map(f, A)).

Several different solutions were suggested in the linked GH issue.
I believe the findmax(f, A) form to-appear-in-1.7 should be removed anyway while it’s not released, and one of those suggested approaches implemented instead, either in 1.7 or later.

NB: discussion above applies to the whole findmax, findmin, argmax, argmin function family.

As I understand PR https://github.com/JuliaLang/julia/pull/35316, the motivation for the soon-to-be-released findmax(f, A) form is based on the mathematical interpretation of argmax(f, A) == argmax f(x) over x in A. This is, however, not the meaning currently attached to this function family in julia. No optimization package overrides argmax to perform maximization of f over some domain, like argmax(f, 0..Inf). And findmax(f, A) doesn’t even have this similarity to any mathematical notation.

35 Likes

Hmm…I get the new findmax and I get the mathematical reasons so I dont really see a problem with it. Multiple dispatch means you can share a name.

The array version can be replicated using the function version using (maxval, maxind) = findmax(i->A[i], 1:length(A)) The identity function isn’t the right function to use to compare.

What are mathematical reasons for the 1.7 findmax(f, A) again? As I said,

findmax(f, A) doesn’t even have this similarity to any mathematical notation.

Technically, of course, unrelated functions can share a name. But it would be weird to override e.g. Base.length(x::Interval) = x.right - x.left: original length function has a completely different meaning. Same with current findmax(A) vs 1.7 findmax(f, A).

But it is the right function for all other reductions in base julia. Why make an exception here?

2 Likes

This is only equivalent to findmax(A) for 1-based 1d arrays, btw. findmax(i->A[i], keys(A)) would be more general, but inefficient for dicts.

I have to admit that I also find this a bit confusing. I do not doubt that in some cases findmax(f, itr) is useful, but I am wondering if it should be renamed something else. This looks like a pun on findmax & friends.

I am wondering if a package would be the best place for further extensions and experimentation along similar lines.

21 Likes

The issue is that in very many places in Julia we have

f(identity, collection)

is the same as

f(collection)

(actually the second method often calls the first)

This is something users are very used to and breaking this mental pattern is problematic I think. Just some quick examples cut-out from the Julia Base:

reduce(op, A::AbstractArray; kw...) = mapreduce(identity, op, A; kw...)
count(A::AbstractArrayOrBroadcasted; dims=:, init=0) = count(identity, A; dims, init)
_all(a, ::Colon) = _all(identity, a, :)
_any(a, ::Colon) = _any(identity, a, :)
27 Likes

Some more confusing behaviour:

julia> findmin(Dict(2 => 3, 4 => 5))   # added after the discussion below
(3, 2)                              

julia> findmin(identity, Dict(2 => 3, 4 => 5))                   
(2 => 3, 2 => 3)                                                 
                                                                 
julia> findmin(sqrt, Dict(2 => 3, 4 => 5))                       
ERROR: MethodError: no method matching sqrt(::Pair{Int64, Int64})
Closest candidates are: ...

@cmcaine @ColinCaine is that intended (sorry, don’t know which one is correct)? I would have expected this to iterate over the keys of the dict…

One alternative might be to have a keyword argument, say index or return_index, defaulting to true (i.e. the behavior of returning an index, not a value) - and throwing for non-indexable iterables. Then using findmax and friends for non-indexable iterables would require explicitly setting return_index = false.

That would be type unstable, I presume.

5 Likes

findmax() without index is just maximum().

1 Like

This is technically true but (I believe) most likely wouldn’t show up in real code due to constant propagation (you would always assign return_index literally), see e.g.:

julia> f(x; kw) = kw ? 1+x : 1.0+x
f (generic function with 2 methods)

julia> g(x) = f(x; kw=true)
g (generic function with 2 methods)

julia> h(x) = f(x; kw=false)
h (generic function with 1 method)

julia> @code_warntype g(1)
Variables
  #self#::Core.Const(g)
  x::Int64

Body::Int64
1 ─ %1 = (:kw,)::Core.Const((:kw,))
│   %2 = Core.apply_type(Core.NamedTuple, %1)::Core.Const(NamedTuple{(:kw,), T} where T<:Tuple)
│   %3 = Core.tuple(true)::Core.Const((true,))
│   %4 = (%2)(%3)::Core.Const((kw = true,))
│   %5 = Core.kwfunc(Main.f)::Core.Const(var"#f##kw"())
│   %6 = (%5)(%4, Main.f, x)::Int64
└──      return %6

julia> @code_warntype h(1)
Variables
  #self#::Core.Const(h)
  x::Int64

Body::Float64
1 ─ %1 = (:kw,)::Core.Const((:kw,))
│   %2 = Core.apply_type(Core.NamedTuple, %1)::Core.Const(NamedTuple{(:kw,), T} where T<:Tuple)
│   %3 = Core.tuple(false)::Core.Const((false,))
│   %4 = (%2)(%3)::Core.Const((kw = false,))
│   %5 = Core.kwfunc(Main.f)::Core.Const(var"#f##kw"())
│   %6 = (%5)(%4, Main.f, x)::Float64
└──      return %6

julia> @code_warntype h(1.0)
Variables
  #self#::Core.Const(h)
  x::Float64

Body::Float64
1 ─ %1 = (:kw,)::Core.Const((:kw,))
│   %2 = Core.apply_type(Core.NamedTuple, %1)::Core.Const(NamedTuple{(:kw,), T} where T<:Tuple)
│   %3 = Core.tuple(false)::Core.Const((false,))
│   %4 = (%2)(%3)::Core.Const((kw = false,))
│   %5 = Core.kwfunc(Main.f)::Core.Const(var"#f##kw"())
│   %6 = (%5)(%4, Main.f, x)::Float64
└──      return %6

There’s some discussion of relying more upon this kind of constant propagation e.g. for qr (but it seems it might require inlining or @aggressive_constprop to be robust?).

1 Like

Maybe I wasn’t clear: the idea would be to control whether the second element of the returned tuple contains the value x[i] in the array x under which f is optimized (return_index = false) or the index i (return_index = true). The first element of the returned tuple would of course still return f(x[i]) (i.e., it would not just be maximum…).

Might it be less confusing if findmax(f, A) returned 3 things, not two?

julia> v = [2.0, 3.0, -4.0];

julia> findmax(v)
(3.0, 2)

julia> findmax(abs2, v)  # view as  abs2 <| getindex(v,_) <| eachindex(v)
(16.0, -4.0, 3)  # suggestion
7 Likes

But it again breaks expect behaviour findmax(v) == findmax(identity, v)

It is only my opinion, but it really looks like it’s better to introduce another function name for current version of findmax(f, v) instead of tormenting it and users with additional keywords and unexpect return types.

10 Likes

@mcabbott , this was also my suggestion in the GitHub issue.

Oh right, I see your suggestion now.

I agree identity isn’t neutral, but at least it’s much more obvious that something different has happened.

This seems almost as error-prone as the current nightly version of findmax. I.e.:

val, ix = findmax(A)  # fine

val, ix = findmax(f, A)  # no error, but wrong result!
# julia will just swallow the remaining 3rd value
2 Likes

You could return them in the order (value, index, domain) instead, in which case

val, ix = findmax(A)

and

val, ix = findmax(identity, A)

would both produce the same result.

6 Likes

This looks reasonable - to me at least (:

My main suggestion for now is to drop the current nightly findmax(f, A) method, and decide on the proper interface for this functionality after that without any rush. Note that it doesn’t even fundamentally require a positional argument overload and could work the same way as sort does: sort(A, by=f).

3 Likes