`findmax` and friends: confusing behaviour to be introduced in 1.7

Is there any other place in julia where arrays and functions are treated equivalently?

To my knowledge, no. And I’ll emphasise that they aren’t here either: we don’t define findmax(f::Function).

The PR lets you define findmax(x) as returning a tuple of a value in the codomain of x and domain of x, and defines a method for indexable collections because the only two reasonable definitions of the domain of an indexable collection are “its keys” and “nothing”'; and if we assume “nothing”, then that’s less useful.

I acknowledge that it’s a little unusual (and we talked about it a bit in the PR), but the rules are simple and well described by the docstrings and the solution we came to in the PR is the only one that I could think of that satisfies all of:

  • makes sense
  • maintains relationships between min, minimum, findmin, argmin for all variants and arguments
  • doesn’t introduce new functions

It’s quite possible that someone will come up with something better that fits the same requirements, or a better solution that loosens one of those requirements. I think there’s still time to revert my PR and possibly land a new one, if so.

The PR lets you define findmax(x) as returning a tuple of a value in the codomain of x and domain of x, and defines a method for indexable collections because the only two reasonable definitions of the domain of an indexable collection are “its keys” and “nothing”’; and if we assume “nothing”, then that’s less useful.

But you cannot define findmax(::Array) since it’s type piracy, right? And undoubtedly, findmax over arrays is a very obvious and frequent usage.

1 Like

Sure, I get that. I prefer the names and implementations as they are in 1.7, but I think it is okay if the balance of opinion is that this is too confusing and so something else has to be done.

I don’t understand why you would want to? The existing definitions of findmax for arrays gives a good definition.

I had expected

findmax(f, xs) 

to be a fast version of

findmax(f.(xs))

and since getting the actual index will be difficult, the latter is what will be used, hurting my eyes.

Not too pleased with the planned design.

12 Likes

To add: I like the idea of extending base functions to work with general iterables, and I am wishing that many packages would follow suit. However, if one takes the view that findmax returns indexes (as it always has), it isn’t clear that it should work for general iterables (which aren’t guaranteed to support getindex). This is more support in favor of making the new behavior live in a different function.

12 Likes

This may be a useful function to have, but it seems completely independent in its underlying meaning from the current findmax(A). Currently, the latter is described as:

Return the maximum element of the collection itr and its index or key.

I strongly believe that the natural generalization - if there is any - should follow lines similar to findfirst (and others):

findfirst(A)
Return the index or key of the first true value in A.

findfirst(predicate::Function, A)
Return the index or key of the first element of A for which predicate returns true.

That is, the second method is also interpreted in terms of a collection and finding its element subject to some function.
It’s a sure bet that many more people familiar with existing julia functions would expect findmax(f, A) (if it exists) to also return an index into A - as findmax(A) does. The domain/codomain interpretation seems kinda niche, more specific to mathematical optimization than to general array operations.

findmax(f, A) == findmax(map(f, A)) ticks all of these bullets as well, doesn’t it?
Additionally, it is consistent with other julia base functions, as pointed out multiple times in this thread.

5 Likes

@ColinCaine I just want to thank you for the PR despite the frustration. It’s a nontrivial amount of unpaid labor in a hopelessly tangled thicket. You mentioned this desiderata:

  • doesn’t introduce new functions

Here’s my opinion: longterm this whole corner of Julia needs to be deprecated, thrown to the bottom of the ocean and replaced with unambiguous function names. For instance: minind, minelt, firstind, firstelt. Then argmin is fine if we have these others (returns whatever went into the function application so f(argmin(f, itr)) = minelt(f, itr)). Otherwise we’ll be ?-ing them every. single. time. The best thing we can do is introduce new unambiguous names now.

4 Likes

From reading the reactions, I feel it’d be reasonable to revert the PR. We’ll gain consistency at the cost of generality. On one hand, with cmcaine’s PR, we have findmax(f, ::Channel) etc. and it still is a generalization of unary findmax because findmax(xs) == findmax(i -> xs[i], eachindex(xs)) (in the sense that one can implemented in terms of the other [1]). We also can access index/key by using findmax(f ∘ last, pairs(xs)). On the other hand, I find that this argument:

is compelling.

See also the new comment in GitHub:

triage thinks 1 arg argmax is correct, but findmin with a function is broken and needs to be fixed before 1.7.

https://github.com/JuliaLang/julia/issues/39203#issuecomment-849898668



  1. Further “generalizing” this discussion, maybe the main lesson here (for me) is that this definition of generalization was too weak for constructing a good API. ↩︎

10 Likes

It’s great to see a more “official” reaction to this!
I personally have trouble understanding what this means specifically:

Like, what does this discussion have to do with 1-arg argmax at all? It’s definition is effectively set in stone until julia 2 anyway.
Anyone can clarify?.. @Oscar_Smith?

I just hope this doesn’t end up with keeping argmax(f, A) as it is defined in nightly, and changing findmax(f, A) == findmax(map(f, A)) at the same time. It would be an improvement, of course, because one of the underlying issues is solved:

However, such change would still keep these functions inconsistent: findmax(A)[2] == argmax(A) but findmax(f, A)[2] !== argmax(f, A), and argmax(identity, A) != argmax(A).

1 Like

As I understand it argmax(f,A) as defined in nightly is just the mathematical definition. Is this wrong? The reason 1 arg argmax was brought up is that since the 2 arg version is mathematically correct, if there is inconsistency, it would be the 1 arg version that’s inconsistent.

The reason argmax(identity, A) != argmax(A) is true is because in cases where you want this behavior from argmax(A) you can just use maximum instead, and it is conceptually really useful in many cases to interpret an Array as a function who’s range is the values, and who’s domain is the indices. One possible solution for Julia 2.0 is to deprecate argmax(A) since it has 2 somewhat contradictory interpretations, and both of them can be written relatively simply in ways that avoid the confusion.

1 Like

In a vacuum, this would clearly be true.
But consistency arguments here in this thread apply to argmax as well : it’s a reduction, and reductions in base julia obey r(identity, A) == r(A) if r(f, A) is defined. I started this topic with findmax just because this is the function I use most often.
I’m not aware of any argmax overloads in optimization packages, which practically shows that conflating these two meanings (array ops and mathematical optimization) doesn’t look natural.

There is a function that returns the index of the maximal element in an array. This function is called argmax in julia. Is it a case of unfortunate naming? Yes. But this is the only meaning currently assigned to argmax.
If argmax(f, A) definition is confusing - and looks like it is, the math and array style definitions conflict - maybe it just shouldn’t be defined at all? MethodError would be much better than confusion like

A = -10:5
ix = argmax(A)
...
A[ix]

# vs

ix = argmax(abs, A)
...
A[ix]  # may even work, but give unexpected results

maximum doesn’t return the index into array, which is the whole point of argmax now.

4 Likes

I second @aplavin 's concerns here. It’s clear that the definition that is proposed now is mathematically correct but it pretty much not in line with everything else that is already in place. As much as I do support the change of definitions, I think that this is material for v2.0. (it breaks not only code (potentially), but also developers assumptions about what these functions do (as evidenced by this thread))

For the mean time: What about introducing a set of functions generalizes_foo where foo is any of the members of the offensing function families (arg..., find..., more?), that is consistent with the definition that is introduced by @ColinCaine 's PR? (or maybe have homonymous functions live inside a submodule like GeneralizedReductions). This would allow us to keep the PR, introduce more correct versions of those functions while being clear that they are different from the usual computer science puns, people could depend on them and Julia v2.0 would simply need a change of names later. ?

(Sidenote: That’s a great discussion to have and a great PR nontheless, thanks y’all)

The problem is isn’t just consistency. It’s the name. Findmax could mean any of the the things proposed and many more. Therefore we should throw away this toxic name. I know it’s hard to swallow that much code breaking but it’s terrible. Look at all this discourse. We’re like college students arguing about Free Will.

Haven’t you read Ursula K. Le Guinn? Find the true names for the functions we need. Deprecate findmax and wait till 2.0. What is this monstrosity doing in base at 1.7?

Sounds like a perfect usecase for a package! It would even allow these functions to have their mathematically-expected names without worrying about conflicts: just call them in a qualified way, Generalized.argmax.

2 Likes

Yes, a lot of these utilities could live perfectly well in a package, except those which are used by Base and the standard libraries.

Incidentally, I also don’t get the reason for all those foo(f, itr) methods when foo(Iterators.map(f, itr)) and similar sould do just as well.

1 Like

Note that you can’t do sum(Iterators.map(abs2, randn(3,4)), dims=1), but you can do sum(abs2, randn(3,4), dims=1). (But this doesn’t apply to findmax and friends.)

I think you can do this with eg

and

I am not sure there is any real reason for named variants of reduce to take function and slicing arguments at this point, they may just be historical remnants.

That’s pretty verbose, and not very discoverable. I don’t think many people would discover or use this.

Having to know about packages like LazyArrays or JuliennedArrays raises the bar even further. Having a terse and trivially discoverable way of doing this is important, imho.

6 Likes

foo(f, itr) also allows you to use the do syntax sugar.

1 Like