Extend the domain of maximum and minimum to include empty collection?

No, that is not weird; at least if you are a mathematican!
Of course if itr is not empty, that would be, but not if itr is empty. Reason is, that max(-Inf, a) == a and min(Inf, a) == a for all a real a. You could also say "-Inf is the neutral element of the operator max on real types.
The following sentence should be easily acceptable:
“If you remove a number from a collection, the collection’s maximum should not grow.”

From this you can logically derive, that maximum(empty) == -Inf as you see here:
Let a::T be a value of type T. T should have a total ordering and < properly defined.
If you remove a from Set([a]) you get Set{T}[]. Then, by the accepted condition, maximum(Set{T}[]) <= maximum(Set[a]) == a for every a, from that we conclude maximum(Set{T}) <= a for all a of the Type T. So maximum(Set{T}) must be a lower bound of T. For numerical types the only defined lower bound is -Inf.
The same argument leads to minimum(Set{T}) == +Inf`.

The second question is if maximum(X) ∈ X is necessary. Again, for non-empty totally ordered finite sets X that is true. The general case needs generalization. For example, the set of all rational numbers, whose square is below 2 does not contain its maximum, because sqrt(2) is not rational. And maximum({}) is not contained in {}, because {} does not contain any element.

There is no contradiction between ‘mathematically correct’ and ‘weird’. I do understand your argument, but maximum(itr) < minimum(itr) is seriously weird, and is likely to create some big surprises.

1 Like

That may be true, but why not accept weird results as a part of reality? I say that, as I am a mathematician. In my opinion mathematics lives amongst other things from counter-examples and surprises at the first sight.
So you pledge for “it does not make sense to define maximum for non-empty iterators, because some people could be surprised by the correct result”?

BTW, I was extremely surprised, when my call to maximum(empty) did not return -Inf. So the seriousity level of weirdness seems extremely subjective.

maximum(itr) does explicitly return “the largest element of itr”, so the docstring would’ve helped there.
Also I’d expect the supremum of {} to be -Inf, but not the maximum. But that’s just from what I remember from a couple of years ago :wink:

2 Likes

My 2¢: Just like parse and tryparse are two versions of parsing which throw and not-throw an exception respectively, what we want here is two versions of maximum (resp. minimum).

Luckily, even mathematicians understand the difference and have two definitions for these: maximum and supremum. The maximum would throw on an empty collection and always return an element from the collection and supremum would always return a least upper bound or -Inf (or typemin).

infimum is to minimum like supremum to maximum.

Also we now have DNF’s asserions: maximum(itr) >= minimum(itr) always, and maximum(itr) ∈ collect(itr). Additionally, we have: Post maximum/minimum the collection is known to be non-empty (unless exception thrown), infimum/supremum never throw, and for each x ∈ collect(itr) it is true that x <= supremum(itr).

1 Like

That is cutting the Gordian knot! I did not dare to propose the introduction of two new general functions into Base!
But I would prefer to use infimum and supremum in place of minimum and maximum.
Should we also ask for lowerbound and upperbound for min and max for the case of partially orders types; or just re-use the new function names?

Is an array even a set? Not as far as I can find: “A set is a well-defined collection of distinct objects.” Is that a controversial definition?

infimum and supremum are fine, but should not replace maximum and minimum

But, anyway: is it more reasonable for maximum to throw, or to return a Nullable? Is this the sort of thing Nullables are for?

No, any Array is not a Set, but both are iterables. The argument of maximum is an iterable itr.
One of the examples uses eltype(itr) == Set{T}, which is another topic.

infimum and supremum should not replace minimum and maximum, but they should extend the definition area (allowed argument values); actually only the empty iterator is added. As it seems to be difficult to extend the definition area of maximum or minimum, because of unexpected results in the extension area, the proposal to use the new names, which are mathematically better justified.

Returning a Nullable is not a valid option for me. I would rather use supremum and work with the result (potentially -Inf) instead of having to investigate the state of the nullable object each time.
Imagine max(supremum(A), supremum(B)) for example, and compare with
a1 = maximum(A); a2 = isnull(a1) ? get(a1) : -Inf; b1 = maximum(B); b2 = isnull(a1) ? get(b1) : -Inf; max(a2, b2)

I’m not used to working with Nullables, but I think

max(get(maximum(A), -Inf), get(maximum(B), -Inf))

might have worked.

You could also just define supremum(x) = isempty(x) ? typemin(eltype(x)) : maximum(x) or similar without necessitating any changes to Base.

1 Like

This was in the hypothetical scenario where maximum returned a Nullable. And I was wondering/asking if that would be the ‘correct’ way for maximum to work.

You are right, that would be a less ugly solution - I did not think of the default-value argument of get(::Nullable).

Yes, I could do that; and indeed that is what I did, after I recognized the behaviour of maximum applied to empty iterables. I opened this topic, because I thought it worthwhile to improve the language in a way, which makes such small work-arounds unnecessary.

What inspired me to propose the modification of two the Base.mr_empty methods was the kind of implementation of maximum. It calls mapreduce(identity, Base.scalarmax, itr), which in turn calls mr_empty(identity, scalarmax, eltype(itr)).
So the mechanism is already in place in Base providing return values for empty iterators.

Actually, if the first argument of mapreduce is abs or abs2, the default value is zero of the appropriate type.
There is another method maximum(f::Function, itr), which is implemented in Baseas well, which calls mapreduce(f, max, itr).
So maximum(abs, itr) does exactly what I expect (it delivers the greatest lower bound of all numbers >= 0), when itr is empty.
At the same time maximum(abs.(itr)), which is equivalent to the former, if itr is not empty, throws an exception. So my approach to extend the domain of maximum to empty iterators (actually only one value is added to the domain) seemed to be natural to me. It would not break existing code, which does not rely on the exception to be thrown.
(And I cannot image a programmer who relies on the exception in order to figure out, if his iterable is empty.)

1 Like

Let me give a use case:
There is a collection of classroom-courses with pupils of different grades. In order to elect the best-of-school pupil, the following procedure is defined: From each course select the pupil with the best marks. Add a bias according to the grade and select pupil with best sum of marks and bias. We assume better marks are represented by floating point numbers int he range between -1 and 1, the bias values between -0.2 and 0.2 only depend on the grade of the course and are based on statistics of the school principal.

Implementation:

struct Pupil
           marks::Float64
           gender::Symbol
end
struct Course
           bias::Float64
           pupils::Vector{Pupil}
 end

courses = [Course(0.2, [Pupil(0.7, :male), Pupil(0.8,:female)]), Course(-0.2, [Pupil(0.8, :male), Pupil(0.7,:male)])];
marks(p::Pupil) = p.marks
bestofclass(c::Course, gen::Symbol) = maximum(marks.(filter(p->gen == :all || p.gender == gen, c.pupils)))
bestofschool(cs::Vector{Course}, gen = :all) = maximum(c->bestofclass(c, gen) + c.bias, cs)

julia> bestofschool(courses)
1.0

julia> bestofschool(courses, :male)
0.8999999999999999


julia> bestofschool(courses, :female)
ERROR: ArgumentError: reducing over an empty collection is not allowed
Stacktrace:
 [1] _mapreduce(::Base.#identity, ::Base.#scalarmax, ::IndexLinear, ::Array{Float64,1}) at ./reduce.jl:265
 [2] bestofclass(::Course, ::Symbol) at ./REPL[9]:1
 [3] _mapreduce(::##7#8{Symbol}, ::Base.#scalarmax, ::IndexLinear, ::Array{Course,1}) at ./reduce.jl:273
 [4] bestofschool(::Array{Course,1}, ::Symbol) at ./REPL[10]:1
 [5] macro expansion at ./REPL.jl:97 [inlined]
 [6] (::Base.REPL.##1#2{Base.REPL.REPLBackend})() at ./event.jl:73

The last run fails, because there are no female pupils in one of the courses.
See the results after with modified function:

julia> bestofschool(courses)
1.0
julia> bestofschool(courses, :male)
0.8999999999999999
julia> bestofschool(courses, :female)
1.0

There seems some resistance and some technical difficulties to extend the domain of maximum or minimum to include empty collections.
Conforming to a proposal

and

I would propose to

  • document the current behaviour of maximum and minimum and leave it as it is.
  • implement new functions supremum and infimum, which behave equivalent for non-empty collections of totally ordered types, but cover the other cases as well.

Proposal for the extension of the documentation for maximum:

help?> maximum
  maximum(itr)

  Returns the largest element in a `non-empty` collection.

  If itr is empty, throw exception of type ArgumentError.

  julia> maximum(-20.5:10)
  9.5
  julia> maximum([1,2,3])
  3
  `julia> maximum(UInt[])
  ERROR: ArgumentError: reducing over an empty collection is not allowed
`
julia> 

  maximum(A, dims)

  Compute the maximum value of an array over the given dimensions. See also the max(a,b) function
  to take the maximum of two or more arguments, which can be applied elementwise to arrays via
  max.(a,b).

...

 `maximum(f::Callable, a)`

  Returns the largest value of f.(a) or zero for a collection a.

  The collection must generally not be empty, with few exceptions.

  If a is not empty, return the largest value of f.(a). That is equal to maximum(f.(a)) but
  execution is more efficient.
  If a is empty and f is one of Base.abs or Base.abs2, return the zero value of the element type
  of a.

  If a is empty or has no element type or no zero of the element type is available, throw
  exception.

  julia> maximum(sin, [2pi, pi/2])
  1.0
  
  julia> maximum(abs2, Float64[])
  0.0
  
  julia> maximum(abs2, UInt[])
  0x0000000000000000
  
  julia> maximum(x->x^2, UInt[])
  ERROR: ArgumentError: reducing over an empty collection is not allowed
  
  julia> maximum(sin, Float64[])
  ERROR: ArgumentError: reducing over an empty collection is not allowed
  
  julia> maximum(abs, x for x in 1:0)
  ERROR: ArgumentError: reducing over an empty collection is not allowed
  
  julia> struct Bogus end
  julia> Base.zero(::Type{Bogus}) = 0.0
  julia> maximum(abs, Bogus[])
  ERROR: TypeError: _mapreduce: in typeassert, expected Bogus, got Float64
  
  julia> struct Bogus2 end
  julia> maximum(abs, Bogus2[])
  ERROR: MethodError: no method matching zero(::Type{Bogus2})


So, has there been an official decision made on this?

I’m generally in agreement with @klacru, and was surprised when maximum and minimum of empty AbstractFloat vectors didn’t return -Inf and Inf (resp.).

Without an issue or a PR, most questions like this just don’t get an “official” decision.

I guess that minimum and maximum are not defined for empty collections could be mentioned in their docstings, but the error message is already informative.

1 Like