Maximum by element in array

Hey,
I’m new to the wonderful language Julia, coming from JavaScript and Ruby.
I have the following code:

list = []
for x = -2:2, y = -2:2
  push!(list, (x, y, x * y))
end

Now I want to find the element/tuple that has the highest third value. In Ruby I would use max_by. Is there something similar that would return (2, 2, 4)?
In fact maximum(list) (Base.maximum) returns that but I can’t figure out by which criteria. Is it the sum of the tuple?
Thank you so much!

In default tuple comparison, the first element is most significant, and end-of-tuple is smaller than all possible elements.

Unfortunately maximum does not accept a by-keyword. You can do e.g. the following (2 lines instead of one)

julia> r=[(1,2,3), (0,5,6), (1,2,4)]
julia> reduce(r) do x,y
       x[3]>y[3] ? x : y end
(0, 5, 6)

Note that this has different corner-case semantics than the built-in maximum (how are NaN handled? What about missing? In case of elements that compare equal, which one is returned?).

2 Likes
2 Likes

You could use the ability to transform each element to make this work (and then transform it back again afterwards):

julia> reverse(maximum(reverse, list))
(2, 2, 4)

Of course, this only works for this one particular example because there exists this sort of reversible transformation that sorts how you want.

Even though maximum has no by keyword, partialsort (used to sort a few entries, for example partialsort(t, 1:3) would find the three smallest entries) does, so you could do:

partialsort(list, 1, by = t -> t[3], rev = true)

as partialsort(t, 1, rev = true) is the same as maximum, but I agree it’d be nicer to just be able to do maximum(list, by = t -> t[3]).

2 Likes

Since you can do

julia> maximum(last, list)
4

is was going to say, just use

(_, ind) = findmax(last, list)
list[ind]

or

ind = argmax(last, list)
list[ind]

I was very surprised to find that findmax and argmax do not support a function as first input. Is this a deliberate choice, or is it just not implemented?

3 Likes

Just not implemented; if you read the issue I linked, there is general support for this kind of feature. Someone just has to do it.

2 Likes

Thanks so much for so many suggestions! Such an active and friendly community!

I have two remaining questions:

  1. @piever, you said that partialsort is just for few entries. Is it because of performance? I’ll be having an array with over a thousand entries. Are the other suggestions better then?

  2. Two suggestions used maximum(~method~, list). The documention says the first argument of two is an A::AbstractArray. How is this method applied to the list a parameter of maximum? Or how is this concept called so that I could look that up?

You can list all methods of the maximum function like so:

julia> methods(maximum)
# 11 methods for generic function "maximum":
[1] maximum(s::BitSet) in Base at bitset.jl:417
[2] maximum(r::AbstractUnitRange) in Base at range.jl:572
[3] maximum(r::AbstractRange) in Base at range.jl:574
[4] maximum(B::BitArray) in Base at bitarray.jl:1650
[5] maximum(x::SparseArrays.AbstractSparseArray{T,Ti,1} where Ti) where T<:Real in SparseArrays at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/SparseArrays/src/sparsevector.jl:1307
[6] maximum(a::AbstractArray; dims) in Base at reducedim.jl:648
[7] maximum(::typeof(abs), x::SparseArrays.AbstractSparseArray{Tv,Ti,1} where Ti where Tv) in SparseArrays at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/SparseArrays/src/sparsevector.jl:1329
[8] maximum(::typeof(abs2), x::SparseArrays.AbstractSparseArray{Tv,Ti,1} where Ti where Tv) in SparseArrays at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/SparseArrays/src/sparsevector.jl:1329
[9] maximum(a) in Base at reduce.jl:487
[10] maximum(f, a::AbstractArray; dims) in Base at reducedim.jl:649
[11] maximum(f, a) in Base at reduce.jl:470

This might help you discover specific methods which you didn’t know about (in this case, methods 10 & 11 are what you’re looking for). Now, if you somehow get a piece of code that works (such as mbauman's answer above), and would like to understand which method it uses, then:

julia> @which maximum(reverse, list)
maximum(f, a::AbstractArray) in Base at reducedim.jl:649

So this tells you that in this method, the second argument is the array; the first argument is not restricted to any type (but in practice should be callable). This is to allow for the special do syntax, which always acts on the first argument of the method.

Does that help?

1 Like

Note that a simple and fast option is always just to write a loop. There is nothing magical about standard-library functions.

4 Likes

I simply meant that, while partialsort(v, n) should give the same result as sort(v)[n], it is generally faster because you don’t need to sort the whole array. Anyway, if the number of elements is in the thousands, all sensible solutions should take almost no time (I would imagine less than a millisecond).

1 Like

Maybe this is just an issue with your example, but if you care about performance at all, you should not use an untyped array (list):

julia> arr = [(x, y, x*y) for x in -2:2 for y in -2:2]

julia> @btime maximum(last, $list)
  1.480 μs (0 allocations: 0 bytes)

julia> @btime maximum(last, $arr)
  17.783 ns (0 allocations: 0 bytes)
2 Likes

Oh, wow. That is really helpful! Thank you so much!