Hey,
I’m new to the wonderful language Julia, coming from JavaScript and Ruby.
I have the following code:
list = []
for x = -2:2, y = -2:2
    push!(list, (x, y, x * y))
end
Now I want to find the element/tuple that has the highest third value. In Ruby I would use max_by. Is there something similar that would return (2, 2, 4)?
In fact, maximum(list) (Base.maximum) returns that, but I can’t figure out by which criterion. Is it the sum of the tuple?
Thank you so much!
By default, tuples are compared lexicographically: the first element is the most significant, and end-of-tuple is smaller than all possible elements (so a tuple that is a prefix of a longer one sorts before it).
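To make the comparison rule concrete, here is a small sketch using isless, which is what max (and therefore maximum) relies on for generic values. The name biggest is only for illustration.

```julia
# Tuples compare lexicographically: the first differing position decides.
@show isless((0, 9, 9), (1, 0, 0))   # first elements differ: 0 < 1
@show isless((1, 2, 3), (1, 2, 4))   # tie on the first two, the third decides
@show isless((1,), (1, 0))           # a prefix sorts before the longer tuple

# The list from the question.
list = []
for x = -2:2, y = -2:2
    push!(list, (x, y, x * y))
end

# (2, 2, 4) wins because x = 2 is the largest first element and,
# among those tuples, y = 2 is the largest second element.
biggest = maximum(list)
@show biggest
```

So the result has nothing to do with the sum or the third value; it only happens to coincide with the tuple the question was after.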
Unfortunately, maximum does not accept a by keyword. You can do, e.g., the following (two lines instead of one):
julia> r=[(1,2,3), (0,5,6), (1,2,4)]
julia> reduce(r) do x,y
x[3]>y[3] ? x : y end
(0, 5, 6)
Note that this has different corner-case semantics than the built-in maximum (how are NaN handled? What about missing? In case of elements that compare equal, which one is returned?).
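A rough sketch of two of those corner cases, using the list from the question. I use foldl here instead of reduce only because foldl guarantees left-to-right evaluation, which makes the tie behaviour easy to read off; the variable names are made up for the example.

```julia
list = [(x, y, x * y) for x in -2:2 for y in -2:2]

# Two tuples share the largest third value: (-2, -2, 4) comes first, (2, 2, 4) last.
keep_last  = foldl((x, y) -> x[3] >  y[3] ? x : y, list)  # on a tie, take the newer element
keep_first = foldl((x, y) -> x[3] >= y[3] ? x : y, list)  # on a tie, keep the older element
@show keep_last keep_first

# NaN: every comparison with NaN is false, so the `>` version silently
# moves past it and can even lose the true maximum (1.0 here).
v = [(1, 1, 1.0), (2, 2, NaN), (3, 3, 0.5)]
folded  = foldl((x, y) -> x[3] > y[3] ? x : y, v)
builtin = maximum(last, v)   # the built-in propagates NaN instead
@show folded builtin
```

Which behaviour you want depends on the data, so it is worth deciding explicitly between > and >= and checking for NaN or missing beforehand.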
Even though maximum has no by keyword, partialsort does (it is used to sort only a few entries; for example, partialsort(v, 1:3) finds the three smallest entries of v), so you could do:
partialsort(list, 1, by = t -> t[3], rev = true)
since partialsort(v, 1, rev = true) gives the same result as maximum(v). But I agree it’d be nicer to just be able to do maximum(list, by = t -> t[3]).
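For reference, a minimal sketch of how that looks on the list from the question, next to the equivalent full sort. The names top and top_sorted are just for the example, and I am not relying on which of the tied elements comes out first.

```julia
list = [(x, y, x * y) for x in -2:2 for y in -2:2]

# partialsort(v, k) gives what sort(v)[k] would, without sorting everything.
smallest3 = partialsort([5, 1, 4, 2, 3], 1:3)
@show smallest3

# With by and rev, position 1 holds an element with the largest third value.
top = partialsort(list, 1, by = t -> t[3], rev = true)

# A full sort finds an element with the same key, just with more work.
top_sorted = sort(list, by = t -> t[3], rev = true)[1]
@show top top_sorted
```

Unlike maximum(last, list), this returns the whole tuple rather than just the key.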
I was very surprised to find that findmax and argmax do not support a function as first input. Is this a deliberate choice, or is it just not implemented?
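One workaround that needs no function argument is to compute the keys first and let findmax return the index. This is only a sketch; keys3, val, idx and best are names I made up. As far as I know, later Julia releases (1.7 onward) added findmax(f, itr) and argmax(f, itr) methods, but the version below avoids them so it should also run on older versions.

```julia
list = [(x, y, x * y) for x in -2:2 for y in -2:2]

# Compute the keys first, then ask findmax for the largest key and its index.
keys3 = map(last, list)
val, idx = findmax(keys3)

# findmax reports the first index on a tie, so this picks (-2, -2, 4).
best = list[idx]
@show val idx best
```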
Thanks so much for so many suggestions! Such an active and friendly community!
I have two remaining questions:
@piever, you said that partialsort is just for a few entries. Is it because of performance? I’ll be having an array with over a thousand entries. Are the other suggestions better then?
Two suggestions used maximum(method, list). The documentation says the first of the two arguments is A::AbstractArray. How is this method applied to the list as a parameter of maximum? Or what is this concept called, so that I can look it up?
You can list all methods of the maximum function like so:
julia> methods(maximum)
# 11 methods for generic function "maximum":
[1] maximum(s::BitSet) in Base at bitset.jl:417
[2] maximum(r::AbstractUnitRange) in Base at range.jl:572
[3] maximum(r::AbstractRange) in Base at range.jl:574
[4] maximum(B::BitArray) in Base at bitarray.jl:1650
[5] maximum(x::SparseArrays.AbstractSparseArray{T,Ti,1} where Ti) where T<:Real in SparseArrays at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/SparseArrays/src/sparsevector.jl:1307
[6] maximum(a::AbstractArray; dims) in Base at reducedim.jl:648
[7] maximum(::typeof(abs), x::SparseArrays.AbstractSparseArray{Tv,Ti,1} where Ti where Tv) in SparseArrays at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/SparseArrays/src/sparsevector.jl:1329
[8] maximum(::typeof(abs2), x::SparseArrays.AbstractSparseArray{Tv,Ti,1} where Ti where Tv) in SparseArrays at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/SparseArrays/src/sparsevector.jl:1329
[9] maximum(a) in Base at reduce.jl:487
[10] maximum(f, a::AbstractArray; dims) in Base at reducedim.jl:649
[11] maximum(f, a) in Base at reduce.jl:470
This might help you discover specific methods which you didn’t know about (in this case, methods 10 & 11 are what you’re looking for). Now, if you somehow get a piece of code that works (such as mbauman’s answer above), and would like to understand which method it uses, then:
julia> @which maximum(reverse, list)
maximum(f, a::AbstractArray) in Base at reducedim.jl:649
So this tells you that in this method the second argument is the array; the first argument is not restricted to any type (but in practice it should be callable). The function comes first to allow for the special do syntax, which always passes the block as the first argument of the call.
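As a small illustration of that last point, the two calls below should dispatch to the same maximum(f, a) method; the do block is just another way of writing the anonymous function. The names by_lambda and by_do are only for the example.

```julia
list = [(x, y, x * y) for x in -2:2 for y in -2:2]

# Function passed explicitly as the first argument.
by_lambda = maximum(t -> t[3], list)

# Same call with do syntax: the block becomes the first argument.
by_do = maximum(list) do t
    t[3]
end

@show by_lambda by_do
```

Note that maximum(f, a) returns the largest value of f, here 4, not the tuple that produced it.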
I simply meant that, while partialsort(v, n) should give the same result as sort(v)[n], it is generally faster because you don’t need to sort the whole array. Anyway, if the number of elements is in the thousands, all sensible solutions should take almost no time (I would imagine less than a millisecond).
Maybe this is just an issue with your example, but if you care about performance at all, you should not use an untyped array (such as list, which is created with []):
julia> using BenchmarkTools
julia> arr = [(x, y, x*y) for x in -2:2 for y in -2:2]
julia> @btime maximum(last, $list)
1.480 μs (0 allocations: 0 bytes)
julia> @btime maximum(last, $arr)
17.783 ns (0 allocations: 0 bytes)
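To see where the difference comes from, you can inspect the element types. This is a sketch; typed is a hypothetical name for a loop-built variant that keeps the concrete element type.

```julia
# Untyped: [] creates a Vector{Any}, so every element access is dynamic.
list = []
for x = -2:2, y = -2:2
    push!(list, (x, y, x * y))
end

# The comprehension picks a concrete element type from its contents.
arr = [(x, y, x * y) for x in -2:2 for y in -2:2]

# If you prefer the loop, give the empty array an element type up front.
typed = Tuple{Int,Int,Int}[]
for x = -2:2, y = -2:2
    push!(typed, (x, y, x * y))
end

@show eltype(list) eltype(arr) eltype(typed)
```

All three hold the same 25 tuples; only the declared element type differs, and that is what the compiler can exploit.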