How to (efficiently) find the set of maxima of an array?

I want to find the set of maxima of an array. More precisely, I want the set of their indices. For example, from

A = [1 2 3 4 1 2 3 4 1 2 3 4]

I want to obtain something similar to

3-element Vector{Int64}:
4
8
12

I was able to get this result by using

getindex.(findall(A .== maximum(A)),2)

3-element Vector{Int64}:
4
8
12

but I would like to have something more efficient. In my code, this instruction needs to be repeated millions of times. My solution is slow and it allocates too much memory.

If you can give a upper bound of output length, and it’s not large (<= 20 for example), you might try

findmaxs(A::AbstractArray) = begin
    inds = Vector{eltype(keys(A))}(undef, 20)
    maxval = typemin(eltype(A))
    n = 0
    @inbounds for i in keys(A)
        Ai = A[i]
        Ai < maxval && continue
        if Ai > maxval
            maxval = Ai
            n = 1
            inds[n] = i
        else
            inds[n+=1] = i
        end
    end
    resize!(inds, n)
end

For inputs like A = rand(1:1000, 100, 100), this save about 18% time cost on my desktop. (I have to say the space for optimization is quite small). If you accept linear index, replacing keys with eachindex seems better.

3 Likes

Using StaticArrays gives your example a boost:

f1(A) = getindex.(findall(A .== maximum(A)),2)

using StaticArrays
A = SA[1 2 3 4 1 2 3 4 1 2 3 4]
@btime f1($A)   # 52 ns (2 allocations: 192 bytes)

A = [1 2 3 4 1 2 3 4 1 2 3 4]
@btime f1($A)   # 114 ns (4 allocations: 304 bytes)
1 Like

Thanks for your help! Your algorithm works quite well. Since I am looping this instruction millions of times, this really makes a difference.

The problem is I cannot use a static array in my code. The array updates at each iteration, and in each iteration I must search for the set of maxima!

I think this restriction could be lifted as follows:

function findmaxs2(A::AbstractArray)
    inds = eltype(keys(A))[]
    maxval = typemin(eltype(A))
    n = 0
    push!(inds, CartesianIndex(0,0))
    @inbounds for i in keys(A)
        Ai = A[i]
        Ai < maxval && continue
        if Ai > maxval
            maxval = Ai
            inds[1] = i
            n = 1
        else
            n += 1
            insert!(inds, n, i)
        end
    end
    resize!(inds, n)
end

A = rand(1:1000, 100, 100);
findmaxs(A) == findmaxs2(A)  # true

@btime findmaxs($A)    # 6.060 μs (1 allocation: 400 bytes)
@btime findmaxs2($A)   # 6.180 μs (3 allocations: 880 bytes)

You may want to consider preallocating the array if that is being called millions of times. And return the number of elements as well, or a view of the preallocated arrays. Just change @N5N3 solution to:

julia> findmaxs(A::AbstractArray;inds=Vector{eltype(keys(A))}(undef,20)) = begin
           maxval = typemin(eltype(A))
           n = 0
           @inbounds for i in keys(A)
               Ai = A[i]
               Ai < maxval && continue
               if Ai > maxval
                   maxval = Ai
                   n = 1
                   inds[n] = i
               else
                   inds[n+=1] = i
               end
           end
           return @view(inds[1:n])
       end
findmaxs (generic function with 1 method)

julia> A = rand(1:1000, 100, 100);

julia> inds=zeros(eltype(keys(A)),20); # now this can be of any size

julia> @btime findmaxs($A,inds=$inds)
  13.148 μs (0 allocations: 0 bytes)
10-element view(::Vector{CartesianIndex{2}}, 1:10) with eltype CartesianIndex{2}:
 CartesianIndex(21, 7)
 CartesianIndex(98, 23)
 CartesianIndex(63, 37)
 CartesianIndex(24, 49)
 CartesianIndex(54, 60)
 CartesianIndex(55, 62)
 CartesianIndex(19, 75)
 CartesianIndex(52, 76)
 CartesianIndex(63, 95)
 CartesianIndex(21, 99)

This is not necessarily faster for one call of the function, but for a loop, it will be.