How to get the cartesian indexes of the k largest elements of a matrix (2D array) efficiently?

Hi! I would like to find a way to get the cartesian indexes of the k largest elements in a matrix (2D array). Do you have suggestions of a computationally efficient method to achieve that?

Example: for k=3 with the matrix A below, the cartesian indexes would be [(1, 3), (2, 2), (2, 3)].

julia> A = [1 1 3; 1 3 3]
2×3 Matrix{Int64}:
 1  1  3
 1  3  3

Thanks!

What Is the result expected for k=2?

I guess either ties are broken arbitrarily, or the first element is returned (I am interested in both solutions). By first element, I mean the first reached when going through the linear indexing of the matrix A (from 1 to 6 in this example).

Maybe something like

c = Vector{CartesianIndex{2}}(undef,k)
for i in lastindex(M):lastindex(M)-k
    c[i] = CartesianIndex...
end

(Sorry, on the cell phone…)

I am unsure I understood, but this made me think of 1) flatten the array, 2) sortperm to get the largest elements and 3) retrieve the cartesian indexes from the k last elements? This would give something like:

function klargest_indexes(m, k)
    ci = CartesianIndices(size(m))
    p = sortperm(vec(m))[end-k+1:end]
    ci[p]
end
k = 2
m = [1 2 3; 4 5 5; 5 4 3]
out = klargest_indexes(m, k)

In such a case, ties yield the last elements in the linear indexing. Is this what you hinted? Is it computationally efficient? I get a @btime of 5.374 μs (6 allocations: 3.55 KiB) for m = rand(20, 20), k = 2.

I meant probably something like this:

julia> ci = CartesianIndices(size(A))
       for i in lastindex(A)-2:lastindex(A) 
           @show ci[i]
       end
ci[i] = CartesianIndex(2, 2)
ci[i] = CartesianIndex(1, 3)
ci[i] = CartesianIndex(2, 3)


(you can by default run over a matrix as if it was a linearly indexed)

use instead

sortperm(vec(m), 1:k; rev=true)

I think you mean partialsortperm(vec(m), 1:k; rev=true)

4 Likes

Thanks! This is indeed more efficient!

Am I missing something, or this is much more efficient?

julia> A = [1 1 3; 1 3 3]
2×3 Matrix{Int64}:
 1  1  3
 1  3  3

julia> function getci(M,k)
           ci = CartesianIndices(size(M))
           cis = Vector{CartesianIndex{2}}(undef,k)
           j = 0
           for i in lastindex(M)-k+1:lastindex(M)
               j += 1
               cis[j] = ci[i]
           end
           return cis
       end
getci (generic function with 1 method)

julia> @btime getci($A,3)
  32.426 ns (1 allocation: 128 bytes)
3-element Vector{CartesianIndex{2}}:
 CartesianIndex(2, 2)
 CartesianIndex(1, 3)
 CartesianIndex(2, 3)


vs

julia> function klargest_indexes(m, k)
           ci = CartesianIndices(size(m))
           p = partialsortperm(vec(m), 1:k; rev=true)
           ci[p]
       end
klargest_indexes (generic function with 1 method)

julia> @btime klargest_indexes($A,3)
  329.935 ns (6 allocations: 384 bytes)
3-element Vector{CartesianIndex{2}}:
 CartesianIndex(2, 2)
 CartesianIndex(1, 3)
 CartesianIndex(2, 3)

I think the sorting is lacking in the proposed getci function. Correct me if I am wrong but I believe it to yield the cartesian indexes of the k last elements rather than the k largest elements.

1 Like

Ah, I see. In the example you posted those are the same, and I misunderstood your question. Thanks.

(I think you can do that without any sorting, just with an auxiliary array of size k, and running once over the elements of the matrix storing the indexes of the k larger)

2 Likes

There you go:

julia> function getci!(M,k,cmins,vmins) # inplace
           ci = CartesianIndices(size(M))
           for i in 1:k
               cmins[i] = ci[i]
               vmins[i] = M[i]
           end
           imin = findmin(vmins)[2]
           for i in firstindex(M)+k:lastindex(M)
               if M[i] > vmins[imin]
                   cmins[imin] = ci[i]
                   vmins[imin] = M[i]
                   imin = findmin(vmins)[2]
               end
           end
           return cmins, vmins
       end

       function getci(M,k) # allocating
           cmins = Vector{CartesianIndex{2}}(undef,k)
           vmins = Vector{eltype(M)}(undef,k)
           return getci!(M,k,cmins,vmins)
       end
getci (generic function with 1 method)

julia> A = [3 1 1
            1 3 3]
2×3 Matrix{Int64}:
 3  1  1
 1  3  3

julia> @btime getci($A,3)
  78.857 ns (2 allocations: 240 bytes)
(CartesianIndex{2}[CartesianIndex(1, 1), CartesianIndex(2, 2), CartesianIndex(2, 3)], [3, 3, 3])

julia> cmins=Vector{CartesianIndex{2}}(undef,3); vmins=Vector{Int}(undef,3);

julia> @btime getci!($A,3,$cmins,$vmins)
  35.864 ns (0 allocations: 0 bytes)
(CartesianIndex{2}[CartesianIndex(1, 1), CartesianIndex(2, 2), CartesianIndex(2, 3)], [3, 3, 3])

(since here there is an auxiliary array with the associated values, why not return it)

(for rand(20,20) and k=2, an example you mentioned this is 10 times faster than the alternative:

julia> A = rand(20,20); k = 2;

julia> @btime klargest_indexes($A,2)
  5.540 μs (8 allocations: 3.56 KiB)
2-element Vector{CartesianIndex{2}}:
 CartesianIndex(14, 8)
 CartesianIndex(10, 10)

julia> @btime getci($A,2)
  549.492 ns (2 allocations: 208 bytes)
(CartesianIndex{2}[CartesianIndex(10, 10), CartesianIndex(14, 8)], [0.9941985771967297, 0.9992607929792117])

julia> cmins=Vector{CartesianIndex{2}}(undef,k); vmins=Vector{eltype(A)}(undef,k);

julia> @btime getci!($A,2,$cmins,$vmins);
  509.870 ns (0 allocations: 0 bytes)

2 Likes