I have been trying to squeeze out as much performance as possible out from the following function, which finds the n largest (or maximum) values of a matrix and returns those values and their positions in the matrix, however I do have to note that in the specific case in which I am calling the function I really don’t need the values, just the indices:

function maximums(matrix, n)::Tuple{Vector{Int64}, Vector{CartesianIndex{2}}}
type = eltype(matrix)
vals = zeros(type, n)
indices = Array{Int64}(undef,n)
arr = Array{Tuple{type,CartesianIndex}}(undef, n)
@inbounds for i ∈ axes(matrix, 1), j ∈ axes(matrix, 2)
smallest, index = findmin(vals)
if matrix[i, j] > smallest
arr[index] = matrix[i, j], CartesianIndex(i, j)
vals[index] = matrix[i, j]
end
end
arr = sort(arr, by=x -> x[1], rev=true)
vals = [x[1] for x in arr]
indices = [x[2] for x in arr]
return vals, indices
end

I have already profiled it using @code_warntype and pprof and I can’t seem to find any possible room for improvement. What can I do to make it run faster? Or does my logic need a rewrite from the ground up?

I call this function thousands of times in my main function so I would really appreciate any speed up.

Using a better algorithm is of course the best choice, but if you just wanted to tweak your original code, then simply moving the line

if matrix[i, j] > smallest
smallest, index = findmin(vals) # <-- move it here
arr[index] = matrix[i, j], CartesianIndex(i, j)
vals[index] = matrix[i, j]
end

is already a significant speedup. You need to initialize smallest and index outside the main loop, btw.

using DataStructures
function Nlargest(v,N)
maxn = heapify!(tuple.(v[1:N],1:N))
maxn1=maxn[1]
for i in N+1:length(v)
e=(v[i],i)
if maxn1[1] < e[1]
heappop!(maxn)
heappush!(maxn,e)
maxn1=maxn[1]
end
end
#sort!(maxn,rev=true)
maxn
end

Thank you everyone, here is my final working code:

function maximums2(M, n)
v = vec(M)
l = length(v)
ix = [1:l;]
partialsortperm!(ix, v, (l-n+1):l, initialized=true)
vals = v[ix[(l-n+1):l]]
indices = CartesianIndices(M)[ix[(l-n+1):l]]
return vals, indices
end