The quasi-best MaxN function

I have not found among the library functions a function that finds the first N maximum values of a list. I tried to define some functions and make performance comparisons. I found two kinds of problems: one was to find an effective algorithm; a second how to measure the performance between different algorithms. I would like to know if there is a similar function in some package or if some algorithm is known that works “better” than others. I would like to have information on how, with the same algorithms proposed, I can improve performance. Since one of the algorithms “consumes” the list to which it applies and this can change the result of @btime I had to insert an item that copies the input vector in the maxN3 function, which would not be strictly necessary for the function’s task .
How could I compare performance on equal terms, or alternatively, how could I change the function to have the same algorithm but that doesn’t “consume” the list. Then, for example, replace pop! with other functions, but without penalizing too much performance.

r=rand(1:10^9, 10^7)


maxN1(s,N,lst=[]) = N==0 ? lst : maxN1(setdiff(s,maximum(s)),N-1,push!(lst,maximum(s)))
maxN2(s,N)= sort(s,rev=true)[1:N]
function MaxN3(cr,N)
    maxn=Int[]
    r=copy(cr)
    for x in 1:N
        p=pop!(r)
        for (i,e) in enumerate(r)
            if p < e
                r[i],p=p,r[i]
            end
        end
        push!(maxn,p)
    end
    maxn
end

See partialsort.

4 Likes

Thanks. Based on your indication I found this topic which seems to refer to the library function in question.

I tried to use it in order to obtain the first 5 largest values of a list and I made comparisons between different functions including one derived from that of the base library.

function MaxN4(cr,N)
    maxn=Int[]
    r=copy(cr)
    for x in 1:N
        p=popfirst!(r)
        for (i,e) in enumerate(r)
            if p < e
                r[i],p = p,e
            end
        end
        push!(maxn,p)
    end
    maxn
end

maxN2(s,N)= sort(s,rev=true)[1:N]

MaxNPS(r,N) = sort(r, alg=PartialQuickSort(N),rev=true)[1:N]

Below are the results obtained for the following two vectors

re7=rand(1:10^9, 10^7)

re6=rand(1:10^8, 10^6)

per re6:

using BenchmarkTools

@btime sort(re6,rev=true)[1:5]
# 67.170 ms (3 allocations: 7.63 MiB)
# 5-element Array{Int64,1}:
#  99999902
#  99999827
#  99999775
#  99999613
#  99999409
@btime MaxN4(re6,5)
# 5.630 ms (5 allocations: 7.63 MiB)
# 5-element Array{Int64,1}:
#  99999902
#  99999827
#  99999775
#  99999613
#  99999409

@btime MaxNPS(re6,5)
# 10.661 ms (4 allocations: 7.63 MiB)
# 5-element Array{Int64,1}:
#  99999902
#  99999827
#  99999775
#  99999613
#  99999409

per re7:


@btime sort(re7,rev=true)[1:5]
#   778.337 ms (3 allocations: 76.29 MiB)
# 5-element Array{Int64,1}:
#  999999837
#  999999762
#  999999473
#  999999470
#  999999383

@btime MaxN4(re7,5)
# 62.091 ms (5 allocations: 76.29 MiB)
# 5-element Array{Int64,1}:
#  999999837
#  999999762
#  999999473
#  999999470
#  999999383

@btime MaxNPS(re7,5)
# 79.725 ms (4 allocations: 76.29 MiB)
# 5-element Array{Int64,1}:
#  999999837
#  999999762
#  999999473
#  999999470
#  999999383

Given the results I have the doubt of not having done something properly either in the measurement or in the definition of the function that uses the library partialquicksort.

partialsort is a standard library function and can be used like partialsort(re6, 1:5, rev=true) to obtain 5 largest elements of the array.

partialsort should beat MaxN4 for a bit larger value of N like 15.

For a faster algorithm for a small N you should iterate over the input array only once.

Indeed I’m seeing:

julia> @btime partialsort($re6, 1:15, rev=true);
  7.739 ms (4 allocations: 7.63 MiB)

julia> @btime MaxN4($re6,15);
  12.567 ms (6 allocations: 7.63 MiB)

This implementation, using a heap, seems to be much faster than the other ones posted here for both larger and smaller n. (Has a bit of extra complexity to handle type-unstable iterators, and to provide an in-place variant…I wanted to make it a good starting point for use in a future library.)

using DataStructures

"""
    heapmaxn!(itr, biggest::AbstractVector, n=length(biggest))

Modify `biggest` in-place to an array of the `n` biggest elements of
the union of `itr` and `biggest`, returning the new `biggest` array.
"""
function heapmaxn!(itr, biggest::AbstractVector{T}, n::Integer=length(biggest),::Val{realloc}=Val{false}()) where {T, realloc}
    n < 0 && throw(ArgumentError("$n should be nonnegative"))
    n == 0 && return empty!(biggest)
    sizehint!(biggest, n)
    heapify!(biggest)
    while length(biggest) > n
        heappop!(biggest)
    end
    i = iterate(itr)
    @inbounds while i !== nothing
        v = i[1]
        if length(biggest) < n
            heappush!(biggest, v)
        elseif isless(biggest[1], v)
            heappop!(biggest)
            if realloc && !(v isa T) # handle type-unstable iterator
                newbiggest = copyto!(similar(biggest, typejoin(typeof(v),T)), biggest)
                return heapmaxn!(Iterators.rest(itr, i[2]), heappush!(newbiggest, v), n)
            end
            heappush!(biggest, v)
        end
        i = iterate(itr, i[2])
    end
    return biggest
end

"""
    heapmaxn(itr, n)

Return an array of the `n` biggest elements of `itr`.
"""
function heapmaxn(itr, n::Integer)
    if Base.IteratorEltype(itr) isa Base.HasEltype
        return heapmaxn!(itr, empty!(Vector{eltype(itr)}(undef, n)), n)
    else
        @show i = iterate(itr)
        i === nothing && throw(ArgumentError("unknown eltype for empty collection"))
        n == 0 && return typeof(i[1])[]
        return heapmaxn!(Iterators.rest(itr, i[2]), [i[1]], n, Val{true}())
    end
end

giving (for re6=rand(1:10^8, 10^6) as above):

julia> @btime partialsort($re6, 1:5, rev=true);
       @btime MaxN4($re6,5);
       @btime heapmaxn($re6,5);
  11.785 ms (4 allocations: 7.63 MiB)
  4.660 ms (5 allocations: 7.63 MiB)
  1.504 ms (1 allocation: 128 bytes)

julia> @btime partialsort($re6, 1:15, rev=true);
       @btime MaxN4($re6,15);
       @btime heapmaxn($re6,15);
  11.334 ms (4 allocations: 7.63 MiB)
  15.821 ms (6 allocations: 7.63 MiB)
  1.512 ms (1 allocation: 208 bytes)

julia> @btime partialsort($re6, 1:500, rev=true);
       @btime MaxN4($re6,500);
       @btime heapmaxn($re6,500);
  11.486 ms (4 allocations: 7.63 MiB)
  556.260 ms (11 allocations: 7.64 MiB)
  1.775 ms (1 allocation: 4.06 KiB)

(Though heapmaxn is slower in the worst case where the inputs are sorted in ascending order. Of course, if you know that your inputs are mostly sorted, you could run heapmaxn(Iterators.reverse(itr), n) instead.)

(It’s even faster in Julia 1.6, apparently thanks to some compiler and/or inference improvements.)

5 Likes

Okay, I’m dumb — the DataStructures.jl package already contains such a function, implemented using heaps.

Just call nlargest(n, array) from DataStructures.jl — it seems to be faster than all of the implementations above, including mine, in all the cases above.

(It beats my code apparently because it takes advantage of the lower-level heap function DataStructures.percolate_down!nlargest currently only supports arrays, not arbitrary iterators, but that limitation looks easy to fix if desired.)

7 Likes

following this hint


function MaxN6(cr,N)
    r=copy(cr)
    maxn=sort(r[1:N],rev=true)
       @inbounds for e in r[N+1:end]
            if maxn[end] < e 
                insert(maxn,N,e)
            end
        end
    maxn
end

function insert(arr, N, el)
    for i in 1:N
        if el > arr[i]
            arr[i+1:end]=arr[i:end-1]
            arr[i]=el
            break
        end
    end
    arr
end

(MaxN6 is still much slower than DataStructures.nlargest.)

I had not yet seen your messages and I think it will take me a long time to study and understand them fully.
Thanks for these! (*)

I had no doubt that my naïve solution was minimally comparable with those made by experienced developers.

(*)
I would be grateful if you could explain (me) the logic of the algorithm. I still have a lot of trouble following the language syntax when slightly more advanced constructs are used.

Not so much, actually, if keeping the same idea, one changes implementation. (*)
I changed the footprint of the main function in order to easily try out different insertion strategies for a new entrant.
I believe that by using a more efficient implementation of the insert function (eg using the heap technique. I have to study it, to understand how it works.) to change only the minimum of the maxn list, we can improve the performance a little bit more.

function MaxN6b(cr,N,f)
    maxn = sort!(cr[1:N], rev=true)
       @inbounds for i in N+1:length(cr)
        e=cr[i]    
        if maxn[end] < e
                f(maxn,N,e)
            end
        end
    maxn
end

function insert1(arr, N, el)
    for i in 1:N
        if el > arr[i]
            arr[i+1:end]=arr[i:end-1]
            arr[i]=el
            break
        end
    end
    arr
end

function insert2(arr, N, el)
arr[N]=el
sort!(arr,rev=true)
end

julia> nlargest(20,re6)==MaxN6b(re6,20,insert)
true

julia> nlargest(20,re6)==MaxN6b(re6,20,insert2)
true


julia> @btime MaxN6b($re6,20,insert2);
  975.701 μs (1 allocation: 240 bytes)

julia> @btime MaxN6b($re6,20,insert);
  948.200 μs (205 allocations: 31.13 KiB)

julia> @btime nlargest(20,$re6);
  827.100 μs (2 allocations: 480 bytes)

julia> @btime MaxN6b($re7,20,insert);
  9.942 ms (226 allocations: 36.34 KiB)

julia> @btime MaxN6b($re7,20,insert2);
  9.770 ms (1 allocation: 240 bytes)

julia> @btime nlargest(20,$re7);
  8.908 ms (2 allocations: 480 bytes)

julia> @btime nlargest(20,$re7);


(*)Why does changing the syntax of the for loop change performance so much?

UPDATE:

julia> nlargest(20,re6)==MaxN6c1(re6,20)
true

julia> @btime MaxN6c1($re6,20);
  714.199 μs (205 allocations: 31.13 KiB)

where

function MaxN6c1(cr,N)
    maxn = sort!(cr[1:N], rev=true)
       @inbounds for i in N+1:length(cr)
        e=cr[i]    
        if maxn[N] < e
                insert1(maxn,N,e)
            end
        end
    maxn
end

function insert1(arr, N, el)
    for i in 1:N
        if el > arr[i]
            arr[i+1:end]=arr[i:end-1]
            arr[i]=el
            break
        end
    end
    arr
end

for larger sublist dimensions to be extracted, the efficiency of the insertx function comes into play

julia> @btime nlargest(20,$re7);
  8.740 ms (2 allocations: 480 bytes)

julia> @btime MaxN6c1($re7,20);
  7.707 ms (226 allocations: 36.34 KiB)

julia> @btime MaxN6c1($re7,2000);
  48.597 ms (16780 allocations: 129.84 MiB)

julia> @btime nlargest(2000,$re7);
  9.659 ms (2 allocations: 31.50 KiB)

You can still squeeze a little bit by removing the array access maxn[N] in the loop.

function MaxN6c2(cr,N)
    maxn = sort!(cr[1:N], rev=true)
    maxnN = maxn[N]
    @inbounds for i in N+1:length(cr)
        e=cr[i]
        if maxnN < e
            insert1(maxn,N,e)
            maxnN = maxn[N]
        end
    end
    maxn
end
julia> @btime MaxN6c2($re6, 20);
  555.000 μs (203 allocations: 32.78 KiB)
julia> @btime MaxN6c1($re6, 20);
  735.699 μs (203 allocations: 32.78 KiB)
julia> @btime nlargest(20, $re6);
  858.300 μs (2 allocations: 480 bytes)

(This would probably apply to nlargest as well.)

Using some heap-functions and the suggestions of @mikkoku

function MaxN7(cr,N)
    maxn = heapify!(cr[1:N])
    maxn1=maxn[1]
       @inbounds for i in N+1:length(cr)
        e=cr[i]    
        if maxn1 < e
            heappop!(maxn)
            heappush!(maxn,e)
            maxn1=maxn[1]
            end
        end
    sort!(maxn,rev=true)
end
julia> nlargest(2000,re8)==MaxN7(re8,2000)
true

julia> @btime MaxN7($re6,20);
  576.800 μs (1 allocation: 240 bytes)

julia> @btime MaxN7($re8,2000);
  71.938 ms (1 allocation: 15.75 KiB)

julia> @btime nlargest(2000,$re8);
  86.122 ms (2 allocations: 31.50 KiB)
1 Like