Let’s say I have a vector

```
a = rand(1:20, 10)
10-element Vector{Int64}:
2
13
10
12
10
8
10
14
1
19
```

I want to define a function such as f() which takes two inputs including a vector `a`

and `k`

as the number of largest elements and return the k largest numbers in descending order.
For example:

```
f(a,3) = [19, 14 , 13]
```

A similar issue is discussed here in 2018, but it looks like some functions like `partialsortperm()`

do not work in the newer versions of Julia. I use Julia 1.6.5.

Not sure what you mean.

```
julia> a[partialsortperm(a, 1:3; rev = true)]
3-element Vector{Int64}:
20
18
17
```

works just fine on Julia 1.6.

Hi,

does the following function do the trick?

```
f(a,k) = sort(a, rev=true)[1:k]
```

Regards,

Thomas

1 Like

nilshg
October 31, 2022, 1:15pm
#4
`partialsortperm`

is made for these situations, so you might be leaving quite a bit of performance on the table if you sort the whole vector:

```
julia> f1(a, k) = sort(a; rev = true)[1:k];
julia> f2(a, k) = a[partialsortperm(a, 1:k; rev = true)];
julia> using BenchmarkTools
julia> x = rand(1_000_000);
julia> @btime f1($x, 3);
53.254 ms (3 allocations: 7.63 MiB)
julia> x = rand(1_000_000);
julia> @btime f2($x, 3);
5.330 ms (5 allocations: 7.63 MiB)
```

an order of magnitude on my machine for 1e6 Float64 elements.

5 Likes

aplavin
October 31, 2022, 1:27pm
#5
`partialsort(x, 1:k; rev=true)`

is even faster, `perm`

is not needed here.

5 Likes

To sum it up…

```
julia> using DataStructures, BenchmarkTools
julia> f1(a, k) = sort(a; rev = true)[1:k];
julia> f2(a, k) = a[partialsortperm(a, 1:k; rev = true)];
julia> f3(a, k) = partialsort(a, 1:k; rev=true)
f3 (generic function with 1 method)
julia> f4(a, k) = nlargest(k, a)
f4 (generic function with 1 method)
julia> x = rand(1_000_000);
julia> xb = copy(x);
julia> r1 = @btime f1($xb, 3);
74.432 ms (3 allocations: 7.63 MiB)
julia> xb = copy(x);
julia> r2 = @btime f2($xb, 3);
13.480 ms (5 allocations: 7.63 MiB)
julia> xb = copy(x);
julia> r3 = @btime f3($xb, 3);
9.408 ms (2 allocations: 7.63 MiB)
julia> xb = copy(x);
julia> r4 = @btime f4($xb, 3);
1.531 ms (2 allocations: 160 bytes)
julia> r1 == r2 == r3 == r4
true
```

3 Likes

try also this, but for larger arrays

```
function MaxN(cr,N)
maxn = heapify!(cr[1:N])
maxn1=maxn[1]
@inbounds for i in N+1:length(cr)
e=cr[i]
if maxn1 < e
heappop!(maxn)
heappush!(maxn,e)
maxn1=maxn[1]
end
end
sort!(maxn,rev=true)
end
```

DNF
October 31, 2022, 10:38pm
#9
Why doesn’t `partialsort`

use `nlargest`

then? What’s the tradeoff?

Because `nlargest`

discards the rest of the array? The two functions aren’t equivalent.

1 Like

`DataStructures.nlargest`

already uses a heap (in a slightly more efficient way because it can combine the push and pop).