Sort huge array get smallest 1000


#1

Is there a function which sorts an array up to a given size?
I know that quicksort divides the array into “halves” every time which should make it easy to get the smallest 1000 in an array of a million values. Does someone already implemented something like this?


#2

What you want would be select(x, range). It returns sorted elements in x[range] if x is sorted:

julia> x = shuffle(collect(linspace(0, 1, 50)));

julia> select(x, 1:10)
10-element Array{Float64,1}:
 0.0
 0.0204082
 0.0408163
 0.0612245
 0.0816327
 0.102041
 0.122449
 0.142857
 0.163265
 0.183673

Note that this function will be renamed to partialsort on Julia 1.0.


#3

partialsort is definitely the better name :wink:

?
x doesn’t have to be sorted, right? Or maybe I just misunderstand you. Anyway. Thanks :slight_smile:


#4

Oh and is there a function which gives me the indices then?


#5

You’re right. select(x, range) returns the same value of sort(x)[range].

If you want indices, you can use selectperm instead, which will be renamed to partialsortperm on Julia 1.0.


#6

You can also use sort(..., alg=PartialQuickSort(1000)) to sort the array, and I think you can use sortperm with PartialQuickSort to get the indices.

https://docs.julialang.org/en/stable/stdlib/sort/#Sorting-Algorithms-1