So apparently this function is very confusing for multiple reasons.
Just to be clear, the name sortperm!(perm, x)
suggests: sort a permutation of the indices of a vector in-place. Roughly it’s equivalent to sort!(perm, by = i -> x[i])
.
Now suppose you have a permutation perm
of the indices of a vector x
. By definition that means length(perm) == length(x)
and isperm(perm)
is true.
In this case, one source of confusion is that sortperm!(perm, x)
and partialsortperm!(perm, x, range)
will by default initialize perm
as the trivial permutation 1:length(perm)
. You have to explicitly signal that perm
is a valid permutation already using the initialized = true
kwarg. This behavior is however convenient, because now you do not have to create a dummy permutation yourself, you can just pass in a buffer.
In your example setting initialized = true
will get you into trouble, since perm
was not a valid permutation and it will access x
out of bounds.
Now, the second source of confusion is that because partialsortperm(x, 1:k)
returns a subset of a permutation of length k
that orders x
in the range 1:k
, one should therefore pass a buffer perm
of size k
to partialsortperm!(perm, x, 1:k)
. This is not true! partialsortperm!(perm, x, 1:k)
partially sorts a full permutation in-place, meaning you have to give pass in a buffer of size length(x)
.
Finally, the third source of confusion is that partialsortperm!(perm, x, range, initialized = true)
does not throw whenever length(perm) != length(x)
. The function still does something when the perm
you pass is in fact not a permutation of the indices of x. This is maybe an oversight in Base. However, what it does is still sensible as long as you pass in a vector of valid indices. The invariant is then:
ix = [1,2,5,7,2,1,2,1,2,1] #notice length(x) != length(ix) and isperm(ix) == false
x = collect(10:-1:1)
k = 1:3
@test x[partialsortperm!(ix, x, k, initialized=true)] == sort(x[ix])[k]
So what is does is, it partially sorts a vector of indices.
Right now I think it would make sense to rename the functions to sortindices!
and partialsortindices!
and make initialized = true
the default.
Or have both (partial)sortindices!
and (partial)sortperm!
, and make the latter more strict when it comes to the input. However, as far as I know, these function do not exploit the fact that a permutation is being passed to the function.