I have an application where, as a subroutine, I need to (partial) sort a specific n-dimensional vector repeatedly (for n>10^7). Given initial data x^{(0)} and some perturbation \epsilon^{(i)} in step i, I want to obtain x^{(i)} = \texttt{sort}(x^{(i-1)} + \epsilon^{(i)}) and the permutation \pi such that (x^{(i-1)} + \epsilon^{(i)})_\pi = x^{(i)}.
So, I am repeatedly (partial) sorting a vector of the same dimension. Based on the discussion here, it was suggested to use a custom data structure. What I have done so far is to pre-allocate x = zeros(Float64, n), e = zeros(Float64, n), and p = zeros(Int64, n) and then call sortperm! (or partialsortperm!) on my data in step i. Is there a data structure that I could use in this case to speed up the sortperm!?