Suppose I have a vector containing the following integers: [1 1 1 2 3 5 5 9]
Now I wish to create a vector that ranks the integers in the following way:
rank = [1 1 1 2 3 4 4 5]
The rank essentially answers, for each integer, the question "number of unique integers smaller than it, plus 1", but implementing that is causing some problems. Maybe there is a function for this already?
This is my attempt, but how do I broadcast it to the vector at once?
@SteffenPL’s function (and any implementation based on count) will have worst-case O(n^2) complexity (when all the entries of x are different).
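For reference, a count-based version might look roughly like the sketch below. This is not @SteffenPL's actual code; the name `denserank_count` is made up for illustration. Each entry is compared against every distinct value, so when all n entries differ the work is about n * n comparisons.

```julia
# Hypothetical count-based dense rank (illustration only, not the code from the thread).
function denserank_count(x::Vector)
    u = unique(x)                           # distinct values of x
    # rank of v = number of distinct values strictly below v, plus 1
    return [count(<(v), u) + 1 for v in x]  # one pass over u per entry of x
end

denserank_count([1, 1, 1, 2, 3, 5, 5, 9])   # returns [1, 1, 1, 2, 3, 4, 4, 5]
```

This is fine for short vectors; the quadratic cost only starts to matter once x has many distinct values.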
But by sorting x at the outset, it is possible to do this in O(n log n) time using something like this:
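function denserank(x::Vector)
    p = sortperm(x)                 # sorting permutation, kept to undo the sort later
    sorted = x[p]                   # sorted copy, so the caller's x is left untouched
    res = ones(Int, length(x))
    currentrank = 1
    for i in 2:length(x)
        if sorted[i-1] < sorted[i]  # a new distinct value starts here
            currentrank += 1
        end
        res[i] = currentrank
    end
    return invpermute!(res, p)      # put the ranks back in the original order of x
end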
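If you prefer something shorter, the same sort-first idea can also be sketched with a binary search into the sorted distinct values. This is an alternative I am suggesting, not what StatsBase does; the name `denserank_lookup` is hypothetical. Sorting costs O(n log n) and each lookup costs O(log n).

```julia
# Hypothetical sort-then-lookup dense rank (illustration only).
function denserank_lookup(x::Vector)
    u = sort!(unique(x))                         # sorted distinct values
    # position of v in u = number of distinct values below v, plus 1
    return [searchsortedfirst(u, v) for v in x]  # binary search per entry
end

denserank_lookup([5, 1, 9, 1, 3, 5, 2, 1])       # returns [4, 1, 5, 1, 3, 4, 2, 1]
```

Note that the input does not need to be sorted, and the ranks come back in the original order of x.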
This is the approach used by StatsBase.denserank, although, as its source code reveals, there are a few layers of abstraction in between so that it can support more than just Vector input.
In summary, you should either use StatsBase.denserank or, if you roll your own function, base it on a sort algorithm rather than on count.
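For completeness, calling the library function on the vector from the question looks like this, assuming the StatsBase package is installed in your environment:

```julia
using StatsBase   # third-party package; install with `import Pkg; Pkg.add("StatsBase")`

x = [1, 1, 1, 2, 3, 5, 5, 9]
r = denserank(x)  # returns [1, 1, 1, 2, 3, 4, 4, 5]
```

No broadcasting is needed, since denserank takes the whole vector at once.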