# Rank with ties

Suppose I have a vector containing the following integers: [1 1 1 2 3 5 5 9]

Now I wish to create a vector that ranks the integers in the following way:

rank = [1 1 1 2 3 4 4 5]

The rank essentially answers the question for each integer “number of unique integers larger + 1” but implementing that is causing some problems. Maybe there is a function already?

This is my attempt, but how do I broadcast it to the vector at once?

length(unique(A[A .< A])) + 1

See

function `sampleRanks`: GitHub - oheil/NormalizeQuantiles.jl: NormalizeQuantiles.jl implements quantile normalization

An elementary way would be

``````x = [1,1,1,2,3,5,5,9]
xu = unique(x)
ranks = [ count( xu .< z ) + 1 for z in x ]
``````

(Of course, @oheil’s package does more than that, i.e. it also takes care of edge cases and supports matrix inputs etc.)

Thats why it’s typically slower…

`StatsBase.denserank` seems to be exactly what you need.

3 Likes

@SteffenPL’s function (and any implementation based on `count`) will have worst-case O(n^2) complexity (when all the entries of `x` are different).

But by sorting `x` out the outset, it is possible to do this in O(n \log n)-time using something like this:

``````function denserank(x::Vector)
p = sortperm(x)             # to recover sort later
permute!(x, p)
res = ones(Int, length(x))
currentrank = 1
for i in 2:length(x)
if x[i-1] < x[i]
currentrank += 1
end
res[i] = currentrank
end
invpermute!(res, p)
end
``````
``````julia> x = [1, 6, 3, 3, 7, 1, 1]; denserank(x)
7-element Vector{Int64}:
1
3
2
2
4
1
1
``````

This is the approach used by `StatsBase.denserank`, albeit as the source code here reveals, there are a few layers of abstraction in between so it can support more than just `Vector` input.

In summary, you should either use `StatsBase.denserank`, or if you roll your own function, base it on a sort algorithm instead of than `count`.

1 Like