Rank with ties

Nash · July 20, 2022, 6:52am

Suppose I have a vector containing the following integers: [1 1 1 2 3 5 5 9]

Now I wish to create a vector that ranks the integers in the following way:

rank = [1 1 1 2 3 4 4 5]

The rank essentially answers the question for each integer “number of unique integers larger + 1” but implementing that is causing some problems. Maybe there is a function already?

This is my attempt, but how do I broadcast it to the vector at once?

length(unique(A[A .< A[8]])) + 1

oheil · July 20, 2022, 7:03am

See

function sampleRanks: GitHub - oheil/NormalizeQuantiles.jl: NormalizeQuantiles.jl implements quantile normalization

SteffenPL · July 20, 2022, 7:14am

An elementary way would be

x = [1,1,1,2,3,5,5,9]
xu = unique(x)
ranks = [ count( xu .< z ) + 1 for z in x ]

(Of course, @oheil’s package does more than that, i.e. it also takes care of edge cases and supports matrix inputs etc.)

oheil · July 20, 2022, 7:20am

Thats why it’s typically slower…

aplavin · July 20, 2022, 8:22am

StatsBase.denserank seems to be exactly what you need.

maxkapur · July 21, 2022, 12:43am

@SteffenPL’s function (and any implementation based on count) will have worst-case O(n^2) complexity (when all the entries of x are different).

But by sorting x out the outset, it is possible to do this in O(n \log n)-time using something like this:

function denserank(x::Vector)
    p = sortperm(x)             # to recover sort later
    permute!(x, p)
    res = ones(Int, length(x))
    currentrank = 1
    for i in 2:length(x)
        if x[i-1] < x[i]
            currentrank += 1
        end
        res[i] = currentrank
    end
    invpermute!(res, p)
end

julia> x = [1, 6, 3, 3, 7, 1, 1]; denserank(x)
7-element Vector{Int64}:
 1
 3
 2
 2
 4
 1
 1

This is the approach used by StatsBase.denserank, albeit as the source code here reveals, there are a few layers of abstraction in between so it can support more than just Vector input.

In summary, you should either use StatsBase.denserank, or if you roll your own function, base it on a sort algorithm instead of than count.

Topic		Replies	Views
Ranking of elements of a vector General Usage vector , ranking	13	1603	October 7, 2022
How to find rankings of each tuple in the list of tuples? General Usage question	7	626	December 9, 2022
When appllying roll rank function, how much faster Julia can be compared to Python? New to Julia	19	630	March 18, 2022
Efficient way to apply monotonous rank transformation to an array Performance	4	473	April 16, 2020
R function `duplicated` General Usage function	14	711	December 16, 2022

Rank with ties

Related topics