Opposite of StatsBase.percentile

StatsBase.percentile returns a NUMBER for a given PERCENTILE.

I don’t know what the OP had in mind, but what I am looking for right now is exactly what the title of this thread points to: the very opposite of the percentile function. Given the numbers in a collection, I would like to know their percentiles.

E.g. I have a long vector of numbers, and I would like to create another vector that contains the percentile of each number in the original vector. How can I do that without writing a long program?

Are you looking for StatsBase.ecdf?

percentilerank (or quantilerank). Also in StatsBase.

julia> using StatsBase

julia> v = [1,1,1,1,2,3,5,7];

julia> percentilerank.(Ref(v), [2,3])
2-element Vector{Float64}:
 57.14285714285714
 71.42857142857143
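
To get the percentile of every number in the original vector, as asked above, the same call can be broadcast over the vector itself. This is a minimal sketch that assumes the default `method` of `percentilerank` (`:inc`, as far as I can tell); the name `ranks` is only for illustration.

```julia
using StatsBase

v = [1, 1, 1, 1, 2, 3, 5, 7]

# Ref(v) holds the sample fixed while the second argument is broadcast,
# so each element of v is ranked against the whole vector.
ranks = percentilerank.(Ref(v), v)
```

With the default method the smallest value gets 0 and the largest gets 100. Other `method` options treat ties and endpoints differently, so check the docstring if that matters for your data.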

The definition of this function includes a ‘vector of samples’. But what if I have just one sample, which is the whole distribution, by the way?

I tried this:

julia> ecdf([4;2;5;1;6;7])
ECDF{Vector{Int64}, Weights{Float64, Float64, Vector{Float64}}}([1, 2, 4, 5, 6, 7], Float64[])

This does not return any percentiles. What am I doing wrong?

Many thanks, that’s roughly what I was looking for!

I had in mind reproducing this plot.

But that’s fairly easy to do given percentilerank, thanks!

The percentilerank is just fine, but to answer your question:

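f = ecdf(X) creates a function (similar to an interpolated percentilerank); it does not compute any values by itself. You can call it later as f(y) or f.(y), where y are some values (e.g. the same as X). The result is the fraction of samples less than or equal to y, on a scale from 0 to 1 rather than 0 to 100.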
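
A short sketch of that usage, with the vector from the question. The names `X`, `f` and `pct` are only for illustration, and the multiplication by 100 is my addition to turn fractions into percentages.

```julia
using StatsBase

X = [4, 2, 5, 1, 6, 7]

# ecdf returns a callable object; nothing is evaluated until it is called.
f = ecdf(X)

# Fraction of samples less than or equal to 4 (here 1, 2 and 4 out of 6).
half = f(4)

# Evaluate at every element of X and rescale to 0-100.
pct = 100 .* f.(X)
```

Note that these numbers need not match percentilerank exactly: the ECDF counts samples less than or equal to the value, while percentilerank with its default method appears to use a different convention at ties and endpoints.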
