I want to bin numerical values into manually set percentile values. How can I do this?

Hi how’s it going?

I have a numerical column that I’d like to bin into these percentile bin edges:

percentiles = [.05,.35,.8,.95]

In Python it’s as easy as pd.qcut. What would be the simplest way to do this in Julia? I’ve found Julia solutions that create the bins automatically, but I would like something where I can set the bin edges manually.

Thank you!

Look at cut from CategoricalArrays.jl; it accepts a vector of break points.
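As a rough sketch of how that suggestion could look (assuming the CategoricalArrays package is installed, and using made-up data in x), the percentile levels can be converted to break points with quantile and then passed to cut:

```julia
using Statistics          # quantile
using CategoricalArrays   # cut, levels, levelcode

x = collect(1.0:100.0)    # illustrative data, not from the question
percentiles = [0.05, 0.35, 0.8, 0.95]

# Turn the percentile levels into bin edges on the scale of x.
edges = quantile(x, percentiles)

# extend = true adds the minimum and maximum of x as outer breaks, so values
# below the 5th or above the 95th percentile get their own bins instead of
# raising an error.
binned = cut(x, edges; extend = true)

levels(binned)        # interval labels, one per bin
levelcode.(binned)    # integer bin index for each element of x
```

The result is a categorical array, so the bins carry interval labels; levelcode gives plain integers if that is all you need.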


E.g., something like:

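using Statistics

function bin_by_percentiles(x, p)
    @assert issorted(p)
    # x need not be sorted; pass sorted = true only if it already is
    q = quantile(x, p)
    # index in 1:length(p)+1; a value equal to an edge goes to the lower bin
    searchsortedfirst.(Ref(q), x)
end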

bin_by_percentiles(randn(50), [.05,.35,.8,.95])
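
One way to sanity-check this kind of binning is to count how many values land in each bin. The sketch below is self-contained and uses made-up, deliberately unsorted data; the share of values per bin should roughly match the gaps between the percentile levels (5%, 30%, 45%, 15%, 5%).

```julia
using Statistics  # quantile

x = collect(100.0:-1.0:1.0)          # illustrative data in descending order
p = [0.05, 0.35, 0.8, 0.95]

q = quantile(x, p)                   # bin edges; quantile sorts a copy of x
bins = searchsortedfirst.(Ref(q), x) # bin index in 1:length(p)+1 per value

# Number of values in each of the length(p) + 1 bins.
counts = [count(==(k), bins) for k in 1:length(q) + 1]
```

With searchsortedfirst the bins are closed on the right, so a value exactly equal to an edge is counted in the lower bin.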