I’m trying to implement an algorithm from a paper which makes use of empirical r-th q-quantiles of the marginals X^{(1)}, ..., X^{(p)} for a dataset X with p features (Bénard et al., 2021).
If I understand this correctly, this means that for each feature in the dataset, q-quantiles should be determined.
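To make the per-feature reading concrete, here is a minimal sketch that computes the interior quantile levels 1/q, ..., (q-1)/q for every column of a matrix. The helper name `_marginal_quantiles` is hypothetical, and it assumes `Statistics.quantile` with its default interpolation, which is not necessarily the empirical definition used in the paper.

```julia
using Statistics

# Hypothetical helper: quantiles at levels 1/q, ..., (q-1)/q for each column of X.
# Assumption: Statistics.quantile with its default (interpolating) behaviour,
# which may differ from the paper's empirical quantile.
function _marginal_quantiles(X::AbstractMatrix, q::Int)
    @assert q ≥ 2
    levels = (1:(q - 1)) ./ q
    # One vector of q - 1 values per feature (column).
    return [quantile(col, levels) for col in eachcol(X)]
end
```

The result has one entry per feature, each holding q - 1 values.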
Would the following be the correct way to determine an “empirical 3-quantile” for some column X^{(u)} of the dataset?
They also give a definition of the r-th q-quantile \hat{q}_{n,r}^{(j)} of \{ X_1^{(j)}, ..., X_n^{(j)} \} for r \in \{1, ..., q - 1\}; see Equation 4.2 in Bénard et al. (2021).
function _empirical_quantile(V::AbstractVector, quantile::Real)
    @assert 0.0 ≤ quantile ≤ 1.0
    n = length(V)
    # Clamp so that quantile = 0 maps to the minimum and quantile = 1 to the maximum.
    index = clamp(floor(Int, quantile * (n + 1)), 1, n)
    sorted = sort(V)
    return sorted[index]
end

function _cutpoints(V::AbstractVector, q::Int)
    # q evenly spaced levels from 0.0 to 1.0, endpoints included.
    quantiles = range(; start=0.0, stop=1.0, length=q)
    return _empirical_quantile.(Ref(V), quantiles)
end
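
For comparison, here is a minimal sketch of cutpoints taken at the levels r/q for r = 1, ..., q - 1, which is how I read the index set above. It assumes the inverse-ECDF convention (the smallest sample value x such that at least a fraction r/q of the observations are ≤ x). Whether that matches Equation 4.2 exactly is an assumption on my part, and the name `_cutpoints_rq` is hypothetical.

```julia
# Hypothetical helper: the q - 1 interior cutpoints at levels 1/q, ..., (q-1)/q.
# Assumption: inverse-ECDF convention, i.e. the smallest sample value x such
# that at least a fraction r/q of the observations are ≤ x.
# Assumes V uses standard 1-based indexing.
function _cutpoints_rq(V::AbstractVector, q::Int)
    @assert q ≥ 2
    @assert !isempty(V)
    n = length(V)
    sorted = sort(V)
    # cld(r * n, q) is the smallest integer k with k / n ≥ r / q; using
    # integer arithmetic avoids floating-point rounding in r / q * n.
    return [sorted[cld(r * n, q)] for r in 1:(q - 1)]
end
```

For `V = collect(1:10)` and `q = 3` this returns `[4, 7]`, whereas `_cutpoints(V, 3)` evaluates the levels 0.0, 0.5 and 1.0, so it includes the minimum and maximum rather than only interior cutpoints.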