Thanks @tbeason, I actually found the Ash function through this other, somewhat related post:
As I understand it, the average shifted histogram (Ash) does something similar to the kernel density estimate you mentioned, so building one and integrating its output was indeed the first thing I tried, as mentioned in the original post.
Do you think this approach is actually different from what you are suggesting?
I was actually trying out the different options in a Pluto notebook:
Notebook Code
# ╔═╡ 21f03ec8-4e76-4e7c-8cac-3e67d9e79e44
begin
    using OnlineStats
    using PlutoPlotly
    using Statistics
end
# ╔═╡ dff0f3f2-6706-4f60-a607-1fcff9b3b314
begin
    db2lin(x) = 10.0^(x/10)
    q = Quantile()
    o = OrderStats(100)
    a = db2lin.(randn(10^5) .* 2) # Let's create some lognormal variable
    # a = randn(10^4)
    fit!(q, a)
    fit!(o, a)
end
# ╔═╡ d78b749b-a739-4fcf-a5c2-1c7d41fc11f0
let
    d1 = let
        # Build an Average Shifted Histogram to smooth the pdf of the histogram used by Quantile
        ash = Ash(q.eh, 1)
        x, y = value(ash) # This extracts x and y for the smoothed pdf
        # Integrate the pdf numerically to get the CDF
        y = cumsum(y)
        # step(x) returns the step in the element type of x (the raw x.step field is a TwicePrecision)
        y .*= step(x) # We have to scale by the step on x so that the sum reaches 1
        scatter(; x, y, name = "ASH")
    end
    d2 = let
        y = range(0, 1; length = 101)
        # Query the Quantile estimator at each probability level p
        x = map(y) do p
            value(q, p)
        end
        scatter(; x, y, name = "Quantile", line_dash = :dash)
    end
    d3 = let
        y = range(0, 1; length = 101)
        # Same probability levels, but estimated from the OrderStats buffer
        x = map(y) do p
            quantile(o, p)
        end
        scatter(; x, y, name = "OrderStats", line_dash = :dot)
    end
    plot([d1, d2, d3], Layout(;
        template = "none",
        uirevision = 1,
        xaxis = attr(;
            title = "X"
        ),
        yaxis = attr(;
            title = "P{x <= X}"
        )
    ))
end
I see that Ash works for smoothing, but the curve from OrderStats is actually “closer” to the Quantile curve (see the zoomed-in view of the notebook's CDF plot below):