Get smooth CDF from OnlineStats's Quantile

Thanks @tbeason, I actually found the Ash function through this other, somewhat related post:

As I understand it, the AverageShiftedHistogram does something similar to the kernel density estimate you mentioned, so building one and integrating its output was the first thing I tried, as mentioned in the original post.

Do you think this approach is actually different from what you are suggesting?

I was actually trying out the different options in a Pluto notebook:

Notebook Code
# ╔═╡ 21f03ec8-4e76-4e7c-8cac-3e67d9e79e44
begin
	using OnlineStats
	using PlutoPlotly
	using Statistics
end

# ╔═╡ dff0f3f2-6706-4f60-a607-1fcff9b3b314
begin
	db2lin(x) = 10.0^(x/10)
	q = Quantile()
	o = OrderStats(100)
	a = db2lin.(randn(10^5) .* 2) # Let's create some lognormal variable
	# a = randn(10^4)
	fit!(q, a)
	fit!(o, a)
end

# ╔═╡ d78b749b-a739-4fcf-a5c2-1c7d41fc11f0
let
	d1 = let
		# We just build an Average Shifted Histogram for the smoothed pdf of the histogram used by Quantile
		ash = Ash(q.eh, 1)
		x, y = value(ash) # This extracts x and y for the smoothed pdf
		# Integrate the smoothed pdf: a cumulative sum scaled by the bin width approximates the CDF
		# step(x) returns the grid spacing as a plain Float64 (x.step itself is a TwicePrecision)
		y = cumsum(y) .* step(x)
		scatter(;x,y, name = "ASH")
	end
	d2 = let
		y = range(0, 1; length = 101)
		x = map(y) do y
			value(q, y)
		end
		scatter(;x, y, name = "Quantile", line_dash = :dash)
	end
	d3 = let
		y = range(0, 1; length = 101)
		x = map(y) do y
			quantile(o, y)
		end
		scatter(;x, y, name = "OrderStats", line_dash = :dot)
	end
	plot([d1, d2, d3], Layout(;
		template = "none",
		uirevision = 1,
		xaxis = attr(;
			title = "X"
		),
		yaxis = attr(;
			title = "P{x <= X}"
		)
	))
end

Ash does smooth the curve, but the CDF from OrderStats is “closer” to the Quantile curve (see the zoomed-in portion of the notebook's CDF plot below):
[Image: zoomed-in view of the CDF plot from the notebook]
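
To put a number on “closer” instead of judging it from the plot, here is a rough sketch that compares each estimate against the exact sample quantiles. It reuses only the OnlineStats calls from the notebook above; the probability grid `ps`, the nearest-grid-point inversion of the ASH CDF, and the `maxerr` helper are my own assumptions for illustration, not part of OnlineStats.

```julia
using OnlineStats
using Statistics

db2lin(x) = 10.0^(x / 10)
a = db2lin.(randn(10^5) .* 2)   # same kind of lognormal sample as in the notebook

q = fit!(Quantile(), a)
o = fit!(OrderStats(100), a)

# ASH-based CDF, as in the notebook: integrate the smoothed pdf on its grid
x, pdf = value(Ash(q.eh, 1))
cdf = cumsum(pdf) .* step(x)

# Probabilities at which to compare (assumption: skip the extreme tails)
ps = 0.05:0.05:0.95

# Invert the ASH CDF: first grid point whose CDF reaches p (clamped to the grid)
ash_q = [x[min(searchsortedfirst(cdf, p), length(x))] for p in ps]
quant_q = [value(q, p) for p in ps]
order_q = [quantile(o, p) for p in ps]
exact_q = quantile(a, ps)       # exact sample quantiles, used as the reference

# Hypothetical helper: worst-case absolute deviation from the exact quantiles
maxerr(est) = maximum(abs.(est .- exact_q))
@show maxerr(ash_q) maxerr(quant_q) maxerr(order_q)
```

The numbers will change with the sample and with the smoothing parameter passed to Ash, so I would treat this as a sanity check rather than a benchmark.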
