Smoothing empirical CDFs

amrods · September 21, 2022, 4:46am

I have 5 empirical CDFs coming from a model. I’d like to average them and plot that average against the empirical CDF of the data. The issue I’m facing is that the points of support of those 5 CDFs are not the same, so I’d have to employ some kind of window of data and average points in there. Of course, because I want to end up with an average that can be interpreted as a CDF, I’d need the average to be non-decreasing and between 0 and 1. Is there any package that allows me to do that?

juliohm · September 21, 2022, 10:19am

Are you aware of EmpiricalCDFs.jl?

We have also cooked up our own empirical CDF type in TableTransforms.jl.

amrods · September 21, 2022, 10:47am

I wasn’t aware of that. I’m thinking that maybe I should just input all the 5 simulations into a single EmpiricalCDF and see how that goes.

Dan · September 21, 2022, 11:24am

Why not just average the 5 CDFs? The average (which is 1/5 of the sum) is a monotone function to [0,1]. This would also be the mathematically reasonable thing to do.

More specifically, a CDF is F(x) = Prob(X ≤ x), and the average CDF would be Favg(x) = (F_1(x) + F_2(x) + … + F_5(x)) / 5 = Prob(X ≤ x), where X is drawn from one of X_1, X_2, …, X_5 chosen with equal probability.

Or another interpretation is, Favg is the CDF of a variable X obtained by choosing one of the CDFs uniformly and sampling a value according to it.
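
A minimal sketch of this averaging in plain Julia, with no package assumed. The helper names `ecdf_at` and `average_ecdf` and the sample values are hypothetical, made up for illustration. Each empirical CDF is a step function defined for every x, so the differing support points are handled by evaluating all of them on the union of the observed points:

```julia
# Empirical CDF of a sorted sample at x: the fraction of observations <= x.
function ecdf_at(sorted_sample, x)
    # searchsortedlast gives the index of the last element <= x (0 if there is none)
    return searchsortedlast(sorted_sample, x) / length(sorted_sample)
end

# Average several empirical CDFs on the union of their support points.
function average_ecdf(samples)
    sorted = [sort(s) for s in samples]
    grid = sort(unique(vcat(sorted...)))
    favg = [sum(ecdf_at(s, x) for s in sorted) / length(sorted) for x in grid]
    return grid, favg
end

# Hypothetical simulation outputs with different support points and sizes
sims = [[1.0, 2.0, 3.0], [1.5, 2.5], [0.5, 2.0, 4.0, 5.0]]
grid, favg = average_ecdf(sims)
```

The result is non-decreasing and stays in [0, 1] because every term in the average has those properties. When all samples have the same size, pooling them into a single sample and taking its empirical CDF gives the same curve.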

amrods · September 21, 2022, 11:36am

Yeah, I thought it was a harder problem than that. I also don’t have to worry about the “sample points” of the distribution.

amrods · September 21, 2022, 11:40am

Do you think this is equivalent to going a rather long way around: first getting the pdfs, then working with their Fourier transforms (convolving them, thereby obtaining the distribution of the average), and then obtaining the cdf of that convolution?

Dan · September 21, 2022, 12:20pm

It is theoretically equivalent (given reasonable smoothness assumptions on the initial distributions). But getting to a PDF and convolving would be much harder and can introduce more errors.

If the distributions are initially empirical, then perhaps smoothing may be needed to get closer to the underlying generating process.
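
The thread does not say which smoothing method to use. One possible sketch, under that caveat, is a kernel-smoothed CDF: replace each step of the empirical CDF with a smooth kernel CDF centred at the observation. The function name `smooth_cdf`, the logistic kernel (picked here only because it needs nothing beyond Base Julia), the bandwidth `h = 0.5`, and the data are all assumptions for illustration:

```julia
# Smoothed CDF estimate: the average of logistic kernel CDFs centred at each observation.
# The bandwidth h is picked by hand here; it is an assumption, not a recommended default.
function smooth_cdf(sample, x; h = 0.5)
    return sum(1 / (1 + exp(-(x - xi) / h)) for xi in sample) / length(sample)
end

# Hypothetical pooled data
sample = [0.5, 1.0, 1.5, 2.0, 2.0, 2.5, 3.0, 4.0, 5.0]
xs = range(-2, 8; length = 101)
smoothed = [smooth_cdf(sample, x) for x in xs]
```

Each kernel term is increasing in x and lies between 0 and 1, so the smoothed curve can still be read as a CDF. A larger `h` gives a smoother curve at the cost of blurring features of the data.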
