How do I plot the estimated cumulative distribution function of some samples?

I have some samples of a random variable, and I want to plot their CDF. What’s the best way to do this?
So far I have tried:

    g1 = [E(λ) for i = 1:n] # generate the samples
    using StatsBase
    cdf = ecdf(g1)
    xs = 0:10^-3:5
    display(plot(y = [cdf(i) for i in xs], x = xs)) # Gadfly

But this seems rather unidiomatic and inelegant. Any better ideas? Also, this only works for continuous variables, I think.

I don’t think you need anything fancy here:

    using Plots
    n = 50
    g1 = [randn() for i = 1:n] # generate the samples

    p = plot(sort(g1), (1:n)./n,
        xlabel = "sample", ylabel = "Probability",
        title = "Empirical Cumulative Distribution", label = "")
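
On the concern about discrete variables: the sort-and-rank plot above follows from the usual definition of the empirical CDF, F(x) = (number of samples ≤ x) / n, which assumes nothing about continuity, so ties are handled too. Here is a minimal sketch in plain Julia; `empirical_cdf` is a hypothetical helper name used only for illustration, not a function from StatsBase or Plots:

```julia
# Hypothetical helper: fraction of samples less than or equal to x.
empirical_cdf(samples, x) = count(s -> s <= x, samples) / length(samples)

samples = [3, 1, 2, 2, 5]           # discrete data with a tie at 2

below  = empirical_cdf(samples, 0)  # no sample is <= 0
at_two = empirical_cdf(samples, 2)  # 3 of the 5 samples are <= 2
above  = empirical_cdf(samples, 5)  # every sample is <= 5
```

With tied values, the sorted-sample plot simply shows a taller vertical jump at the repeated value.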



You could also use the plot(function, lower, upper) method from Plots:

    using Plots, StatsBase
    g1 = randn(100)
    gcdf = ecdf(g1)
    plot(x -> gcdf(x), 0, 5)

There’s also an example (Geom.histogram) in the Gadfly docs showing a sample and theoretical pdf (the latter is done using a layer with function).

x -> gcdf(x) is the same as just gcdf 🙂


Normally yes, but an ECDF object is callable without being a subtype of Function, so plot doesn’t know to dispatch on it as such:

    julia> plot(gcdf, 0, 5)
    ERROR: Cannot convert ECDF{Array{Float64,1},Weights{Float64,Float64,Array{Float64,1}}} to series data for plotting
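
To illustrate the distinction without any plotting package, here is a small sketch. `StepAt` is a hypothetical callable type standing in for ECDF, and the assumption (based on the error above) is that the plotting method dispatches on `Function`:

```julia
# Hypothetical callable type standing in for StatsBase's ECDF.
struct StepAt
    threshold::Float64
end

# Make instances callable, the same way gcdf(x) is callable.
(s::StepAt)(x) = x >= s.threshold ? 1.0 : 0.0

s = StepAt(0.5)
wrapped = x -> s(x)                         # anonymous function wrapping the object

value = s(1.0)                              # calling the object works
object_is_function = s isa Function         # false: StepAt is not <: Function
wrapper_is_function = wrapped isa Function  # true: closures are Functions
```

A method written for `f::Function` accepts `wrapped` but not `s`, which is why the `x -> gcdf(x)` wrapper makes a difference here.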

Is there a way to change the method signature of plot (either Gadfly’s or Plots’) to accommodate function-like objects like ECDF?