Hi, I’m trying to plot an approximation of a probability density function (as a line plot) from a list of samples. Is this supported by Plots.jl?
There is the :stephist series type, but it makes steps rather than a line plot:
using Distributions, Plots
values = rand(Normal(), 100_000);
plot(values, seriestype=:stephist, size=(600,200))
There is also :scatterhist, which is closer to what I want, but I couldn’t find how to make it show a line:
plot(values, seriestype=:scatterhist, linestyle=:solid, size=(600,150))
I can do it manually:
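using StatsBase  # provides fit and Histogram
h = fit(Histogram, values, nbins=100)
r = h.edges[1]  # edges is a tuple holding one range per dimension
x = first(r)+step(r)/2:step(r):last(r)  # bin midpoints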
plot(x, h.weights, size=(600,150))
But is there a way to do it directly with Plots.jl? For simplicity, and to take advantage of Plots.jl’s smart bin selection…
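For reference, here is a minimal sketch of the manual approach with the histogram normalized so that it approximates a density rather than raw counts. It assumes StatsBase's `normalize(h, mode=:pdf)` (which needs `using LinearAlgebra` to be in scope) and `midpoints`; the bin count is the same arbitrary choice as above, not a recommendation.

```julia
using Distributions, StatsBase, LinearAlgebra, Plots

values = rand(Normal(), 100_000)

# Bin the samples, then rescale the weights so the histogram integrates to 1.
h = fit(Histogram, values, nbins=100)
h = normalize(h, mode=:pdf)

r = h.edges[1]            # edges is a tuple; take the range for the only dimension
x = StatsBase.midpoints(r)  # one midpoint per bin

plot(x, h.weights, size=(600,150), label="histogram density")
```

With `mode=:pdf` the y-axis is comparable to a theoretical pdf, which is not the case for the raw counts.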
I think what you are looking for is either a fit (like a Gaussian fit in this case; see one of my older posts: Fitting a 1D distribution using Gaussian Mixtures) or a kernel density estimation: https://github.com/JuliaStats/KernelDensity.jl
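As a rough sketch of the kernel density route, assuming KernelDensity.jl's `kde` with its default kernel and bandwidth (the sample data is just the standard normal from the question):

```julia
using Distributions, KernelDensity, Plots

values = rand(Normal(), 100_000)

# kde returns an object with a grid `x` and the estimated density on that grid.
k = kde(values)

plot(k.x, k.density, size=(600,150), label="KDE")
```

The default bandwidth smooths the estimate, so the curve will look cleaner than a histogram of the same samples.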
This task should not be done by a plotting library, if you ask me.
Thanks for pointing out KernelDensity.jl, that looks quite useful.
Here, however, I’m not trying to fit a model; it’s really about generating random numbers from an arbitrary distribution and showing what the density looks like. It’s for an introductory lesson in probability, and I’d like to keep things as basic as possible (the Julia code in particular should be as pedestrian as possible).
Maybe I’m missing the point, but I would not teach students to artificially smooth the data from a distribution; they should rather learn why it’s not smooth, etc.
…but as said, I don’t know what you are up to.
julia> using StatsPlots
It’s just not the subject of that particular lesson (ideally I would do without sampling, showing “perfect” curves). But I agree with the general point.
Oh, and if by “perfect curves” you mean theoretical pdfs, StatsPlots also has recipes for Distributions:
julia> using Distributions, StatsPlots
StatsPlots.density is perfect for the job!
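For anyone landing here later, a minimal sketch of that call, assuming the same standard normal samples as in the original question (the `size` and `label` values are only illustrative):

```julia
using Distributions, StatsPlots

values = rand(Normal(), 100_000)

# density draws a kernel density estimate of the samples as a line.
p = density(values, size=(600,150), label="estimated density")
```

Note that `density` relies on a kernel density estimate under the hood, so the curve is smoothed compared to a histogram.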
For the theoretical pdfs: yep that’s what I meant (except I’m looking at functions of several random variables, so the distribution is not available in Distributions.jl).
A histogram is already an artificial way of de-smoothing a distribution.
You can also plot a histogram and overlay the density as a line, using only Distributions and Plots:
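using Distributions, Plots
dist = Normal(0, 1)
data = rand(dist, 1000)
# Normalized histogram, so its scale matches the pdf
histogram(data, normalize=:pdf)
# Overlay the theoretical pdf on the current x-range
plot!(x->pdf(dist, x), xlim=xlims())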