Weird-looking histogram

Dear all,
Can someone help me understand why I get this histogram?
agent_adults_last is a 1236145-element Vector{Float64} with values between 7.0 and 25.0.
This has never happened to me before.

It seems to be a limit tied to large sample sizes, because this works:


gr()
histogram(agent_adults_last.Lw[1:1000000],
    bins=30
)

Thank you.

Is your issue that it makes a line plot instead of bars? I’ve noticed that before, but I’m not sure if it’s a bug or a feature…

One solution is to build the plot yourself, which gives you full control over how it is made:

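using StatsBase, Plots

data = randn(100_000)
bins = LinRange(-3, 3, 100)   # 100 edges, i.e. 99 bins

# Count how many samples fall into each bin
h = fit(Histogram, data, bins)

# Put each bar at the center of its bin rather than at its left edge
e = h.edges[1]
x = (e[1:end-1] .+ e[2:end]) ./ 2
y = h.weights

bar(x, y)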

I believe that’s a feature, although I can’t find where it’s defined, as the whole Plots machinery is very opaque to me.

I think the point is simply that the number of bins by default scales with the number of observations, and at some point you are basically getting a density plot rather than a histogram. That point is 1 million observations:

julia> histogram(randn(1_000_000))

julia> histogram(randn(1_000_001))

You can also see the benefit of this here: with lots of bins, the bin edges take up more of the plot than the bars themselves (although this is exaggerated here because I use linewidth = 2 by default).

Of course this isn’t all that sensible if you set the number of bins yourself…
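
If you want bars regardless of sample size, one option is to ask for the bar-style series explicitly rather than going through histogram. This is only a sketch: it assumes the automatic switch depends on the number of observations alone, as described above, and the data here is a synthetic stand-in for the original vector.

```julia
using Plots

# Synthetic stand-in for the original data: same length, values in [7, 25)
data = 7 .+ 18 .* rand(1_236_145)

# Request the bar-style histogram directly instead of relying on `histogram`
p_bars = barhist(data; bins = 30)

# The step-style histogram can also be requested directly, at any sample size
p_steps = stephist(data[1:1000]; bins = 30)
```

Both calls accept the same bins keyword as histogram, so the rest of the plotting code can stay unchanged.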


Basically, your histogram got transformed into a stephist (see the Histograms page of the Plots documentation). I think it’s actually way cleaner than the usual filled-bars histogram when you have a lot of data, like from simulations. For me, it’s especially useful when I want to plot multiple, potentially overlapping histograms. With the usual bars the plot quickly becomes cluttered: everything overlaps, the bar edges obscure the actual bars, and so on.
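
As a rough illustration of the overlapping case, here is a minimal sketch with two made-up samples (the names a and b and the shift of 1.0 are just for the example):

```julia
using Plots

# Two synthetic samples whose distributions overlap
a = randn(100_000)
b = randn(100_000) .+ 1.0

# Draw both as step histograms on the same axes
p = stephist(a; bins = 50, label = "a")
stephist!(p, b; bins = 50, label = "b")
```

Because only the outlines are drawn, both distributions stay visible where they overlap.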

If you’re working with lots of histograms, allow me to introduce you to FHist.jl (see the Makie Plotting page of its documentation).
