# Getting bins from Plots.jl histogram

I can make a histogram plot by

``````
histogram(rand(1000))
``````

Can I get the values of the resulting binning?

Check StatsBase.jl:

``````
using StatsBase, Plots
h = fit(Histogram, rand(100), nbins=10)
plot(h)
``````

Then access the bin counts with `h.weights` (the bin edges are in `h.edges`):

``````
julia> h
Histogram{Int64, 1, Tuple{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}}}}
edges:
0.0:0.1:1.0
weights: [7, 9, 9, 15, 12, 6, 10, 18, 6, 8]
closed: left
isdensity: false
``````
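
To make the answer above concrete, here is a minimal sketch of pulling the numbers out of the fitted object. It assumes explicit edges `0.0:0.1:1.0` (my choice, so the bin count is predictable) and the names `counts`, `edges` and `centers` are only illustrative.

```julia
using StatsBase

data = rand(100)
h = fit(Histogram, data, 0.0:0.1:1.0)    # explicit edges: ten bins of width 0.1

counts  = h.weights                      # number of samples in each bin
edges   = h.edges[1]                     # bin boundaries along the first (only) dimension
centers = edges[1:end-1] .+ step(edges) / 2   # one center per bin

for (c, n) in zip(centers, counts)
    println("bin centered at ", c, ": ", n, " samples")
end
```

Since `rand` returns values in `[0, 1)`, every sample should land in one of the ten bins, so the counts should add up to the number of samples.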

I recall from an SO question I answered some time ago that the bin construction is different in Plots and StatsBase (histogram2d - Return the frequency in a bin of a 2D histogram in Julia - Stack Overflow). Not sure that was ever aligned.


@nilshg, thanks for the link. For 2D histograms, the StatsBase functionality seems to be more limited, with no user control over the 2D bins (at least none that is documented).

``````
using StatsBase, Plots

# computing 2D histograms
data = (randn(10_000), randn(10_000))
h = fit(Histogram, data, nbins=40)  # how to finely control the size of the 2D bins?
# h.edges is a tuple with one range per dimension; take the bin centers from each
x = diff(h.edges[1])/2 .+ h.edges[1][1:end-1]
y = diff(h.edges[2])/2 .+ h.edges[2][1:end-1]
heatmap(x, y, h.weights)
``````

Yes, they don’t do the same thing all the time.

@OvidiusCicero, please provide a 1D example where the problem arises; it would be useful.

It’s difficult to provide the simulation data, but I need to manually set `nbins` to get the same number of bins (this makes a pipeline more difficult).

At the moment I plot the result of `fit` with `bar`.

I could not find obvious problems but did not try hard enough.
See example of perfect overlap below:

``````
using StatsBase, Plots
data1d = rand(100)
histogram(data1d, bins=10, label="histogram plot", legend=:topleft, ylims=(0,20))
h = fit(Histogram, data1d, nbins=10)
plot!(h, seriestype=:steps, lw=3, lc=:blue, label="StatsBase histogram")
savefig("histogram2d_vs_StatsBase_histogram.png")
``````

You can get 2D histograms with GMT

``````
using GMT

# Compute a grid of counts
G = blockmean(rand(100,2) * 100, region=(0,100,0,100), inc=10, npts=:n, grid=true);
# Plot it
bar3(G, fill=[0,115,190], lw=0.25, fmt=:png, show=true)
# Convert to x,y,z. Empty cells would have NaNs, the *skip_NaN* option takes care of it
D = grd2xyz(G, skip_NaN=true);
# Data is in the *data* field
D.data
65×3 Matrix{Float64}:
20.0  100.0  1.0
30.0  100.0  1.0
50.0  100.0  1.0
70.0  100.0  1.0
90.0  100.0  1.0
0.0   90.0  2.0
...
``````


Sorry, something is not right. It’s not giving the counts.

EDIT: Now it is, but I still have to see why `npts=true` alone did not work as it should.


I can’t make a minimal working example out of it, but the default binning of Plots.jl is different from that of StatsBase, so you need to explicitly set `bins` (Plots) and `nbins` (StatsBase) to get the same results.

Otherwise, the selected solution works
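
For a pipeline, one way to sidestep the differing defaults might be to fix the edges once and hand the same range to both packages. This is only a sketch: the edge range below is an arbitrary choice of mine, and the Plots call is left as a comment (as far as I know `bins` also accepts a vector or range of edges).

```julia
using StatsBase

data  = rand(1000)
edges = 0.0:0.05:1.0                 # hypothetical choice: 20 bins of width 0.05

h = fit(Histogram, data, edges)      # StatsBase bins on exactly these edges

# With Plots loaded, the same edges can be passed to the plot:
#   histogram(data, bins=edges)

println(length(h.weights), " bins, ", sum(h.weights), " samples")
```

With the edges pinned like this, neither package has to guess a bin count, so the two results should no longer depend on their default binning rules.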


Thank you. Is it however usual in Julia that the returned data structures are so “bloated”?

In order to get to the numerical values of the bins, I had to resort to typing

``````
collect(h.edges)
``````

after discovering that `h.edges` is a 1-element Tuple containing a range. And yet, it is not even the same length as `h.edges`.

@mgiugliano, the developers of StatsBase will have the correct answer to your question.
As a user, I can only remark that, according to the source code annotation, the edges are an iterator that contains the boundaries of the bins in each dimension, which for the 1D case corresponds to objects like (example): `(0.0:0.1:1.0,)`. But I do not know why it has been defined this way.
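
A small sketch of what that tuple looks like in practice, assuming explicit edges picked only for illustration. My reading (not an official statement from the StatsBase developers) is that the tuple holds one range per dimension, which is why the 1D case ends up as a 1-tuple, and why there is always one more edge than there are bins.

```julia
using StatsBase

# 1D: `edges` is a 1-tuple holding a single range
h1 = fit(Histogram, rand(100), 0.0:0.1:1.0)
e1 = collect(h1.edges[1])            # plain Vector{Float64} of the boundaries

# 2D: `edges` holds one range per dimension
h2 = fit(Histogram, (rand(100), rand(100)), (0.0:0.25:1.0, 0.0:0.5:1.0))

println(length(e1), " edges for ", length(h1.weights), " bins")
println(length.(h2.edges), " edges for ", size(h2.weights), " bins")
```

Indexing the tuple first (`h.edges[1]`) and then calling `collect` gives the numerical boundaries directly.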


I don’t know why it was defined that way either, but in contrast to “bins,” which appears to be treated as a suggestion, passing edges (as a tuple of “StepRangeLen” ranges to the StatsBase Histogram “fit”) fixes the bin size, which is a nice feature. It can also be used in two dimensions, which you mentioned in the post above from 3 Feb 2021, e.g.,

``````
firstEdge = 0.0
lastEdge = 10.0
binSize = 1.0
EdgeRange = (firstEdge:binSize:lastEdge)
h = fit(Histogram, (x, y), weights(W), (EdgeRange, EdgeRange))

``````

Did you find that if you first plot just h directly with `plot(h)`, and compare it to using `h.weights` as input to `heatmap` (again the 3 Feb post), x and y are flipped? Seems crazy, but I had to use `transpose(h.weights)` to get the same orientation in the heatmap. Am I missing something? I’m using Julia 1.7.0.
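
One way to check the orientation without plotting is to bin a single known point and see where the count lands. The sketch below uses made-up data (one sample) and unit-width edges chosen for illustration; the remark about `heatmap` reflects my understanding that Plots reads the rows of the matrix as y and the columns as x.

```julia
using StatsBase

# A single sample with x in [0, 1) and y in [2, 3)
x = [0.5]
y = [2.5]
edges = 0.0:1.0:3.0                  # three unit-width bins per axis

h = fit(Histogram, (x, y), (edges, edges))

# The count sits at row 1 (x bin) and column 3 (y bin)
println(findall(!iszero, h.weights))

# heatmap(xs, ys, z) treats z[row, col] as (y, x), so the matrix would
# need transposing to put x on the horizontal axis, e.g.:
#   heatmap(xc, yc, permutedims(h.weights))
```

So `h.weights[i, j]` is indexed as (x bin, y bin), which would explain why the raw matrix looks flipped in `heatmap` while `plot(h)` does not.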