First-class histogram object and plotting in Plots.jl

I have a dataset with a large number of points and would like to plot histograms that are costly to compute. I would prefer to save the computed histogram using JLD2 and then load it in a separate script for plotting. E.g. (an MWE; this one is of course trivially fast):

using JLD2
x = randn(100)
f(x) = x^2
# is there a library which has a representation for a histogram,
# independently of its plot ...?
h = histogram(f.(x))
@save "data.jld2" h

then

using Plots; gr()
using JLD2
@load "data.jld2"
plot(h) # ... that I could then plot?

Simply:

julia> using StatsBase

julia> h = fit(Histogram, f.(x))
StatsBase.Histogram{Int64,1,Tuple{StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}}}
edges:
  0.0:2.0:14.0
weights: [837, 115, 36, 11, 0, 0, 1]
closed: left
isdensity: false

julia> plot(h)

should work (there is a recipe for plotting Histogram objects created with StatsBase). I know it sounds silly, but with Plots it's kind of impossible to know all the recipes that are implemented. As a rule of thumb, I generally try something intuitive and end up either very happy with the result or opening a bug report :slight_smile:
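To tie this back to the original question, here is a minimal sketch of the full round trip, assuming StatsBase and JLD2 are installed. The temporary file path and the name `h_loaded` are only illustrative, and the two "scripts" are shown in one block for brevity. StatsBase has to be loaded in the reading script as well, since JLD2 needs the `Histogram` type to be defined in order to reconstruct it.

```julia
using StatsBase   # provides the Histogram type and fit
using JLD2        # provides @save and jldopen

# --- script 1: compute the (potentially expensive) histogram and save it ---
x = randn(100)
f(x) = x^2
h = fit(Histogram, f.(x))

# Illustrative location; any writable path works.
path = joinpath(mktempdir(), "data.jld2")
@save path h

# --- script 2: load the stored histogram again ---
# (in a separate script, repeat `using StatsBase, JLD2` first)
h_loaded = jldopen(path, "r") do file
    file["h"]   # @save stores the variable under its own name
end

# The bin edges and counts are plain fields of the object.
println(h_loaded.edges[1])
println(h_loaded.weights)

# With Plots loaded, the recipe mentioned above should then apply:
# using Plots; gr()
# plot(h_loaded)
```

Only the bin edges and counts end up in the file, so the plotting script never needs the original data.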

As a more general solution, you can use the HDF5 backend:

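using Plots
hdf5()  # select the HDF5 backend
p = histogram(randn(100000))
Plots.hdf5plot_write(p, "myhistogram.hdf5")  # write the whole plot to disk
# then, e.g. in a separate script:
using Plots
gr()  # switch to a backend that can display the plot
pread = Plots.hdf5plot_read("myhistogram.hdf5")
display(pread)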

Oh, and regarding "opening a bug report": to be fair, you generally end up opening a PR :slight_smile: