So this might be a really stupid question and I have the feeling that it will be. However, I have to ask it because it doesn’t make sense to me.
I have installed and loaded Julia v0.6.2 on my Ubuntu Linux 17.10 box. I run the following code in the REPL:
using StatsBase
?hist
and I get this as output
Couldn’t find hist
Perhaps you meant hist, Chisq, hash, hcat, FDist, TDist, fit, Cint or Dict
No documentation found.
Binding hist does not exist.
So it’s telling me hist couldn’t be found but instead recommends to me that I might be looking for hist?
Follow up question:
I’m trying to achieve the Julia equivalent of the following R code:
plot(hist(rnorm(100)))
What is the Julia way of achieving this?
I think this is because hist is exported by StatsBase, but not defined within the module.
In StatsBase you use fit(Histogram, randn(100)) but this does not produce the actual plot (only calculates the bins and weights).
The simplest way to get the plot is:
using StatPlots
histogram(randn(100))
AFAIK hist was removed from StatsBase (fit should be used).
Thank you. I’ll give that a whirl!
So if this is the way to get a Histogram then what does the hist function do? Is it something completely different?
Right, I just meant that the export hist should be removed
right - I misread your post
It gives you the histogram data but does not plot it:
julia> fit(Histogram, randn(100))
StatsBase.Histogram{Int64,1,Tuple{StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}}}
edges:
-3.0:1.0:4.0
weights: [3, 11, 39, 33, 11, 2, 1]
closed: left
isdensity: false
(and the hist export in StatsBase is a bug that should be fixed, as @fredrikekre indicated)
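For anyone who wants to work with the fitted object directly rather than just print it, here is a minimal sketch. It assumes the edges and weights fields of the Histogram type and the nbins keyword of fit; the variable names are only illustrative.

```julia
using StatsBase  # provides the Histogram type and the fit method for it

x = randn(100)                     # sample data
h = fit(Histogram, x, nbins = 10)  # nbins is a hint; the final bin count may differ

edges   = h.edges[1]   # bin edges along the first (and only) dimension
weights = h.weights    # number of samples that fell into each bin

# Every bin has a left and a right edge, so there is one more edge than weight.
println(length(edges), " edges, ", length(weights), " bins")
println(weights)
```

This is roughly the counterpart of the breaks and counts that hist() returns in R, which is why the plotting step is separate.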
To be exact, you just need Plots (StatPlots is a superset of Plots that includes DataFrames support and some extra recipes useful for plotting with data; but for most users it’s recommended to just use StatPlots rather than Plots anyway).
If you do want to fit a histogram with StatsBase and then plot it, you can do that too.
using Plots, StatsBase
h = fit(Histogram, randn(10000))
plot(h)
But the Plots histogram call is slightly more featured in that you can use the :scott, :rice, :fd (Freedman-Diaconis, the default) and :sturges algorithms to get the bin breaks (StatPlots also offers :wand, which gives a better fit to a density function but does not align breaks neatly with integer values). StatsBase only uses :sturges.
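As a rough sketch of how the binning rule is selected, the histogram call takes a bins keyword. This assumes Plots with a working backend installed; passing an integer to bins is shown as an alternative and is treated as an approximate bin count.

```julia
using Plots  # StatPlots builds on the same plotting API

x = randn(1000)

p1 = histogram(x)                  # default binning rule
p2 = histogram(x, bins = :scott)   # pick a bin-width rule by symbol
p3 = histogram(x, bins = 20)       # or ask for roughly 20 bins

# Put the three variants side by side for comparison.
p = plot(p1, p2, p3, layout = (1, 3), legend = false)
```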
Very informative. Thanks for the explanation. Indeed I’m searching for the differences in Julian vs R thinking. After 20 years of C++ and R I’m a little set in my ways. Loving what I’ve seen so far though.