I am wondering whether the naming of StatsBase and Statistics is entirely logical. StatsBase, despite its name, is the “more optional” package: Statistics is included in the base installation, while StatsBase needs to be Pkg.added. Maybe they are misnamed?!
StatsBase precedes the reorganization of Base code to Statistics. It is still the basic statistics package for things that go beyond the standard library.
StatsBase is most likely going to be moved in part to Statistics, and in part to other packages (like StatsModels). See e.g. https://github.com/JuliaLang/julia/pull/27152.
If I understand correctly, Julia will be evaporating StatsBase, so its misleading name will disappear. This is a good choice IMHO.
Almost 3 years later, I wonder if there are any updates on the future of Statistics. It was very confusing to me when I first started using Julia last year: for example, if I only imported StatsBase, I needed to write StatsBase.std([1,2,3]) to find the standard deviation, and the reason is of course to avoid conflict with Statistics. I imagine it will also be a point of confusion for many other newcomers.
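For comparison, here is a minimal sketch of the standard-library route, which needs no extra package and no module prefix. The variable names `x` and `s` are only illustrative.

```julia
using Statistics  # standard library, no Pkg.add required

x = [1, 2, 3]
s = std(x)        # sample standard deviation (n - 1 in the denominator)
println(s)        # prints 1.0
```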
Not if they come from Python
For people coming from Python there is probably never even a need to load ‘Statistics’, as it appears that all the functionality in ‘Statistics’ can be found in ‘StatsBase’ or ‘Distributions’.
‘Statistics’ is too small for their taste.
This is not strictly correct: StatsBase.std is Statistics.std. The reason is that StatsBase imports Statistics (but does not re-export it), so std is in the namespace of StatsBase.
    julia> using StatsBase

    julia> @which StatsBase.std
    Statistics
So there is no need to distinguish between StatsBase.std and Statistics.std. They are the same function.
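One way to check this programmatically is to compare the two bindings for identity and ask which module owns the function. This is only a sketch, assuming StatsBase has been installed with Pkg.add; `same_function` and `owner` are illustrative names, not part of either package.

```julia
using Statistics
import StatsBase  # assumes StatsBase is installed; brings in only the module name

# Both qualified names should refer to one and the same function object.
same_function = StatsBase.std === Statistics.std

# parentmodule reports the module in which the generic function was defined.
owner = parentmodule(StatsBase.std)

println(same_function)
println(owner)
```

If both names point at the same object, any method StatsBase adds to std is also reachable through Statistics.std.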
There is no deep reason why StatsBase still exists in its current form and we haven’t put more things into Statistics. It’s simply a very tedious process that requires a lot of time and effort.
I think one of the reasons is the implicit cost of committing to maintenance and to a specific API, which comes with a standard library’s version being tied to Julia’s. See this summary here.
While arguably some functionality could be moved from StatsBase to Statistics, it is unlikely that the API for the whole package is that finalized and/or belongs in a standard library. OTOH, there is no pressing need to move anything; after all, there is a package and people can just use it.