Hi all,
I am excited to announce that I have just registered Tally.jl, a small package for computing tallies, also known as frequency counts or bar charts (without the chart). It will tally any iterable object you throw at it:
julia> T = tally(["x", "x", "y", "x"])
Tally with 4 items in 2 groups:
"x" | 3 | 75%
"y" | 1 | 25%
julia> T = tally(rand(-1:1, 10, 10)) # a random 10x10 matrix with entries in [-1, 0, 1]
Tally with 100 items in 3 groups:
-1 | 37 | 37%
0 | 36 | 36%
1 | 27 | 27%
There is also some basic plotting functionality for the REPL (UnicodePlots.jl is amazing):
julia> T = tally(rand(-1:1, 10, 10));
julia> Tally.plot(T)
┌ ┐
1 ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 38 38%
0 ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 34 34%
-1 ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■ 28 28%
└ ┘
(if this is mangled due to non-monospace font, have a look at the README)
But one can also feed it into a Plots.bar plot.
One feature that is interesting for my applications is the ability to count with respect to an arbitrary equivalence relation. In a nutshell: instead of considering elements x, y equal when ==(x, y) is true, elements are considered equal when equivalence(by(x), by(y)) is true, where by and equivalence are functions provided by the user.
julia> v = 1:100000;
julia> tally(v, by = x -> mod(x, 3))
Tally with 100000 items in 3 groups:
[1] | 33334 | 33.334%
[3] | 33333 | 33.333%
[2] | 33333 | 33.333%
julia> v = ["abb", "ba", "aa", "ba", "bbba", "aaab"];
julia> tally(v, equivalence = (x, y) -> first(x) == first(y) && last(x) == last(y))
Tally with 6 items in 3 groups:
[ba] | 3 | 50%
[abb] | 2 | 33%
[aa] | 1 | 17%
(Most of these could be reduced to a tally(map(..., v)), but there are applications where this is not possible. These are quite frequent when doing algebra/number theory on a computer, but are too involved to reproduce here.)
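To make the idea of counting up to an equivalence relation concrete, here is a minimal sketch in plain Julia. It is not how Tally.jl is implemented; naive_tally is a made-up name used only for illustration, and the linear search over class representatives is simply the most direct approach that works for an arbitrary user-supplied relation.

```julia
# Hypothetical helper illustrating the idea; NOT part of Tally.jl.
# Elements x and y land in the same group when equivalence(by(x), by(y)) is true.
function naive_tally(itr; by = identity, equivalence = isequal)
    reps = Any[]    # one representative per equivalence class, in order of first appearance
    counts = Int[]  # counts[i] is the number of elements equivalent to reps[i]
    for x in itr
        # Linear search for an existing class that x belongs to.
        i = findfirst(r -> equivalence(by(r), by(x)), reps)
        if i === nothing
            push!(reps, x)
            push!(counts, 1)
        else
            counts[i] += 1
        end
    end
    # Report classes by decreasing count.
    p = sortperm(counts, rev = true)
    return collect(zip(reps[p], counts[p]))
end

v = ["abb", "ba", "aa", "ba", "bbba", "aaab"]
same_ends(x, y) = first(x) == first(y) && last(x) == last(y)
result = naive_tally(v, equivalence = same_ends)
```

For the vector v above, this yields the same groups and counts as the Tally.jl output shown earlier: "ba" with 3, "abb" with 2 and "aa" with 1. Because each new element is compared against every representative seen so far, the cost grows with the number of classes, which is the price of not assuming anything about the relation beyond it being an equivalence.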
There is no dedicated documentation, since everything fits neatly in the README.
I have written similar functionality quite often in the past, so I thought I might as well turn it into a small package, in the hope that someone else also finds it useful. I am pretty sure that similar functionality exists somewhere else, probably in some statistics package. But I don’t speak statistics and all I want is to tally.
P.S.: I forgot to mention one more thing. To impress people you can also do an animation using a “lazy” tally: