Hi all,
I was just wondering if any tools already exist for sorting common elements of vectors or sets into bins? For example, for x = [1,2,2,1,3,3,3,3,1], a routine that returns that there are 3 ones, 2 twos, and 4 threes.
Cheers,
Colin
Have a look at fit and Histogram in StatsBase.jl:
http://juliastats.github.io/StatsBase.jl/stable/empirical.html#Histograms-1
Example:
julia> fit(Histogram, x, closed=:left, nbins=3).weights
3-element Array{Int64,1}:
3
2
4
x = [1,2,2,1,3,3,3,3,1]
using StatsBase
countmap(x)
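For the x above, countmap returns a Dict mapping each distinct value to its count, i.e. 1 => 3, 2 => 2, 3 => 4. If it helps to see the idea spelled out, here is a minimal sketch of the same counting in plain Julia, without StatsBase. The helper name count_values is hypothetical and only for illustration; it is not how StatsBase implements countmap.

```julia
# Hypothetical helper (not part of StatsBase): count how often each
# distinct element occurs, using a Dict as the set of bins.
function count_values(xs)
    bins = Dict{eltype(xs),Int}()
    for v in xs
        # get returns the current count, or 0 if v has not been seen yet
        bins[v] = get(bins, v, 0) + 1
    end
    return bins
end

x = [1,2,2,1,3,3,3,3,1]
result = count_values(x)   # a Dict; the printed order of the pairs may vary
```

Since a Dict is unordered, use sort(collect(result)) if you need the counts in key order.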
Even better, I learned something.
You can also use the FreqTables package, which will be more convenient if you need the result as an array. Finally, in StatsBase there's also the (poorly named) counts function for small integer values.
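A quick sketch of counts on the example from the question, assuming the values are known to lie in 1:3 (that range is my assumption here; pass whatever range of levels covers your data):

```julia
using StatsBase

x = [1,2,2,1,3,3,3,3,1]

# Count occurrences of each integer in the given range of levels.
# c[i] is the number of times the i-th level (here simply i) occurs in x.
c = counts(x, 1:3)
```

The result is a plain vector rather than a Dict, so it is only practical when the values fall in a reasonably small integer range.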
Brilliant, that was exactly what I was looking for. Thanks.
Colin
Good to know, thank you.
Cheers,
Colin
If you have lots of these and they are all smaller than 127, then cast them to UInt8: countmap has a fast algorithm for counting them.
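Roughly like this, as a sketch on the small example from the question (in practice you would only bother for large vectors, and I have not included any timings here):

```julia
using StatsBase

x = [1,2,2,1,3,3,3,3,1]

# Convert element-wise to UInt8. This throws an InexactError if any
# value does not fit, so it also acts as a check on the data.
x8 = UInt8.(x)

# Same call as before; the keys of the result are now UInt8 values.
cm = countmap(x8)
```

The counts are the same as for the original vector; only the key type changes.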
Interesting. In my current use case I can’t guarantee < 127, but that is useful to know.
Cheers,
Colin
Actually, there are fast algorithms for all integer types. They are especially fast for Int8/UInt8 and Int16/UInt16.
Well, 16-bit integers are definitely enough. I'll look into it.
Cheers and thanks,
Colin