Sorting common elements into bins

Hi all,

I was just wondering if any tools already exist for sorting common elements of vectors or sets into bins? For example, for x = [1,2,2,1,3,3,3,3,1], a routine that returns that there are 3 ones, 2 twos, and 4 threes.

Cheers,

Colin

Have a look at fit and Histogram in StatsBase.jl:

http://juliastats.github.io/StatsBase.jl/stable/empirical.html#Histograms-1

Example:

julia> fit(Histogram, x, closed=:left, nbins=3).weights
3-element Array{Int64,1}:
 3
 2
 4

x = [1,2,2,1,3,3,3,3,1]
using StatsBase
countmap(x)
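
For anyone reading later, here is a minimal sketch of what that call gives back for the vector in the question. The variable name c and the sorted printing loop are only for illustration; a Dict has no guaranteed key order, so the keys are sorted before printing.

```julia
using StatsBase

x = [1, 2, 2, 1, 3, 3, 3, 3, 1]

# countmap returns a Dict mapping each distinct value to how often it occurs
c = countmap(x)

# Print the counts in sorted key order
for k in sort(collect(keys(c)))
    println(k, " => ", c[k])
end
# prints:
# 1 => 3
# 2 => 2
# 3 => 4
```

Unlike the histogram approach, this needs no bin edges and works for values that are not evenly spaced.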

Even better, I learned something :)


You can also use the FreqTables package, which will be more convenient if you need the result as an array. Finally, StatsBase also has the (poorly named) counts function for small integer values.
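
As a rough sketch of the counts route, assuming the values are known to lie in 1:3 (the range is passed explicitly here, and the name n is just illustrative):

```julia
using StatsBase

x = [1, 2, 2, 1, 3, 3, 3, 3, 1]

# Count occurrences of each integer in the range 1:3.
# The result is a plain Vector{Int}; entry i is the count of the value i.
n = counts(x, 1:3)
println(n)   # [3, 2, 4]
```

Because the result is indexed by position in the range, this suits small, dense integer values; for sparse or non-integer values a Dict from countmap is probably the better fit.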


Brilliant, that was exactly what I was looking for. Thanks.

Colin

Good to know, thank you.

Cheers,

Colin

If you have lots of these and they are all smaller than 127, try casting them to UInt8 first: countmap has a fast algorithm for counting them.
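
A minimal sketch of that cast, assuming every value fits in a UInt8 (the names x8 and c8 are just for illustration, and no timing is claimed here):

```julia
using StatsBase

x = [1, 2, 2, 1, 3, 3, 3, 3, 1]

# Narrow the element type; UInt8.(x) throws an InexactError if any value
# falls outside 0:255, so an out-of-range input fails loudly.
x8 = UInt8.(x)

# Same call as before, now on a Vector{UInt8}; the keys are UInt8 values
c8 = countmap(x8)
println(c8[UInt8(3)])   # 4
```

The counts are the same as for the original vector; only the key type changes.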

Interesting. In my current use case I can’t guarantee < 127, but that is useful to know.

Cheers,

Colin

Actually, there are fast algorithms for all integer types. They are especially fast for Int8/UInt8 and Int16/UInt16.

Well, 16-bit integers are definitely enough. I’ll look into it.

Cheers and thanks,

Colin