Counting number of occurrences in an array

Hey people,

I’m struggling with the following problem:

I want to count the number of occurrences of each value in an array, so that I can plot the result.
For example,

this is the array: A = [1,2,3,3,4,4,4,6]

and I want to find a way to get something like B = [1 1; 2 1; 3 2; 4 3; 6 1], i.e. a two-dimensional array with one row per value.

I tried it with loops and if conditions, but nothing has worked properly.

Thanks !

Something like this?

A = [1,2,3,3,4,4,4,6]
B = [(i, count(==(i), A)) for i in unique(A)]

Edit: Sorry, that’s not a two-dimensional array…

B = hcat([[i, count(==(i), A)] for i in unique(A)]...)
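
If you would rather have one row per value, as in the original question, here is a minimal sketch using only Base. The names vals and counts are my own, and sorting the values is my addition so that the rows come out in increasing order:

```julia
A = [1, 2, 3, 3, 4, 4, 4, 6]

# Distinct values, sorted so the rows appear in increasing order
vals = sort(unique(A))

# One count per distinct value (this rescans A each time, fine for small arrays)
counts = [count(==(v), A) for v in vals]

# Two columns: the value, then its count (a 5×2 Matrix{Int} for this input)
B = hcat(vals, counts)
```

B[:, 1] and B[:, 2] can then be handed to a plotting function as x and y.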

Hi! An efficient way to do this is to use the counter function from DataStructures:

julia> using DataStructures

julia> c = counter([1,2,3,3,4,4,4,6])
Accumulator{Int64,Int64} with 5 entries:
  4 => 3
  2 => 1
  3 => 2
  6 => 1
  1 => 1

You can then use keys(c) and values(c) to plot the number of occurrences.

Thank you, Chris!
This is actually what I’ve been looking for.
I still haven’t quite managed to plot the keys and values, since they are dict types, but I’m on it!

You’re welcome. You may have to collect the keys and values? I’ve found that some plotting libraries get upset if you pass them iterables and it can be necessary to create an array instead.
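
To make the collect step concrete, here is a rough sketch that uses only Base. The plain Dict named tally is just my stand-in for whatever counting container you end up with, and xs and ys are made-up names:

```julia
A = [1, 2, 3, 3, 4, 4, 4, 6]

# Plain Dict used as a tally (a stand-in for any dict-like counter)
tally = Dict{Int,Int}()
for a in A
    tally[a] = get(tally, a, 0) + 1
end

# keys(tally) is an iterator in no particular order;
# collect turns it into a Vector, and sort puts it in increasing order
xs = sort(collect(keys(tally)))

# Look the counts up in that same order so xs and ys line up
ys = [tally[x] for x in xs]
```

Building ys from the sorted keys, rather than collecting values separately, keeps each count paired with its value after sorting.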

With collect it works, but the plot doesn’t come out as a proper histogram…
I’m working in JuliaBox and tried to use Plotly, but somehow that doesn’t work either…

It feels like I’ll need to spend ages until I get comfortable with Julia, lol.

Oh, if you want a histogram, you want a plotting tool designed for histograms. I’m not sure about plotly, but in PyPlot it’s hist.

IIRC StatsBase.countmap is faster
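
For reference, a minimal sketch of how that might look, assuming the StatsBase package is installed. The names cm, xs and ys are only for illustration:

```julia
using StatsBase  # assumption: StatsBase is installed in the current environment

A = [1, 2, 3, 3, 4, 4, 4, 6]

# countmap returns a Dict mapping each distinct value to its count
cm = countmap(A)

# Dict iteration order is not guaranteed, so sort the keys before plotting
xs = sort(collect(keys(cm)))
ys = [cm[x] for x in xs]
```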

Thanks, it looks like countmap is a good option. I’ve been wanting a stdlib function for this for a while, so it will be nice to get it once StatsBase functionality (presumably) gets merged into Statistics.

Elsewhere @andyferris has just pointed out to me that SplitApplyCombine has a much more general version of this operation as groupreduce. And there’s a specific special case called groupcount which does what you want.

It seems groupcount and countmap are basically the same, though countmap offers some algorithm choices.

Good point, I’ll work it out - thank you!