# Frequency counts on a square lattice

I have a square array, `c`, with integer entries that I expect to repeat. The entries are category labels, so I would like to get frequency counts of the distinct values. What is the most efficient way to do this? I know that I could flatten the array and use `DataFrames.jl`, but I have to do this many times, so I’m concerned about introducing unnecessary overhead through those conversions (square array to flat array to data frame).

Something like this?

``````
julia> using StatsBase

julia> M = rand(1:10, 10, 10)
10×10 Array{Int64,2}:
8  8  6   7   5  9   5  10  5   7
5  1  7   3  10  9   8   4  8   2
2  2  3   9   2  7   9   4  8   7
4  6  8   3   6  2  10   5  3   6
8  7  7   6   3  8   1   4  6   6
6  3  5   5   9  6   7   1  7   5
1  4  7   9   5  8   4   2  5   1
6  8  6   7   3  5   1   2  8  10
6  2  9   7   3  6   7   2  6   2
5  9  9  10   4  2   6   7  9   1

julia> StatsBase.countmap(vec(M))
Dict{Int64,Int64} with 10 entries:
7  => 14
4  => 7
9  => 10
10 => 5
2  => 11
3  => 8
5  => 12
8  => 11
6  => 15
1  => 7
``````

Works for me.

Note that if you know in advance that you have a limited set of entries, e.g. values in `1:10`, then you can do much better than `countmap` just by allocating an array of counts and incrementing it as you iterate through your data. For your example above, I get a speedup of more than a factor of 5:

``````
julia> using BenchmarkTools

julia> function countmap10(M)
           counts = zeros(Int, 10)  # one slot per possible value in 1:10
           for x in M
               counts[x] += 1       # the value itself is the index
           end
           return counts
       end

julia> @btime StatsBase.countmap(vec($M))
628.174 ns (8 allocations: 1.70 KiB)
Dict{Int64,Int64} with 10 entries:
7  => 14
4  => 7
9  => 10
10 => 5
2  => 11
3  => 8
5  => 12
8  => 11
6  => 15
1  => 7

julia> @btime countmap10($M)
115.560 ns (1 allocation: 160 bytes)
10-element Array{Int64,1}:
7
11
8
7
12
15
14
11
10
5
``````

Should `countmap` be able to take an optional `AbstractArray` giving the set of possible values?
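To make the idea concrete, here is a minimal sketch of what passing the possible values could look like when they form a contiguous integer range. The name `counts_in` is made up for illustration and is not a StatsBase function; it only generalizes `countmap10` from `1:10` to an arbitrary range.

```julia
# Hypothetical helper, not part of StatsBase: count how often each value of
# `levels` occurs in `A`. Assumes every entry of `A` lies inside `levels`;
# an entry outside the range triggers a BoundsError.
function counts_in(levels::AbstractUnitRange{<:Integer}, A)
    counts = zeros(Int, length(levels))
    offset = first(levels) - 1      # shift values so first(levels) maps to index 1
    for x in A
        counts[x - offset] += 1
    end
    return counts
end

counts_in(0:3, [0 1; 3 3])  # [1, 1, 0, 2]
```

The result is positional: `counts[i]` is the count for the `i`-th element of `levels`, so values that never occur show up as zeros rather than being absent as they would be in a `Dict`.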


It seems like there should be a `countmap!(counts, array)` function that takes any `counts` object supporting `getindex/setindex!` (e.g. a `Dict` or an array or some other data structure).
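A rough sketch of what such a function might look like, using only Base. The definition below is an assumption about the proposed behavior, not an existing API: it relies on `get` with a default, which works for both `Dict`s (missing key) and arrays (index out of bounds), followed by `setindex!`.

```julia
# Hypothetical in-place counter: `counts` can be anything that supports
# `get(counts, key, default)` and `setindex!`, e.g. a Dict or a Vector.
function countmap!(counts, array)
    for x in array
        counts[x] = get(counts, x, 0) + 1
    end
    return counts
end

M = [1 2; 2 3]
countmap!(Dict{Int,Int}(), M)   # Dict with 1 => 1, 2 => 2, 3 => 1
countmap!(zeros(Int, 3), M)     # [1, 2, 1]
```

With a preallocated vector the caller can reuse the same `counts` buffer across many arrays (after resetting it with `fill!(counts, 0)`), which avoids the per-call allocation. A vector still throws a `BoundsError` on assignment if a value falls outside its index range.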


If you have too much data to fit into memory, I really recommend OnlineStats.jl’s `CountMap` :). https://github.com/joshday/OnlineStats.jl

https://joshday.github.io/OnlineStats.jl/latest/api/#OnlineStatsBase.CountMap
