# Optimizing counting the number of occurrences of a given number in an array

Is there a way to optimize the `counter` function below? I tried using LoopVectorization, but it complained: `ERROR: LoadError: Don't know how to handle expression. if i == element_of_interest`.

``````
using StatsBase       # for countmap
using BenchmarkTools  # for @btime

arr = rand(1:5, 100)

given = 4

@btime countmap($arr)
# 684.667 ns (6 allocations: 1.50 KiB)

function counter(x, element_of_interest)
    count = 0
    for i in x
        if i == element_of_interest
            count += 1
        end
    end
    return count
end

@btime counter($arr, $given)
# 17.635 ns (0 allocations: 0 bytes)
``````

Try it without the `if`, using `count += i == element_of_interest`.
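The suggestion above can be sketched as a plain-Julia, branchless variant of `counter` (no packages needed; `counter_branchless` is a name chosen here for illustration):

```julia
# Branchless variant of `counter`: the comparison yields a Bool,
# which adds 0 or 1 when accumulated into an Int, so there is no
# branch inside the loop for the compiler to worry about.
function counter_branchless(x, element_of_interest)
    c = 0
    for i in x
        c += i == element_of_interest
    end
    return c
end
```

Removing the branch gives the compiler a straight-line loop body, which is the form it can auto-vectorize.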

``````
using LoopVectorization  # for @turbo

function counter_turbo(x, element_of_interest)
    count = 0
    @turbo for i in eachindex(x)
        count += x[i] == element_of_interest
    end
    return count
end

@btime counter_turbo($arr, $given)
# 17.635 ns (0 allocations: 0 bytes)
``````

The timings of `counter` and `counter_turbo` are eerily similar. Checking the `@code_native` of `counter`, it seems the compiler has already vectorized it.


Also,

``````
count(==(given), arr)
``````

is around the same speed and is a standard library function.


TIL.

Every day I learn something new. Thank you.

Speaking of things learned, I want to note here (as a reminder to myself) an algorithm discovered (by Dan?) in the DataFrames code, during a recent discussion about the efficiency of the `groupby()` function.

Applying it to the `countmap` case gives excellent results:

``````
function countmap2(v)
    m, M = extrema(v)
    res = fill(0, M - m + 1)
    for i in eachindex(v)
        res[v[i] - m + 1] += 1
    end
    Pair.(m:M, res)
end

julia> arr
15-element Vector{Int64}:
  5
  0
  3
  1
 -1
  4
 -1
 -3
 -3
 -3
 -2
 -3
  2
  1
  2

julia> @btime countmap2($arr)
  62.653 ns (2 allocations: 336 bytes)
9-element Vector{Pair{Int64, Int64}}:
 -3 => 4
 -2 => 1
 -1 => 2
  0 => 1
  1 => 2
  2 => 2
  3 => 1
  4 => 1
  5 => 1

julia> @btime countmap($arr)
  222.154 ns (5 allocations: 720 bytes)
Dict{Int64, Int64} with 9 entries:
  0  => 1
  4  => 1
  5  => 1
  -1 => 2
  2  => 2
  -3 => 4
  -2 => 1
  3  => 1
  1  => 2

``````

How well does it work on `arr = [0, 1000, 10^14]`?
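To make the concern concrete: `countmap2` sizes its dense table from the extrema, so just three values spanning 10^14 would demand roughly 10^14 Int64 counters. A quick sketch of the arithmetic:

```julia
# countmap2 allocates one Int counter for every value between the
# extrema, so widely spread values make the table astronomically large:
arr_wide = [0, 1000, 10^14]
m, M = extrema(arr_wide)
slots = M - m + 1   # number of counters countmap2 would allocate
```

Here `slots` is 10^14 + 1, i.e. on the order of 800 TB of zeroed Int64s for an array of three elements.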

How, if at all, can one find an optimal value for `ts`?

``````
using StatsBase  # for the countmap fallback

function countmap3(v, ts)
    m, M = extrema(v)
    if (M - m) / length(v) > ts
        return countmap(v)
    else
        res = fill(0, M - m + 1)
        for i in eachindex(v)
            res[v[i] - m + 1] += 1
        end
        return Pair.(m:M, res)
    end
end
``````
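To sanity-check the fallback logic without depending on StatsBase, here is a self-contained sketch where the sparse branch is a hand-rolled `Dict`-based count (an assumption standing in for `StatsBase.countmap`; `countmap3_sketch` is a name chosen here):

```julia
# Self-contained variant of countmap3: the sparse branch counts into a
# plain Dict instead of calling StatsBase.countmap.
function countmap3_sketch(v, ts)
    m, M = extrema(v)
    if (M - m) / length(v) > ts
        # sparse path: range is wide relative to the data, use a Dict
        d = Dict{eltype(v), Int}()
        for x in v
            d[x] = get(d, x, 0) + 1
        end
        return d
    else
        # dense path: one counter per value in m:M
        res = zeros(Int, M - m + 1)
        for x in v
            res[x - m + 1] += 1
        end
        return Pair.(m:M, res)
    end
end
```

With `ts = 0.0` every input takes the Dict path; with a large `ts` the dense path is used, so the two branches can be compared directly on the same data.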

Perhaps the strategy used to establish when it is better to use a sparse array instead of a dense array could be applicable in this case too?

Calculating the `extrema` requires scanning through `v`, which isn't so cheap to begin with. There are tradeoffs. `countmap` should have had good optimization effort invested in it, as it is ubiquitous.

It’s a good method if you know an upper bound on the range of values, for example if the values are of type `Int8`, or represent some limited physical or logical quantity.
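For a fixed small value type like `Int8`, the table can be sized from `typemin`/`typemax` up front, so no `extrema` scan is needed at all. A sketch under that assumption (`countmap_int8` is a name chosen here):

```julia
# Dense counting for Int8: only 256 possible values, so the table size
# is known in advance and no extrema pass is required.
function countmap_int8(v::AbstractVector{Int8})
    res = zeros(Int, 256)
    for x in v
        res[Int(x) - Int(typemin(Int8)) + 1] += 1  # offset by -(-128)
    end
    # keep only the values that actually occur
    [Int8(i - 129) => res[i] for i in eachindex(res) if res[i] > 0]
end
```

The same idea works for any value with a known, limited range, e.g. day-of-year or a bounded sensor reading.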

But if one wants to use the DataFrames "trick", I don't see any alternative.