Optimizing counting the number of occurrences of a given number in an array

Is there a way to optimize the counter function below? I tried using LoopVectorization, but it complained with ERROR: LoadError: Don't know how to handle expression: if i == element_of_interest.

using StatsBase, BenchmarkTools

arr = rand(1:5, 100)
given = 4

@btime countmap($arr)
# 684.667 ns (6 allocations: 1.50 KiB)

function counter(x, element_of_interest)
    count = 0
    for i in x
        if i == element_of_interest
            count += 1
        end
    end
    return count
end

@btime counter($arr, $given)
# 17.635 ns (0 allocations: 0 bytes)

Try it without the if, using count += i == eoi.
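
Spelled out (a minimal sketch; counter_branchless is just an illustrative name):

function counter_branchless(x, element_of_interest)
    count = 0
    for i in x
        # Bool promotes to Int: adds 1 on a match, 0 otherwise
        count += i == element_of_interest
    end
    return count
end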

2 Likes

using LoopVectorization

function counter_turbo(x, element_of_interest)
    count = 0
    @turbo for i in eachindex(x)
        count += x[i] == element_of_interest
    end
    return count
end

@btime counter_turbo($arr, $given)
# 17.635 ns (0 allocations: 0 bytes)

The timings of counter and counter_turbo are eerily similar. Upon checking @code_native of counter, it seems the compiler has already vectorized it.
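
For reference, this can be checked in the REPL (the exact assembly depends on your CPU and Julia version):

@code_native debuginfo=:none counter(arr, given)
# Look for packed SIMD instructions in the loop body
# (e.g. vpcmpeqq / vpaddq on AVX machines): their presence
# means the comparison and accumulation were auto-vectorized.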

1 Like

Also,

count(==(given), arr)

is around the same speed and is a standard library function.
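
To check on your own machine (timings vary, so none are quoted here):

@btime count(==($given), $arr)
# ==(given) builds a Base.Fix2 predicate that the compiler
# inlines into the loop, so this should be comparable to counter above.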

5 Likes

TIL.

Every day I learn something new. Thank you.

Speaking of things learned, I want to note here (as a reminder to myself, too) an algorithm discovered (by Dan?) in the DataFrames code during a recent discussion about the efficiency of the groupby() function.

Applying it to the countmap case always gives excellent results:


function countmap2(v)
    m, M = extrema(v)
    # Dense histogram: one counting slot per value in the range m:M
    res = fill(0, M - m + 1)
    for i in eachindex(v)
        res[v[i] - m + 1] += 1
    end
    Pair.(m:M, res)
end

julia> arr
15-element Vector{Int64}:
  5
  0
  3
  1
 -1
  4
 -1
 -3
 -3
 -3
 -2
 -3
  2
  1
  2

julia> @btime countmap2($arr)
  62.653 ns (2 allocations: 336 bytes)
9-element Vector{Pair{Int64, Int64}}:
 -3 => 4
 -2 => 1
 -1 => 2
  0 => 1
  1 => 2
  2 => 2
  3 => 1
  4 => 1
  5 => 1

julia> @btime countmap($arr)
  222.154 ns (5 allocations: 720 bytes)
Dict{Int64, Int64} with 9 entries:
  0  => 1
  4  => 1
  5  => 1
  -1 => 2
  2  => 2
  -3 => 4
  -2 => 1
  3  => 1
  1  => 2

1 Like

How well does it work on arr = [0, 1000, 10^14]? :wink:
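
Spelled out, the failure mode is the allocation, not the counting:

m, M = extrema([0, 1000, 10^14])   # (0, 100000000000000)
# fill(0, M - m + 1) would ask for ~10^14 Ints (~800 TB):
# an OutOfMemoryError long before any counting happens.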

How, if at all, can one find an optimal value for ts?

function countmap3(v, ts)
    m, M = extrema(v)
    # If the value range is too wide relative to the input length,
    # the dense histogram wastes memory: fall back to countmap.
    if (M - m) / length(v) > ts
        countmap(v)
    else
        res = fill(0, M - m + 1)
        for i in eachindex(v)
            res[v[i] - m + 1] += 1
        end
        Pair.(m:M, res)
    end
end
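
For instance (ts = 1.0 is only an illustrative guess, which is exactly the question):

countmap3(rand(1:5, 100), 1.0)     # (5 - 1)/100 = 0.04 ≤ 1.0: dense path
countmap3([0, 1000, 10^14], 1.0)   # enormous range: falls back to countmap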

Perhaps the strategy used to decide when a sparse array is preferable to a dense one could apply here too?

Calculating the extrema requires scanning through v, and isn’t so cheap to begin with. There are tradeoffs. countmap should have good optimization effort invested in it, as it is ubiquitous.

It’s a good method if you know an upper bound on the range of values, for example if the values are of type Int8, or represent some limited physical or logical quantity.
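
A minimal sketch of that idea for Int8 (countmap_int8 is an illustrative name; the range is fixed at -128:127, so the extrema scan disappears entirely):

function countmap_int8(v::AbstractVector{Int8})
    res = zeros(Int, 256)          # one slot per possible Int8 value
    for x in v
        res[Int(x) + 129] += 1     # shift -128:127 into indices 1:256
    end
    # Unlike countmap, this also reports values that never occur
    Pair.(Int8(-128):Int8(127), res)
end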

But if one wants to use the DataFrames “trick”, I don’t see an alternative.