Minimum Mode Problem: minimum number - maximum repetition

If you can preallocate an array with a size large enough (with at least a number of elements equal to the maximum value of the vector), probably this is close to the fastest you can get:

julia> function findn(x,y=zeros(Int,maximum(x)))
           m = 0
           fill!(y,0)
           @inbounds for i in x
                y[i] += 1
                m = max(y[i],m)
           end
           return findfirst(isequal(m),y)
       end
findn (generic function with 2 methods)

julia> x = [1, 3, 3, 2, 3, 4];

julia> y = zeros(Int,maximum(x))

julia> @btime findn($x,$y)
  12.373 ns (0 allocations: 0 bytes)
3

If you don’t want to preallocate that array, you can use that same function with:

julia> @btime findn($x)
  36.959 ns (1 allocation: 112 bytes)
3

If maximum(x) is too large for an array be allocated with that size, than I think something like this is ok:

julia> function findn(x)
           u = sort!(unique(x))
           y = zeros(Int,length(u))
           m = 0
           @inbounds for i in x
                iy = searchsortedfirst(u,i)
                y[iy] += 1
                m = max(y[iy],m)
           end
           return findfirst(isequal(m),y)
       end
findn (generic function with 2 methods)

julia> @btime findn($x)
  263.385 ns (7 allocations: 720 bytes)
3

Edit: actually this last option is not fast for larger arrays compared to @jling suggestion (likely because searchsortedfirst becomes slower than fetching dictionary elements for large arrays), but if you can afford allocating an array with the size maximum(x) the other option is much faster:

julia> using StatsBase, DataStructures

julia> f(x) = last(sort!(OrderedDict(countmap(x)); byvalue = true).keys) #jling
f (generic function with 1 method)

julia> function findn(x,y=Vector{Int}(undef,maximum(x)))
           m = 0
           fill!(y,0)
           @inbounds for i in x
                y[i] += 1
                m = max(y[i],m)
           end
           return findfirst(isequal(m),y)
       end
findn (generic function with 2 methods)

julia> x = rand(1:100,1000);

julia> using BenchmarkTools

julia> @btime f($x)
  8.240 μs (36 allocations: 25.59 KiB)
23

julia> @btime findn($x)
  908.548 ns (1 allocation: 896 bytes)
23

julia> y = zeros(Int,100);

julia> @btime findn($x,$y)
  581.709 ns (0 allocations: 0 bytes)
23


2 Likes