Dictino
December 20, 2019, 1:20pm
1
I have this function to find an element in an array:
function my_findfirst(element,array)
for i=1:length(array)
if @inbounds (array[i]==element)
return i
end
end
return nothing
end
I’ve created this function to make it faster than findfirst
, and it is already faster for some types:
N=100_000_000
array=zeros(Float64,N)
array[N÷2]=1.0
#first time run to compile
findfirst(isequal(1.0),array)
my_findfirst(1.0,array)
@time findfirst(isequal(1.0),array)
#0.143356 seconds (6 allocations: 192 bytes)
@time my_findfirst(1.0,array)
#0.079413 seconds (5 allocations: 176 bytes)
But for UInt8 is much slower:
N=100_000_000
array=zeros(UInt8,N)
array[N÷2]=UInt8(1)
#first time run to compile
findfirst(isequal(UInt8(1)),array)
my_findfirst(UInt8(1),array)
@time findfirst(isequal(UInt8(1)),array)
#0.008938 seconds (6 allocations: 192 bytes)
@time my_findfirst(UInt8(1),array)
#0.046022 seconds (5 allocations: 176 bytes)
So I see that there is some room to optimize
Could you help me to make it faster please?
Thanks in advance!
Try avoiding global variables in the code you are benchmarking and try to use Benchmarktools for more accurate benchmarks.
1 Like
Don’t really think global or not will matter when the length is 100_000_000
Regarding the question, Base has a special implementation for the specific case of finding UInt8
:
findfirst(pred::Fix2{<:Union{typeof(isequal),typeof(==)},<:Union{Int8,UInt8}}, a::ByteArray) =
nothing_sentinel(_search(a, pred.x))
findnext(pred::Fix2{<:Union{typeof(isequal),typeof(==)},<:Union{Int8,UInt8}}, a::ByteArray, i::Integer) =
nothing_sentinel(_search(a, pred.x, i))
function _search(a::Union{String,ByteArray}, b::Union{Int8,UInt8}, i::Integer = 1)
if i < 1
throw(BoundsError(a, i))
end
n = sizeof(a)
if i > n
return i == n+1 ? 0 : throw(BoundsError(a, i))
end
p = pointer(a)
q = GC.@preserve a ccall(:memchr, Ptr{UInt8}, (Ptr{UInt8}, Int32, Csize_t), p+i-1, b, n-i+1)
return q == C_NULL ? 0 : Int(q-p+1)
end
2 Likes
No it probably won’t matter much in this case, but will show misleading allocations. It’s at least good practice to avoid it.
Dictino
December 20, 2019, 2:09pm
5
Thanks for the advice and the fast response!
The original versión was inside a module and the results are similar.
I Will test with BenchmarkTools also
Dictino
December 20, 2019, 2:15pm
6
Interesting… So possibly memchr is doing some clever low level tricks to be as fast…
I Will investigate
Thanks Kristoffer!
There is a global scope in the module as well. The variables have to be either constant or inside functions for you not to pay the prices of global access.
1 Like
Dictino
December 20, 2019, 2:51pm
8
I missunderstood this point…
Thank you for the clarificarion.
1 Like