Just looking at the ScanByte package by @jakobnissen (thanks for giving me ideas!). It’s written to find the first occurrence of a given byte in a byte array. Just as a test, I thought I could also call memchr iteratively to find all matches in an array, like so:
(P.S. I’ve never used ccall before this, so let me know if I messed something up.)
@inline function byte_scan(mem::Vector{UInt8}, byte::UInt8)
    c = 0
    # Name `mem` explicitly so the GC keeps it alive while raw pointers are in use
    GC.@preserve mem begin
        mem_start::Ptr{UInt8} = pointer(mem)
        mem_length = length(mem)
        actual_index::Int64 = 0
        # Scan until no bytes remain, so a match in the final byte is still found
        while mem_length > 0
            # The Ptr{UInt8} return type matches `mem_start`, so no pointer conversion is needed
            pos = @ccall memchr(mem_start::Ptr{UInt8}, byte::Cint, mem_length::Csize_t)::Ptr{UInt8}
            if pos == C_NULL
                return c
            else
                mem_start = pos + 1
                actual_index = ((pos - pointer(mem)) + 1) % Int64  # 1-based index of the match
                mem_length = length(mem) - actual_index
                c += actual_index
            end
        end
    end
    return c
end
(Summing them doesn’t make much sense, but it’s just to do something with the returned indexes.)
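If the indexes themselves are wanted rather than their sum, the same memchr loop could push each match into a vector instead. A minimal sketch of that idea follows. The name byte_positions is made up for illustration and is not part of ScanByte, and I have only reasoned through it on small inputs:

```julia
# Hypothetical helper (not part of ScanByte): collect the 1-based index of
# every occurrence of `byte` in `mem` by calling libc's memchr repeatedly.
function byte_positions(mem::Vector{UInt8}, byte::UInt8)
    positions = Int[]
    GC.@preserve mem begin
        base = pointer(mem)
        offset = 0  # number of bytes already scanned
        while offset < length(mem)
            pos = @ccall memchr((base + offset)::Ptr{UInt8}, byte::Cint,
                                (length(mem) - offset)::Csize_t)::Ptr{UInt8}
            pos == C_NULL && break
            offset = Int(pos - base) + 1  # 1-based index of this match
            push!(positions, offset)
        end
    end
    return positions
end

byte_positions(UInt8[0x41, 0x00, 0x41, 0x42, 0x41], 0x41)  # indexes 1, 3 and 5
```

Each iteration resumes one byte past the previous match, so every position is reported once. The push! calls allocate, so this would not reach the zero-allocation timings of the summing version.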
byte_scan would be similar to searching with a plain loop in base Julia:
@inline function base_scan(mem::Vector{UInt8}, byte::UInt8)
    c = 0
    @inbounds for i in eachindex(mem)
        if mem[i] == byte
            c += i
        end
    end
    return c
end
Using memchr is much faster on my PC:
using Random, BenchmarkTools

function test()
    target = 0x41
    Random.seed!(3)
    arr = rand(UInt8, 10_000_000)
    # Search for 'A' (0x41) in the random byte array
    @btime byte_scan($arr, $target)
    @btime base_scan($arr, $target)
    @assert byte_scan(arr, target) == base_scan(arr, target)
end
1.161 ms (0 allocations: 0 bytes)
13.498 ms (0 allocations: 0 bytes)
Will this have any major drawbacks that I’m missing, or could I just swap this memchr version in?