I’d like to properly vectorize this:
"""
issuffix(u, v)
Check if `u` is a suffix of `v`
"""
function issuffix(u::AbstractVector, v::AbstractVector)
lu = length(u)
lu ≤ length(v) || return false
@inbounds for i in eachindex(u)
u[i] == v[end-lu+i] || return false
end
return true
end
So I wrote this version:
function issuffix_simd(u::AbstractVector, v::AbstractVector)
lu = length(u)
lu ≤ length(v) || return false
ans = true
@inbounds for i in eachindex(u)
ans &= u[i] == v[end-lu+i]
end
return ans
end
Which is much faster when u
is a suffix of v
, but processes the whole vector even though ans
become false
at i=1
. Here’s a testbench:
using Random
Random.seed!(1)
u8 = rand(Int8.(1:10), 2^10+2^9+1);
v8 = rand(Int8.(1:10), 2^10+2^9+1);
u16 = Vector{UInt16}(u8)
v16 = Vector{UInt16}(v8)
for (u,v) = ((u8,v8), (u16, v16))
@info typeof(u), typeof(v)
@assert issuffix(u, u)
@assert !issuffix(u, v)
@info "issuffix: true"
@btime issuffix($u, $u)
@info "issuffix: false"
@btime issuffix($u, $v)
@assert issuffix_simd(u, u)
@assert !issuffix_simd(u, v)
@info "issuffix_simd: true"
@btime issuffix_simd($u, $u)
@info "issuffix_simd: false"
@btime issuffix_simd($u, $v)
end
which gives me
[ Info: (Vector{Int8}, Vector{Int8})
[ Info: issuffix: true
375.439 ns (0 allocations: 0 bytes)
[ Info: issuffix: false
2.375 ns (0 allocations: 0 bytes)
[ Info: issuffix_simd: true
19.264 ns (0 allocations: 0 bytes)
[ Info: issuffix_simd: false
19.194 ns (0 allocations: 0 bytes)
[ Info: (Vector{UInt16}, Vector{UInt16})
[ Info: issuffix: true
377.684 ns (0 allocations: 0 bytes)
[ Info: issuffix: false
2.864 ns (0 allocations: 0 bytes)
[ Info: issuffix_simd: true
36.470 ns (0 allocations: 0 bytes)
[ Info: issuffix_simd: false
32.949 ns (0 allocations: 0 bytes)
First of all:
-
simd
ed version is seriously fast at this workload (I expected something 4×faster) - But: is there any way to have a cookie and eat it: vectorized version which also returns early?