Sometimes when I use LoopVectorization, I want to have both a single-threaded and a multi-threaded variant of a function available. For instance:
using LoopVectorization

function add!(out, x, y; thread)
    if thread
        add_multi_thread!(out, x, y)
    else
        add_single_thread!(out, x, y)
    end
end

function add_single_thread!(out, x, y)
    @turbo thread=false for i in eachindex(out, x, y)
        out[i] = x[i] + y[i]
    end
    out
end

function add_multi_thread!(out, x, y)
    @turbo thread=true for i in eachindex(out, x, y)
        out[i] = x[i] + y[i]
    end
    out
end

x = randn(10)
y = randn(10)
out1 = randn(10)
out2 = randn(10)
add!(out1, x, y, thread=true)
add!(out2, x, y, thread=false)
@assert out1 ≈ out2
What is a good way to reduce code duplication here? Should I use @eval? What if I also want to control the number of threads at runtime, would I need to use @generated?
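To make the @eval idea concrete, here is a minimal sketch of what that could look like. It is only one possible approach, not necessarily the recommended one. It assumes that interpolating a literal Bool with `$t` inside `@eval` is enough for `@turbo` to see `thread=false` or `thread=true` at macro-expansion time. The function names are the same ones used above.

```julia
using LoopVectorization

# Generate both variants from a single loop body.
# `$t` is spliced in as a literal Bool before `@turbo` is expanded,
# so the macro receives `thread=false` or `thread=true` (assumption: this
# is equivalent to writing the literal by hand, as in the code above).
for (name, t) in ((:add_single_thread!, false), (:add_multi_thread!, true))
    @eval function $name(out, x, y)
        @turbo thread=$t for i in eachindex(out, x, y)
            out[i] = x[i] + y[i]
        end
        return out
    end
end

# Runtime choice between the two generated methods.
add!(out, x, y; thread::Bool) =
    thread ? add_multi_thread!(out, x, y) : add_single_thread!(out, x, y)
```

This removes the duplicated loop body but still fixes the `thread` setting when the methods are defined, so it does not by itself address choosing the number of threads at runtime.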