Hi everyone,
For serial vectorized FMA operations in the for-loop for SAXPY, one can use
@fastmath @inbounds @simd for i in 1:n
See also this discussion (How to enable vectorized fma instruction for multiply-add vectors?) and “The Julia Language V1.9.0” (see Page 434).
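For concreteness, here is a minimal sketch of the serial loop in full. The function name `saxpy_serial!` and the Float32 test data are placeholders chosen for illustration, not taken from the linked discussion. Whether FMA instructions are actually emitted depends on the CPU and can be inspected with `@code_native`.

```julia
# Serial SAXPY: y[i] = a * x[i] + y[i] for every i.
# `saxpy_serial!` is a hypothetical name used only for this example.
function saxpy_serial!(y::Vector{T}, a::T, x::Vector{T}) where {T<:AbstractFloat}
    n = length(y)
    length(x) == n || throw(DimensionMismatch("x and y must have the same length"))
    # @fastmath allows the multiply and add to be fused,
    # @inbounds removes bounds checks, @simd allows vectorization.
    @fastmath @inbounds @simd for i in 1:n
        y[i] = a * x[i] + y[i]
    end
    return y
end

# Small usage example with arbitrary test data.
x = fill(2.0f0, 1_000)
y = fill(1.0f0, 1_000)
saxpy_serial!(y, 3.0f0, x)
```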
My question is: how can I get multi-threaded, vectorized FMA operations in the SAXPY for-loop? The code snippet below gives an error message:
@threads @fastmath @inbounds @simd for i in 1:n
The error message is:
ERROR: LoadError: ArgumentError: @threads requires a `for` loop expression
Thank you in advance!
PS.
- A solution using @sync and @spawn is, IMHO, not optimal for a problem like SAXPY.
- I do not want to prefix every statement in the for-loop body with @fastmath .... That is possible for SAXPY, but definitely a bad idea for a generic for-loop.
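One possible workaround, given only as a sketch: since @threads accepts nothing but a bare `for` loop, it can be given an outer loop over index chunks, while the @fastmath @inbounds @simd chain stays on a serial inner loop. The name `saxpy_threaded!` and the split into roughly one chunk per thread are assumptions made for illustration. The sketch has not been benchmarked, so no performance claim is implied.

```julia
using Base.Threads: @threads, nthreads

# Threaded SAXPY sketch: the outer loop is a plain `for`, which @threads accepts;
# the inner loop keeps the serial macro chain.
# `saxpy_threaded!` is a hypothetical name used only for this example.
function saxpy_threaded!(y::Vector{T}, a::T, x::Vector{T}) where {T<:AbstractFloat}
    n = length(y)
    length(x) == n || throw(DimensionMismatch("x and y must have the same length"))
    # Contiguous chunks of 1:n, roughly one per thread (an assumed, untuned choice).
    chunklen = max(1, cld(n, nthreads()))
    chunks = collect(Iterators.partition(1:n, chunklen))
    @threads for c in chunks
        # Each chunk is a disjoint index range, so no two threads write the same y[i].
        @fastmath @inbounds @simd for i in c
            y[i] = a * x[i] + y[i]
        end
    end
    return y
end

# Small usage example with arbitrary test data (length not divisible by typical thread counts).
xt = fill(2.0f0, 1_003)
yt = fill(1.0f0, 1_003)
saxpy_threaded!(yt, 3.0f0, xt)
```

Here @fastmath is written once per loop rather than once per statement, and the loop body itself is unchanged from the serial version.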