For serial vectorized FMA operations in the for-loop for SAXPY, one can use
@fastmath @inbounds @simd for i in 1:n
See also this discussion (How to enable vectorized fma instruction for multiply-add vectors?) and “The Julia Language V1.9.0” (see Page 434).
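For completeness, the serial loop above could be wrapped in a function along these lines. This is only a minimal sketch: the name saxpy_serial! and the argument names are placeholders of mine, and it assumes plain 1-based Vectors of equal length. Whether FMA instructions are actually emitted depends on the CPU and the compiler, which can be checked with @code_native.

```julia
# Serial SAXPY sketch: y[i] = a * x[i] + y[i], updated in place.
# saxpy_serial! is a hypothetical name used for illustration only.
function saxpy_serial!(y::Vector{T}, a::T, x::Vector{T}) where {T<:AbstractFloat}
    n = length(y)
    # Guard the @inbounds below: both vectors must have the same length.
    length(x) == n || throw(DimensionMismatch("x and y must have the same length"))
    @fastmath @inbounds @simd for i in 1:n
        y[i] = a * x[i] + y[i]
    end
    return y
end
```

Usage would look like saxpy_serial!(y, 2.0f0, x) with x and y of type Vector{Float32}.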
My question is: how can I achieve multi-threaded vectorized FMA operations in the for-loop for SAXPY? The code snippet below gives an error:
@threads @fastmath @inbounds @simd for i in 1:n
The error message is:
ERROR: LoadError: ArgumentError: @threads requires a `for` loop expression
Thank you in advance!
- The solution using @spawn for such a problem, like SAXPY, is, IMHO, not optimal.
- I do not want to prefix every statement in the for-loop body with @fastmath .... That is possible for SAXPY, but definitely a bad idea for a generic for-loop.
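One arrangement that seems compatible with both points, offered as a sketch rather than a confirmed answer: @threads checks that the expression it receives is itself a for loop, and in the failing line it receives the @fastmath macro call instead. Putting @threads on an outer loop over contiguous chunks and leaving the @fastmath @inbounds @simd chain on the inner loop avoids that, and the loop body needs no per-statement prefixes. The name saxpy_threaded! and the chunking scheme are assumptions of mine, and the code assumes 1-based Vectors of equal length.

```julia
using Base.Threads: @threads, nthreads

# Threaded SAXPY sketch: y[i] = a * x[i] + y[i], updated in place.
# saxpy_threaded! is a hypothetical name used for illustration only.
function saxpy_threaded!(y::Vector{T}, a::T, x::Vector{T}) where {T<:AbstractFloat}
    n = length(y)
    length(x) == n || throw(DimensionMismatch("x and y must have the same length"))
    # One contiguous chunk per thread; at least one chunk so n == 0 also works.
    nchunks = max(1, min(n, nthreads()))
    @threads for c in 1:nchunks
        # Integer split of 1:n into nchunks non-overlapping ranges.
        lo = div((c - 1) * n, nchunks) + 1
        hi = div(c * n, nchunks)
        # The macro chain from the serial version, applied to the inner loop only.
        @fastmath @inbounds @simd for i in lo:hi
            y[i] = a * x[i] + y[i]
        end
    end
    return y
end
```

Julia has to be started with more than one thread (for example julia -t 4) for the outer loop to run in parallel. Since SAXPY is typically limited by memory bandwidth, any speedup over the serial loop would need to be measured.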