Can’t tell much difference between the two. Any insights? Thanks.
I don’t understand the comparison.
axpy! is for vectors, while muladd is for scalars, although of course you can do x .= muladd.(x,y,z) to apply it elementwise to vectors.
I would say that 99.9% of code should not be calling low-level BLAS functions directly. If a BLAS-1 function like axpy! is performance-critical for you, you probably need to re-think your code anyway.
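To make the comparison concrete, here is a minimal sketch (the values of a, x, y1, and y2 are made up for illustration) showing that the BLAS call and the broadcast muladd compute the same in-place update y = a*x + y:

```julia
using LinearAlgebra

a  = 2.0
x  = [1.0, 2.0, 3.0]
y1 = [10.0, 20.0, 30.0]
y2 = copy(y1)

# BLAS-1 call: overwrites y1 with a*x + y1
BLAS.axpy!(a, x, y1)

# Broadcast version: one fused elementwise loop, also in place
y2 .= muladd.(a, x, y2)

# Both vectors now hold the same result
same = y1 ≈ y2
```

The broadcast line allocates no temporaries, because the dotted call and the .= assignment fuse into a single loop.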
Oh, sorry. I just realized that muladd is for scalars. What is the high-level surrogate for ‘BLAS.axpy!’ then? Can you give me a pointer?
Why is calling low-level BLAS functions deemed a bad idea? Sorry if my question sounds stupid… I really appreciate the help!
Low-level BLAS calls are usually memory-bound rather than compute-bound, so you’ll find that calling them usually doesn’t even give a performance advantage over plain Julia (that’s not true of high-level BLAS, though).
muladd is generic, fuses with other operations under broadcasting, and will become an FMA on processors where it should, so it’s a great option here.
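As a rough illustration of "generic" and "can fuse", here is a small sketch. The helper name axpy_broadcast! is hypothetical, not something from Base or LinearAlgebra; it just wraps the broadcast muladd so it can be tried on different element types:

```julia
# Hypothetical helper: in-place y = a*x + y via a fused broadcast
function axpy_broadcast!(a, x, y)
    y .= muladd.(a, x, y)
    return y
end

# Works for floats...
yf = axpy_broadcast!(2.0, [1.0, 2.0, 3.0], [4.0, 5.0, 6.0])

# ...and for element types outside the BLAS float types, such as Int or Rational
yi = axpy_broadcast!(2, [1, 1, 1], [1, 2, 3])
yr = axpy_broadcast!(1//2, [2//3], [1//6])

# Fusion: the extra squaring runs in the same loop, with no temporary array
z = zeros(3)
z .= muladd.(2.0, [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]) .^ 2
```

Whether muladd actually compiles to a hardware FMA depends on the element type and the processor; for types like Int or Rational it is simply a*b + c.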
Thanks for the explanation! I get it now.
I thought BLAS would give threading for free as compared to broadcasting muladd. Am I wrong?