Fastest way to multiply a constant sparse matrix by a vector

One answer to this question about multiplying a constant sparse matrix by a static vector made me wonder:

Is there a more performant way of multiplying a constant sparse matrix A by a non-static vector x than doing A * x?

(e.g., for a matrix and vector of large sizes, which do not work well with StaticArrays)

mul!(y,A,x)
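For anyone landing here later, a minimal sketch of how that call is typically used when the same sparse matrix is applied many times. The size, the density, and the normalization step are made up for illustration and are not from the original question; `mul!` comes from the LinearAlgebra standard library.

```julia
using LinearAlgebra   # provides mul!, norm and I
using SparseArrays    # provides sprand

n = 1_000
# Illustrative matrix: random sparse part plus the identity, so A * x is never zero here.
A = sprand(n, n, 0.01) + I
x = rand(n)
y = similar(x)           # output buffer, allocated once and reused

for _ in 1:10
    mul!(y, A, x)        # overwrites y with A * x, no new result vector is created
    x .= y ./ norm(y)    # hypothetical follow-up step: reuse x for the next product
end
```

Note that y and x must be distinct arrays; mul! does not support writing the result into the vector it is reading from.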

But this is just a memory thing, right? Is it different (in speed) to doing y .= A * x?

Yes, A * x allocates a new output vector and then calls mul! on it. So mul! is faster than *, but only by the time it takes to allocate y.

With y .= A * x there is also the copy from the result of A * x into y, whereas mul! fills y directly.
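A rough way to see the difference between the three forms is to count allocated bytes. This is only a sketch: the function name `compare_allocations`, the matrix size, and the density are assumptions for illustration, and the exact byte counts depend on the Julia version.

```julia
using LinearAlgebra
using SparseArrays

# Hypothetical helper: returns the bytes allocated by each of the three forms.
function compare_allocations(n = 2_000)
    A = sprand(n, n, 0.01)
    x = rand(n)
    y = zeros(n)

    # Warm-up calls so that compilation is not counted in the measurements below.
    A * x
    y .= A * x
    mul!(y, A, x)

    alloc_star  = @allocated A * x           # allocates a fresh result vector
    alloc_bcast = @allocated y .= A * x      # allocates a temporary, then copies it into y
    alloc_mul   = @allocated mul!(y, A, x)   # writes the result into y directly
    return alloc_star, alloc_bcast, alloc_mul
end

alloc_star, alloc_bcast, alloc_mul = compare_allocations()
println("A * x         : ", alloc_star, " bytes")
println("y .= A * x    : ", alloc_bcast, " bytes")
println("mul!(y, A, x) : ", alloc_mul, " bytes")
```

The measurements are wrapped in a function so that global variables do not add unrelated allocations. For actual timings, a benchmarking package would be more reliable than a single @allocated call.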

OK I see thanks!

MKL has some interesting new functionality in its Inspector-executor API. You give an estimate of how many times you will perform an operation, and it optimizes that operation for the given matrix.

Out of curiosity, is there any reason that the compiler doesn’t just read y .= A * x to mean mul!(y,A,x)? Is there some case in which the user doesn’t want the same thing to happen? (For the elements of y to reflect A*x.)