Extremly slow matrix multiplication with Hermitian(sparse())

Hello everyone,
I have a large sparse hermitian matrix in my code but the matrix multiplication is taking absolutely forever. Is it a bug ?

M = rand(2000,2000);
C = rand(2000,2000);
@time M*C;  
@time M*Hermitian(C);
@time M*sparse(C);
@time M*Hermitian(sparse(C));

outputs :

  0.049050 seconds (3 allocations: 30.518 MiB, 6.76% gc time)
  0.049359 seconds (4 allocations: 30.518 MiB, 4.22% gc time)
  1.038535 seconds (13 allocations: 91.568 MiB, 8.05% gc time)
  174.115998 seconds (25 allocations: 222.766 MiB, 0.05% gc time)

of course in this MWE C is neither Hermitian nor sparse, but my matrix was, and the times where similar.

Hi @tripo,
There are several issues with your benchmark, so let’s get these out of the way first:

  • the matrices you provide aren’t actually sparse, which means that the sparse storage will be very suboptimal
  • you are benchmarking the construction of a sparse object from a dense matrix, which is also very inefficient
  • you’re only benchmarking one run, which might be fairly noisy

Here’s a slightly improved benchmarking code, but I agree that the performance gap is still egregious:

using BenchmarkTools, LinearAlgebra, SparseArrays
M = rand(2000, 2000);
C = sprand(2000, 2000, 0.01);  # set a realistic sparsity level
Ch = Hermitian(C);
@btime $M * $C;  # 18.227 ms (3 allocations: 30.53 MiB)
@btime $M * $Ch;  # 2.724 s (41 allocations: 169.86 MiB)
2 Likes

I would like to chime in my experience. For my applications it turned out that Julia’s non-sparse algorithms are very good (especially tricks with @view, mul!(), div!(), etc.) that I couldn’t beat it with any sparse implementations.

Remember, that apart from doing the sparse calculations your code probably has also other things going on (potentially a lot of them) that require also converting to and from for those calculations, which also takes time.

Here is a lengthy topic of mine which, deja vu, I was again navigated through by the generous help and time of @gdalle.

1 Like

I gave it a quick look and it seems the multiplication dense times Hermitian sparse uses a non-optimized code path (which is not aware of sparsity).
Here’s a quick workaround, which is worth it if you reuse the matrix C a couple of times: just use SparseMatrixCSC, e.g. make it Hermitian but then go back to the native sparse format, which is optimized better than “Hermitian-of-sparse”:

Chs = sparse(Hermitian(C))

The Hermitian wrapper shouldn’t even be necessary if you know that your input is Hermitian anyway. It is mostly used when a specific algorithm must be chosen for e.g. factorization, but here on a simple matmul it won’t do much of a difference.

3 Likes