Very long time for addition of complex identity matrix to transposed sparse matrix

Hi,

I am surprised by these timings on 1.10. Can it be improved?

using SparseArrays, LinearAlgebra
A = sprandn(90000,90000, 1e-5)
@time (A + complex(0,1.) * I);  # 0.001344 seconds (13 allocations: 7.342 MiB)
@time (transpose(A) + complex(0,1.) * I); #101.973608 seconds (193.74 k allocations: 21.416 MiB, 0.08% compilation time)

A + x*I hits a specialized sparse method, whereas transpose(A) + x*I (and A' + x*I) falls back to a generic method that mutates one element of the diagonal at a time, which is very slow for sparse matrices.

Should be easily fixable by adding some specialized methods, e.g.

import Base: +, -
using LinearAlgebra: AdjOrTrans, wrapperop
(+)(A::AdjOrTrans{<:Any, <:AbstractSparseMatrix}, J::UniformScaling{<:Number}) =
    wrapperop(A)(parent(A) + wrapperop(A)(J))
(-)(A::AdjOrTrans{<:Any, <:AbstractSparseMatrix}, J::UniformScaling{<:Number}) =
    wrapperop(A)(parent(A) - wrapperop(A)(J))
(+)(J::UniformScaling{<:Number}, A::AdjOrTrans{<:Any, <:AbstractSparseMatrix}) =
    wrapperop(A)(wrapperop(A)(J) + parent(A))
(-)(J::UniformScaling{<:Number}, A::AdjOrTrans{<:Any, <:AbstractSparseMatrix}) =
    wrapperop(A)(wrapperop(A)(J) - parent(A))
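With these methods defined, the transposed case should dispatch to the sparse-sparse path instead of the diagonal-mutating fallback. A quick self-contained check (sizes shrunk for speed; only the (+) method is needed for this demo, and note that wrapperop is an internal LinearAlgebra helper, so verify it exists on your Julia version):

```julia
using SparseArrays, LinearAlgebra
import Base: +
using LinearAlgebra: AdjOrTrans, wrapperop

# Unwrap, add the (transposed/adjointed) scaling to the parent, re-wrap.
(+)(A::AdjOrTrans{<:Any, <:AbstractSparseMatrix}, J::UniformScaling{<:Number}) =
    wrapperop(A)(parent(A) + wrapperop(A)(J))

A = sprandn(1000, 1000, 1e-3)
B = transpose(A) + im * I   # now adds I to parent(A) on the fast path
@assert B == Matrix(transpose(A)) + im * Matrix(I, 1000, 1000)
```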

Should be an easy PR for someone?


Is it a similar issue for this one?

A = sprandn(9000,9000, 1e-5)
@time (A + complex(0,1.) * I);  # 0.000110 seconds (13 allocations: 581.859 KiB)
A = @view sprandn(9000,9000, 1e-5)[1:end-1, 1:end-1]
@time (A + complex(0,1.) * I);  # 0.351283 seconds (13 allocations: 539.094 KiB)

I think this one is just a version of “views on sparse arrays suck”
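If the copy is acceptable, one workaround is to index instead of viewing: plain range indexing of a SparseMatrixCSC materializes a new SparseMatrixCSC, so the subsequent addition of a UniformScaling hits the specialized sparse method again. A minimal sketch:

```julia
using SparseArrays, LinearAlgebra

A = sprandn(9000, 9000, 1e-5)
B = A[1:end-1, 1:end-1]     # materializes a SparseMatrixCSC (no slow view wrapper)
@assert B isa SparseMatrixCSC
C = B + im * I              # hits the fast sparse + UniformScaling method
@assert C isa SparseMatrixCSC
```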

It took me a long time to understand why my bifurcation code for some PDE was really slow :frowning:
This thread shows why. Do you know of a package that solves this?

I think the problem with the current sparse view is that it creates a wrapper where you can access individual matrix elements, but these accesses are very slow for sparse matrices (each one requires a searchsorted), and other operations beyond getindex(A, i, j) are not well-optimized.

As a result, whenever I need a view of a sparse matrix, I build it myself using the fields of the SparseMatrixCSC type. Once you get the hang of it, it’s not overly difficult.
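As an illustration of that approach (a sketch I wrote for this reply, not code from any package; principal_submatrix is a hypothetical name), the A[1:end-1, 1:end-1] submatrix can be built in one pass over the CSC fields, with no searchsorted calls at all:

```julia
using SparseArrays

# Drop the last row and column of a CSC matrix in one pass over its storage.
function principal_submatrix(A::SparseMatrixCSC{Tv,Ti}) where {Tv,Ti}
    m, n = size(A)
    colptr = Vector{Ti}(undef, n)        # result has n-1 columns, so n pointers
    rowval = Ti[]
    nzval = Tv[]
    for j in 1:n-1
        colptr[j] = length(rowval) + 1
        for k in A.colptr[j]:A.colptr[j+1]-1
            i = A.rowval[k]
            i < m || continue            # skip entries in the last row
            push!(rowval, i)
            push!(nzval, A.nzval[k])
        end
    end
    colptr[n] = length(rowval) + 1
    return SparseMatrixCSC(m - 1, n - 1, colptr, rowval, nzval)
end

A = sprandn(1000, 1000, 1e-3)
@assert principal_submatrix(A) == A[1:end-1, 1:end-1]
```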

In BifurcationKit, I have @view A .... in many places, where A is an AbstractMatrix.
Basically, I do a lot of sigma * I + B where B can be a view of a sparse matrix. Maybe I should code this differently, since it allocates anyway.

Can you explain a little bit more about the context in which you need these views?

It happened during fold continuation, as in here. I need to form the matrix M_f as the jacobian for Newton iterations. Later, I need to compute the spectrum of dF, so I form dF = @view Mf[1:end-1,1:end-1] and compute the spectrum. Unfortunately, dF is not well conditioned, so I use a shift-invert strategy, which requires precomputing factorize(sigma * I - dF). In a nutshell, all the problems of this post appear.

I know I don't need to form the bordered system to solve Mf; I could use a bordering strategy instead, it just happens to be slower in my use case.

If I were doing Hopf continuation, it would be the same. Actually, for PD/NS continuation (see here) it would be similar.

Maybe you could look at it the other way and consider the Mf[1:end-1,1:end-1] matrix to be your starting point, then either create a bigger block matrix or perform a manual solve for the system of the Newton iteration. I don't know if there are formulas for solving \begin{pmatrix} M & u \\ v^T & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} a \\ b \end{pmatrix} directly?
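There is in fact a standard block-elimination formula: the bordered system can be solved with two linear solves against M, reusing one factorization. A minimal sketch, assuming M is nonsingular and the scalar v' * (M \ u) is nonzero (bordered_solve is a name I made up for this example):

```julia
using SparseArrays, LinearAlgebra

# Solve [M u; v' 0] [x; y] = [a; b] by block elimination:
#   M z1 = a,  M z2 = u,  y = (v'z1 - b) / (v'z2),  x = z1 - y*z2.
function bordered_solve(M, u, v, a, b)
    F = factorize(M)            # one factorization, reused for both solves
    z1 = F \ a
    z2 = F \ u
    y = (dot(v, z1) - b) / dot(v, z2)
    x = z1 - y * z2
    return x, y
end

n = 50
M = sprandn(n, n, 0.2) + 5I     # diagonal shift keeps M comfortably nonsingular
u, v, a = randn(n), randn(n), randn(n)
b = randn()
x, y = bordered_solve(M, u, v, a, b)
@assert norm(M * x + u * y - a) < 1e-8   # first block row
@assert abs(dot(v, x) - b) < 1e-8        # second block row
```

This is essentially the bordering strategy mentioned above, so it inherits the same caveat: it can behave poorly when M itself is nearly singular, even if the full bordered matrix is well conditioned.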

Alternatively, you could keep both M and its principal submatrix in storage, and perform the update manually with a dedicated version for the case where M is sparse?