A + x*I is hitting a specialized method, whereas A' + x*I hits a generic method that mutates one element of the diagonal at a time, which is very slow for sparse matrices.
Should be easily fixable by adding some specialized methods, e.g.
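For instance, something along these lines could route the adjoint case to the fast sparse path (a hedged sketch, not the actual fix; `plus_scaled_identity` is a made-up name, not a SparseArrays method):

```julia
using SparseArrays, LinearAlgebra

# Sketch only: handle Adjoint-of-sparse plus a scaled identity by
# materializing the adjoint once (an O(nnz) operation), instead of the
# generic fallback that updates the diagonal entry by entry.
# `plus_scaled_identity` is a hypothetical name for illustration.
function plus_scaled_identity(A::Adjoint{<:Any,<:SparseMatrixCSC}, J::UniformScaling)
    return sparse(A) + J   # sparse(A) copies the adjoint into a SparseMatrixCSC
end

A = sprand(100, 100, 0.05)
B = plus_scaled_identity(A', 2.0I)
```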
It took me a long time to understand why my bifurcation code for some PDE was really slow
This thread shows why. Do you know of a package which solves this?

I think the problem with the current sparse view is that it creates a wrapper where you can access individual matrix elements, but these accesses are very slow for sparse matrices (each one requires a searchsorted), and other operations beyond getindex(A, i, j) are not well-optimized.
As a result, whenever I need a view of a sparse matrix, I build it myself using the fields of the SparseMatrixCSC type. Once you get the hang of it, it’s not overly difficult.
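For example, here is the kind of hand-rolled construction I mean (a sketch with an assumed helper name, not a library API): materializing a block of columns straight from the `colptr`/`rowval`/`nzval` fields, with no per-element `searchsorted`:

```julia
using SparseArrays

# Sketch: build A[:, j1:j2] directly from the CSC fields. In CSC storage
# a contiguous column range shares a contiguous slice of rowval/nzval,
# so this is a pair of array copies, not n*m getindex calls.
# `colrange` is a hypothetical name for illustration.
function colrange(A::SparseMatrixCSC, j1::Integer, j2::Integer)
    r = A.colptr[j1]:(A.colptr[j2+1] - 1)             # nonzeros of columns j1:j2
    colptr = A.colptr[j1:j2+1] .- (A.colptr[j1] - 1)  # re-base to start at 1
    return SparseMatrixCSC(size(A, 1), j2 - j1 + 1, colptr, A.rowval[r], A.nzval[r])
end

A = sprand(10, 10, 0.3)
B = colrange(A, 2, 5)
```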
In BifurcationKit, I have in many places @view A .... where A is an AbstractMatrix.
Basically, I do a lot of sigma * I + B where B can be a view of a sparse matrix. Maybe I should code this differently, since it allocates anyway.
It happened during fold continuation as in here. I need to form the matrix M_f as the jacobian for Newton iterations. Later, I need to compute the spectrum of dF, so I form dF = @view Mf[1:end-1, 1:end-1] and compute the spectrum. Unfortunately, dF is not well conditioned, so I use a shift-invert strategy, which requires precomputing factorize(sigma * I - dF). In a nutshell, all the problems of this post appear.
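The pattern looks roughly like this (a minimal stand-in: a made-up tridiagonal matrix plays the role of the real bordered Jacobian, and `lu` plays the role of `factorize`). Indexing with `Mf[1:end-1, 1:end-1]` (no `@view`) copies the submatrix into a fresh `SparseMatrixCSC`, so the shift and the factorization hit the sparse-specialized methods:

```julia
using SparseArrays, LinearAlgebra

# Stand-in for the bordered Jacobian M_f (real case: jacobian of the
# fold functional). A tridiagonal matrix keeps the example deterministic.
n  = 100
Mf = spdiagm(0 => fill(2.0, n + 1), 1 => fill(-1.0, n), -1 => fill(-1.0, n))

dF = Mf[1:end-1, 1:end-1]   # plain indexing copies into a SparseMatrixCSC
sigma = -0.5                # shift, chosen away from the spectrum of dF

F = lu(sigma * I - dF)      # sparse LU; this is the expensive precomputation
x = F \ ones(n)             # one shift-invert apply, reused across eigensolves
```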
I know I don't need to form the bordered system to solve with Mf; I could use a bordering strategy instead, it just happens to be slower in my use case.
If I were doing Hopf continuation, it would be the same. Actually, for PD/NS continuation (see here) it would be similar.
Maybe you could look at it the other way, and consider the Mf[1:end-1, 1:end-1] matrix to be your starting point, then either create a bigger block matrix or perform a manual solve for the system of the Newton iteration? I don't know if there are formulas for solving $\begin{pmatrix} M & u \\ v^T & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} a \\ b \end{pmatrix}$ directly?
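There is in fact a standard formula for this: block (bordered) elimination. Writing the system as M x + u y = a, v'x = b, one solves M x1 = a and M x2 = u, then y = (v'x1 - b) / (v'x2) and x = x1 - y x2. A sketch, assuming M is nonsingular and v' M⁻¹ u ≠ 0 (`solve_bordered` is an illustrative name, not an API):

```julia
using SparseArrays, LinearAlgebra

# Bordered (block elimination) solve of [M u; v' 0] [x; y] = [a; b]
# using only solves with M; one factorization serves both right-hand sides.
function solve_bordered(M, u, v, a, b)
    F  = lu(M)            # sparse LU when M is a SparseMatrixCSC
    x1 = F \ a            # M x1 = a
    x2 = F \ u            # M x2 = u
    y  = (dot(v, x1) - b) / dot(v, x2)
    return x1 - y * x2, y
end

n = 50
M = spdiagm(0 => fill(3.0, n), 1 => fill(-1.0, n - 1), -1 => fill(-1.0, n - 1))
u, v, a = rand(n), rand(n), rand(n)
b = 0.7
x, y = solve_bordered(M, u, v, a, b)
```

This is essentially the bordering strategy mentioned earlier in the thread: it trades one (n+1)×(n+1) factorization for two solves with the n×n block.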
Alternatively, you could keep both M and its principal submatrix in storage, and perform the update manually, with a dedicated method for the case where M is sparse?