Operation on transpose sparse array is slow

tiZ · May 15, 2024, 11:18pm

Hi Friends, I am searching for a better solution for speed up my script, my script accept a sparse matrix as input, (some time the sparse matrix is sample X features, and sometimes it is features X sample), I have to transpose it to make sure sample is always on column.

But I found that some operation on transpose sparse matrix will be much slow, so I wonder could it have a better solution?

X = read_mtx("file path")
# 232320×32840 SparseMatrixCSC{Float32, Int64}
# sample x feature

idx_1 = rand(Bool, 232320) # condition select by sample
idx_2 = rand(Bool, 32840) # condition select by features

@time mean(X[idx_1,idx_2])
# 3.291982 seconds (18 allocations: 1.532 GiB, 3.49% gc time)

But if I index on the transpose it will much slow, actually I wait a looong time but it is still running…

Y = X' 
# 32840×232320 LinearAlgebra.Adjoint{Float32, SparseMatrixCSC{Float32, Int64}}
# now it is  feature x sample

@time mean(Y[idx_2,idx_1])
# looooooooog time waiting and no return...

of course I can use view:

@time @views mean(X[idx_1,idx_2]);
# 30.983588 seconds (10 allocations: 1.264 MiB)

@time  @views mean(Y[idx_2,idx_1]) 
# looooooooog time waiting and no return...

I have no idea how to improve it, and I would greatly appreciate any advice!
Thank you!!!

stevengj · May 15, 2024, 11:50pm

Transposing a sparse matrix with Y = X' doesn’t actually copy any data, it wraps it with a special Adjoint type that gives a (conjugate-)transposed view of the underlying X matrix.

Adjoint views of sparse matrices have several optimized methods, e.g. Y * vector is fast, but not all of the optimized methods for sparse matrices are implemented for the Adjoint wrapper. In particular, “logical indexing” with a vector of booleans has a fast implementation for sparse matrices, but not for Adjoint wrappers thereof, and in consequence the corresponding operation on Y falls back to the generic method for dense matrices, which is very slow because it does not exploit sparsity.

Three possible workarounds:

Set Y = SparseMatrixCSC(X') — this makes an explicit sparse-matrix copy of the Adjoint wrapper, which uses up some extra memory but then subsequent operations will be fast.
Re-organize your code so that you do your calculations directly on the un-transposed matrix X
Define the missing specialized methods for Adjoint wrappers of sparse matrices. (This is “type piracy” but may be a reasonable stopgap measure in the short run.)

For example, your code above should become fast again if we define:

using LinearAlgebra, SparseArrays
Base.getindex(A::Adjoint{<:Any,<:SparseArrays.AbstractSparseMatrixCSC}, I::AbstractVector{Bool}, J::AbstractVector{Bool}) = parent(A)[findall(J),findall(I)]'

If someone wants to volunteer to make a PR to SparseArrays.jl with this and similar enhancements, that would be great. See also getindex is very slow for adjoints of sparse arrays · Issue #36 · JuliaSparse/SparseArrays.jl · GitHub and Strange performance with Adjoint structures

Topic		Replies	Views
Slow multiplication of transpose of sparse matrix Performance	5	945	April 27, 2022
Sparse transpose General Usage	8	2436	August 3, 2018
Very long time for addition of complex identity matrix to transposed sparse matrix Performance question , sparsearrays	10	190	May 31, 2025
Strange performance with Adjoint structures Performance	7	754	August 14, 2018
Trouble with sparse matrix transpose in 0.7 (and now 1.0 !?) General Usage	13	1600	August 10, 2018

Operation on transpose sparse array is slow

Related topics