Sparse transpose


#1

In regard to 0.7.0-beta2, the transpose of a SparseMatrixCSC is no longer a SparseMatrixCSC; instead it is some other object. How do I convert it to SparseMatrixCSC? Below are a couple of obvious things I tried. Both work fine for small matrices, but both take a really long time on big matrices, so I guess there must be a better way? I suppose I could manually pick apart the is,js,es, but that wasn’t necessary in 0.6.

julia> import SparseArrays.sparse

julia> import SparseArrays.spzeros

julia> x = spzeros(10000,10000)
10000×10000 SparseArrays.SparseMatrixCSC{Float64,Int64} with 0 stored entries

julia> y = sparse(x')
10000×10000 SparseArrays.SparseMatrixCSC{Float64,Int64} with 0 stored entries

julia> x = spzeros(100000,100000)
100000×100000 SparseArrays.SparseMatrixCSC{Float64,Int64} with 0 stored entries

julia> y = sparse(x')
100000×100000 SparseArrays.SparseMatrixCSC{Float64,Int64} with 0 stored entries

julia> import SparseArrays.SparseMatrixCSC

julia> x = spzeros(100000,100000)
100000×100000 SparseMatrixCSC{Float64,Int64} with 0 stored entries

julia> y = convert(SparseMatrixCSC{Float64,Int}, x')
100000×100000 SparseMatrixCSC{Float64,Int64} with 0 stored entries

#2
julia> @time SparseMatrixCSC(x')
  0.001013 seconds (11 allocations: 781.797 KiB)
100000×100000 SparseMatrixCSC{Float64,Int64} with 0 stored entries

julia> @time copy(x')
  0.001040 seconds (11 allocations: 781.797 KiB)
100000×100000 SparseMatrixCSC{Float64,Int64} with 0 stored entries

Seems like there is some problem with dispatch that makes the explicit parameter constructor slow.


#3

Here’s another sparse-transpose performance problem/regression. (The first hcat call is fast, the second is really slow.) EDIT: I can fix the second invocation to hcat by replacing y' with SparseMatrixCSC(y').

julia> import SparseArrays.spzeros

julia> x = spzeros(100000,100000)
100000×100000 SparseArrays.SparseMatrixCSC{Float64,Int64} with 0 stored entries

julia> y = spzeros(100000,100000)
100000×100000 SparseArrays.SparseMatrixCSC{Float64,Int64} with 0 stored entries

julia> m = hcat(x,y)
100000×200000 SparseArrays.SparseMatrixCSC{Float64,Int64} with 0 stored entries

julia> n = hcat(x,y')
100000×200000 SparseArrays.SparseMatrixCSC{Float64,Int64} with 0 stored entries

#4

Yes, lazy adjoint creates a lot more methods that need to be dispatched to optimized versions. Failing to do so will lead to slow execution (brutally slow for very sparse matrices). The curse of the correct but slow fallback. You can always materialize the adjoint though which is what you do when you add SparseMatrixCSC (or copy).


#5

In my opinion, this tradeoff has been decided in the wrong way. (I encountered related problems and bugs when porting my code from 0.5 to 0.6 last year.) Sparse matrix methods should never silently fall back to dense matrix methods unless the user explicitly calls convert-to-dense. For other aspects of Julia, correct-but-slow may be acceptable, but in the case of sparse matrices, the whole point of using them is performance!

Should I open an issue?


#6

About what? To make SparseMatrixCSC not an AbstractMatrix? To remove getindex from it?


#7

I meant: an issue asking someone to provide the missing methods.


#8

Yes, that is definitely a good idea.


#9

Done.