Slow arithmetic on views of sparse matrices


#1

I was profiling some code that was surprisingly slow, and it turned out that the source of the issue was a sum over a view into a large sparse matrix. Here is an MWE that recreates it:

d = sprand(Bool,10000,10000, 0.01)
e = view(d, rand(1:10000,5000), rand(1:10000,9000))

using BenchmarkTools
@benchmark sum($d, 1)
BenchmarkTools.Trial:
  memory estimate:  78.20 KiB
  allocs estimate:  2
  --------------
  minimum time:     421.354 μs (0.00% GC)
  median time:      450.828 μs (0.00% GC)
  mean time:        463.575 μs (0.00% GC)
  maximum time:     1.226 ms (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1


@benchmark sum($e, 1)
BenchmarkTools.Trial:
  memory estimate:  585.06 KiB
  allocs estimate:  51
  --------------
  minimum time:     3.286 s (0.00% GC)
  median time:      3.312 s (0.00% GC)
  mean time:        3.312 s (0.00% GC)
  maximum time:     3.339 s (0.00% GC)
  --------------
  samples:          2
  evals/sample:     1

That’s a slowdown by a factor of 7000.

Is this 1) expected behaviour, and I’m best off not using views into sparse matrices; 2) something I’m doing wrong; or 3) a bug?

EDIT: f = view(d, :, :) has the same issue.

Thanks!


#2

I would think this is expected: the sum of a sparse CSC matrix is trivial, since you can just sum the internal vector storing the non-zero values. Your sum of the view, on the other hand, uses the generic sum function: sum(A::AbstractArray, region) at reducedim.jl:320. That looks up every index in order to sum it, and for a CSC sparse matrix each lookup costs two pointer indirections, with most lookups returning zero. Thus a 7000x slowdown seems about right. I suspect that a sum specialized for random access into a sparse matrix would be hard to write and not much more efficient (although I might be wrong).
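To make the gap concrete, here is a minimal sketch (using the current SparseArrays API; sparse_sum and generic_sum are hypothetical names) contrasting a sparse-aware sum, which touches only stored entries, with dense-style indexing, which pays the per-element lookup cost:

```julia
using SparseArrays

A = sprand(Bool, 1000, 1000, 0.01)

# Sparse-aware: sum only the stored values
# (assumes no explicitly stored zeros).
sparse_sum(A::SparseMatrixCSC) = sum(nonzeros(A))

# Generic fallback: index every element; each A[i, j] must search
# the column's stored rows, and most lookups just return zero.
function generic_sum(A)
    s = 0
    for j in 1:size(A, 2), i in 1:size(A, 1)
        s += A[i, j]
    end
    return s
end

@assert sparse_sum(A) == generic_sum(A) == sum(A)
```

The sparse-aware version does O(nnz) work, while the generic one does O(m·n) lookups regardless of sparsity.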


#3

Depending on what your real use case is, if you’re trying to sum elements only within some contiguous subrange of rows and columns, it wouldn’t be hard to write a specialized routine for SparseMatrixCSC storage. Views of sparse matrices are not well optimized since they fall back to generic operations which aren’t aware of the sparsity.
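For the contiguous case, a sketch of such a routine (rangesum is a hypothetical name; it walks only the stored entries of the selected columns via the standard rowvals/nonzeros/nzrange accessors):

```julia
using SparseArrays

# Sum of A[rlo:rhi, clo:chi] without materializing the slice:
# scan each selected column's stored entries and keep those whose
# row index falls in the requested range.
function rangesum(A::SparseMatrixCSC, rlo, rhi, clo, chi)
    rv, nz = rowvals(A), nonzeros(A)
    s = zero(eltype(A))
    for j in clo:chi
        for k in nzrange(A, j)
            if rlo <= rv[k] <= rhi
                s += nz[k]
            end
        end
    end
    return s
end

A = sprand(1000, 1000, 0.01)
@assert rangesum(A, 100, 400, 250, 750) ≈ sum(A[100:400, 250:750])
```

Since the stored row indices within each column are sorted, the inner scan could also be narrowed with searchsortedfirst/searchsortedlast when columns hold many entries.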


#4

So is the recommended approach to dispense with the view and just slice the matrix instead?


#5

Slicing would make a copy, which you don’t necessarily need. Depends how sensitive you are to the performance here vs how much special-purpose code you want to write for a particular operation.


#6

Yes, I realize it’ll make a copy, but in the end I think it’ll be a lot faster. Even though the matrices are sparse, 1e6 filled elements is realistic for my use. I am sensitive to the performance, though; the issue is that the elements are never contiguous in my actual use case. I think writing the specialized code is beyond me.


#7

Do any of you know where the specialized sum code for SparseMatrixCSC is located? I can’t seem to find it in Base.


#8

Probably calls into this? https://github.com/JuliaLang/julia/blob/4dbfe4b31deacc7cb586ff0093525724382c42c9/base/sparse/sparsematrix.jl#L1594


#9

Use @edit sum(d) to look at the code; essentially it boils down to sum(d.nzval). Anyway, have a look at the docs for nzrange for an example of iterating over the non-zeros of a sparse matrix.
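For reference, the iteration pattern from the nzrange docs looks roughly like this (colsums is a hypothetical name; nzrange(A, j) gives the indices into nonzeros(A) for column j):

```julia
using SparseArrays

# Per-column sums visiting only stored entries.
function colsums(A::SparseMatrixCSC)
    nz = nonzeros(A)
    out = zeros(eltype(A), size(A, 2))
    for j in 1:size(A, 2), k in nzrange(A, j)
        out[j] += nz[k]
    end
    return out
end

A = sprand(1000, 1000, 0.05)
@assert colsums(A) ≈ vec(sum(A, dims=1))
```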


#10

If you use a matrix product instead of a view to compute the sum, it’ll be fast.

d = sprand(Bool,10000,10000, 0.01)
i = rand(1:10000,5000)
j = rand(1:10000,9000)
e = view(d, i, j)

s = sum(e,1)
t = (sparse(ones(Int, length(i)), i, ones(Int, length(i)), 1, 10000) * d)[j]
@assert s[:] == t

#11

That’s plain amazing :)


#12

I’ve opened a feature request for the general functionality: https://github.com/JuliaLang/julia/issues/21796


#13

The real problem is one of iteration—you want to visit just the relevant entries in an arbitrary AbstractArray. https://github.com/timholy/ArrayIteration.jl might be a solution some day, but it needs some compiler improvements to be blazingly fast. (See https://github.com/JuliaLang/julia/issues/21796#issuecomment-300789568 for more information.)

It still might be worth playing with—in particular, it might be worth “finishing” (I haven’t had the time lately) and registering as a package, even if it’s not as fast as it could be someday. If anyone wants to join the fun, I’d be happy to help coach folks through anything that’s not clear.