Why do operations on the return of @views take a long time?

Hi there,

In this scenario, I have a large sparse matrix, and I want to filter out certain rows and columns based on specific conditions, such as:

nrow = 10000
ncol = 10000
M = sprand(nrow,ncol,0.0001)
rowIdx = rand(Bool,nrow)
colIdx = rand(Bool,ncol)

@time result = M[rowIdx,colIdx];
 0.014542 seconds (11 allocations: 156.281 KiB)
@time @views result2 = M[rowIdx,colIdx];
 0.000140 seconds (5 allocations: 78.547 KiB)

I understand that using @views significantly speeds up indexing. However, any operation performed on the results of @views tends to be much slower.

@time sum(result);
  0.000065 seconds (1 allocation: 16 bytes)
@time sum(result2);
  0.050533 seconds (1 allocation: 16 bytes)

Why is the sum operation on result2 so slow? And do you have any suggestions to improve the speed?

Thanks a lot!

Well, a view redirects memory accesses to the elements of the underlying matrix. This means creating a view is fast, but accessing the elements through a view has some overhead.

2 Likes

In particular, the performance you get out of views directly stems from the arrangement (and types!) of the arrays and indices you use. This is perhaps a worst-case scenario on both counts.

Sparse arrays “know” which elements are nonzero and can sum them very fast — as long as they’re not shuffled up with an indirection! There’s a correlation between the time it takes to create the copied subset (which does work ahead of time to preserve that knowledge) and the time it takes to do that sum on the view. Both are doing non-trivial work to figure out where the nonzeros are.
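To make that first point concrete, here’s a minimal sketch (the sizes are illustrative) showing that summing a SparseMatrixCSC only needs to visit its stored values, without ever touching the implicit zeros:

```julia
using SparseArrays

M = sprand(10_000, 10_000, 0.0001)

# The implicit zeros contribute nothing to the sum, so summing the
# stored values gives the same answer as summing the whole matrix.
sum(M) ≈ sum(nonzeros(M))  # true
```

A view breaks exactly this shortcut: `sum` over the view falls back to element-by-element access, losing the knowledge of where the nonzeros are.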

6 Likes

Is this representative of what you actually want to do with the sparse matrix M (sum a rectangular submatrix)? If so, you’re much better off leveraging the actual content of the SparseMatrixCSC struct, documented here.

using SparseArrays, BenchmarkTools

submatrix_sum_slow(M, rowIdx, colIdx) = sum(@view M[rowIdx, colIdx])
submatrix_sum_medium(M, rowIdx, colIdx) = sum(M[rowIdx, colIdx])

function submatrix_sum_fast(M::SparseMatrixCSC, rowIdx, colIdx)
    nz = nonzeros(M)
    rv = rowvals(M)
    s = zero(eltype(M))
    for j in axes(M, 2)  # j iterates over columns
        colIdx[j] || continue
        for k in nzrange(M, j)  # stored entries of column j
            i = rv[k]
            rowIdx[i] || continue
            s += nz[k]
        end
    end
    return s
end

You get roughly a 700x speedup over the view-based version:

julia> submatrix_sum_slow(M, rowIdx, colIdx) == submatrix_sum_fast(M, rowIdx, colIdx)
true

julia> @btime submatrix_sum_slow($M, $rowIdx, $colIdx);
  59.379 ms (4 allocations: 78.78 KiB)

julia> @btime submatrix_sum_medium($M, $rowIdx, $colIdx);
  10.391 ms (10 allocations: 157.73 KiB)

julia> @btime submatrix_sum_fast($M, $rowIdx, $colIdx);
  83.589 μs (0 allocations: 0 bytes)
2 Likes

Thank you for your help! Yes, in the case where it is still a sparse array, your suggestion is excellent!

I now understand that the issue in my scenario arises from the need to operate on a subset of the sparse array. When using @views, the result is no longer a plain sparse array but a view wrapping it, which makes subsequent operations behave more like those on dense data (my guess; it may not be accurate in detail).

So, it seems that if I cannot avoid creating a subset from a sparse array, after performing a series of subset operations, it would be better to convert the result back to a sparse format?
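For what it’s worth, plain indexing (without @views) already hands you back a sparse matrix, so an explicit conversion is only needed after you’ve been working through views. A quick sketch (sizes illustrative):

```julia
using SparseArrays

M = sprand(1_000, 1_000, 0.001)
rowIdx = rand(Bool, 1_000)
colIdx = rand(Bool, 1_000)

result  = M[rowIdx, colIdx]        # copy: still a SparseMatrixCSC
result2 = @view M[rowIdx, colIdx]  # view: a SubArray wrapping M

result  isa SparseMatrixCSC  # true
result2 isa SubArray         # true

# If you do end up with a view, sparse(...) materializes it
# back into sparse form (at the cost of walking the view).
sparse(result2) isa SparseMatrixCSC  # true
```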

It really depends on what kind of operations you need to perform on the subset, as well as the dimensions and sparsity of the array. Do you have a slightly more realistic MWE to share?

It seems the result is not a dense array, if that’s what you mean (I at least see “view(::SparseMatrixCSC{Float64, Int64}” …). A view on a dense array is fast (not quite as fast as working on a copy, I think, but it skips the time needed to make that copy). A view on a sparse array is comparatively much slower (and making a dense copy of the subarray would be even slower).

So I’m wondering: should the docs on @view and @views mention that they are slow for some array types, such as sparse arrays?

I don’t think disallowing views on sparse arrays would be nice, since it would be a breaking change, and also because we want to be able to write generic code that works on all array types. Copies/non-views are also nice since they prevent (accidental) mutation. I could have seen views on sparse arrays actually making a sparse copy that just disallows mutation…

This was all probably intentional — or, if you were redoing it, would you have disallowed views on sparse matrices, sacrificing generic code?

1 Like

No, the really bad thing is that subsequent operations use algorithms intended for dense data, but apply them to sparse data whose scalar getindex is super duper O(log(N))-slow.

Because nobody was/is motivated to write & maintain e.g. sum-specializations for View{SparseMatrixCSC...}
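For intuition, here is roughly what that scalar getindex costs on a SparseMatrixCSC: a binary search among the stored row indices of one column. (This is a hypothetical re-implementation for illustration, not the actual library code.)

```julia
using SparseArrays

# Sketch of scalar indexing M[i, j] on a SparseMatrixCSC:
# binary-search the stored row indices of column j.
function scalar_getindex(M::SparseMatrixCSC, i::Integer, j::Integer)
    rv, nz = rowvals(M), nonzeros(M)
    r = nzrange(M, j)                      # stored entries of column j
    k = searchsortedfirst(rv, i, first(r), last(r), Base.Order.Forward)
    (k <= last(r) && rv[k] == i) ? nz[k] : zero(eltype(M))
end
```

A dense-style algorithm doing this per element pays that logarithmic search N² times, which is where the slowdown in the benchmarks above comes from.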

PS. I think a seriously missing functionality is getindex_dense(M_sparse, rows_sorted, cols_sorted) that extracts a submatrix from a sparse matrix, in dense format. This oversight / missing functionality is not unique to julia, it is imo especially glaring for CuSparse!

(You can write a short, slow bit of code for that, but you really want to write ~10 different versions using SIMD for different CPU flavors. This is a problem that rewards non-naive code.)
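A naive version of that short, slow bit of code might look like the sketch below (`getindex_dense` is the hypothetical name from the post; no SIMD effort is made here):

```julia
using SparseArrays

# Naive sketch: extract M[rows_sorted, cols_sorted] directly into a
# dense Matrix, touching only the stored entries of the selected columns.
function getindex_dense(M::SparseMatrixCSC, rows_sorted, cols_sorted)
    # map original row index -> output row (0 means "row not selected")
    rowpos = zeros(Int, size(M, 1))
    for (a, i) in pairs(rows_sorted)
        rowpos[i] = a
    end
    out = zeros(eltype(M), length(rows_sorted), length(cols_sorted))
    rv, nz = rowvals(M), nonzeros(M)
    for (b, j) in pairs(cols_sorted)
        for k in nzrange(M, j)          # stored entries of column j
            a = rowpos[rv[k]]
            a == 0 || (out[a, b] = nz[k])
        end
    end
    return out
end
```

This visits each stored entry of the selected columns exactly once instead of binary-searching per output element, but a serious implementation would want SIMD-friendly, architecture-tuned inner loops as described above.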

2 Likes