Well, a view redirects memory access to the elements of the underlying matrix. This means creating a view is fast, but accessing the elements of a view has some overhead.
In particular, the performance you get out of views directly stems from the arrangement (and types!) of the arrays and indices you use. This is perhaps a worst-case scenario on both counts.
Sparse arrays “know” which elements are nonzero and can sum them very fast — as long as they’re not shuffled up with an indirection! There’s a correlation between the time it takes to create the copied subset (which does work ahead of time to preserve that knowledge) and the time it takes to compute that sum on the view. Both are doing non-trivial work to figure out where the nonzeros are.
Is this representative of what you actually want to do with the sparse matrix M (sum a rectangular submatrix)? If so, you’re much better off leveraging the actual content of the SparseMatrixCSC struct, documented here.
using SparseArrays, BenchmarkTools
submatrix_sum_slow(M, rowIdx, colIdx) = sum(@view M[rowIdx, colIdx])
submatrix_sum_medium(M, rowIdx, colIdx) = sum(M[rowIdx, colIdx])
function submatrix_sum_fast(M::SparseMatrixCSC, rowIdx, colIdx)
    nz = nonzeros(M)
    rv = rowvals(M)
    s = zero(eltype(M))
    for j in axes(M, 2)          # iterate over columns (CSC is column-major)
        colIdx[j] || continue    # rowIdx/colIdx are Bool masks
        for k in nzrange(M, j)   # stored entries in column j
            i = rv[k]
            rowIdx[i] || continue
            s += nz[k]
        end
    end
    return s
end
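For reference, here's a quick sanity check that the copying and view-based baselines agree on random data, using `Bool` masks (which is what `submatrix_sum_fast` assumes):

```julia
using SparseArrays

M = sprand(1_000, 1_000, 0.01)
rowIdx = rand(Bool, 1_000)   # logical masks selecting the submatrix
colIdx = rand(Bool, 1_000)

s_copy = sum(M[rowIdx, colIdx])         # materializes a sparse submatrix first
s_view = sum(@view M[rowIdx, colIdx])   # sums element-by-element through the view
s_copy ≈ s_view
```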
Thank you for your help! Yes, in the case where it is still a sparse array, your suggestion is excellent!
I now understand the issue in my scenario arises from the need to operate on a subset of the sparse array. When using @views, the result is no longer a sparse array, which makes subsequent operations behave more like those on dense data (my guess; it may not be accurate in detail).
So, it seems that if I cannot avoid creating a subset from a sparse array, after performing a series of subset operations, it would be better to convert the result back to a sparse format?
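To make the question concrete, here is a minimal sketch of what I mean (the specific operation is just a placeholder):

```julia
using SparseArrays

A = sprand(100, 100, 0.2)
sub = Matrix(@view A[1:50, 1:50])   # dense intermediate after taking a subset
sub[sub .< 0.5] .= 0.0              # some operation that zeroes many entries
S = sparse(sub)                     # back to sparse; zeros are not stored
```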
It really depends what kind of operations you need to perform on the subset, as well as the dimensions and sparsity of the array. Do you have a slightly more realistic MWE to share?
It seems the result is not a dense array, if that’s what you mean (I at least see “view(::SparseMatrixCSC{Float64, Int64}” …). A view of a dense array is fast (accessing it is not quite as fast as accessing a copy, but almost, I think; you just skip the cost of making the copy). A view of a sparse array is comparatively much slower (and a dense copy of the subarray would be even slower).
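A quick way to check the types involved (a sketch; the point is that neither the view nor the copy is dense):

```julia
using SparseArrays

M = sprand(100, 100, 0.05)
v = @view M[1:50, 1:50]   # a SubArray wrapping the SparseMatrixCSC
c = M[1:50, 1:50]         # plain indexing on a sparse matrix copies sparsely
```

So the slowdown from the view is not because anything was densified; the view is still backed by the sparse parent.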
So I’m thinking: should the docs on @view and @views mention that they are slow for some array types, such as sparse arrays?
I don’t think disallowing views on sparse arrays would be nice, since it would be a breaking change, and also because we want to be able to write generic code that works on all array types. Copies/non-views are also nice since they prevent (accidental) mutation. I could have seen views on sparse arrays actually making a sparse copy that just disallows mutation…
This was all probably intentional; or, if you were redoing it, would you have disallowed views on sparse matrices, sacrificing generic code?
No, the really bad thing is that subsequent operations use algorithms intended for dense data, but apply them to sparse data whose getindex is super slow (O(log N) per element).
Because nobody was/is motivated to write & maintain e.g. sum-specializations for View{SparseMatrixCSC...}
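For illustration, here is a sketch of what such a specialization could look like (the name `sparse_view_sum` is mine, not Base’s, and it assumes the view’s indices are ranges or sorted index vectors, not Bool masks):

```julia
using SparseArrays

# Sum a view of a SparseMatrixCSC by walking the parent's stored entries,
# instead of calling the slow generic getindex once per element.
function sparse_view_sum(v::SubArray{<:Any,2,<:SparseMatrixCSC})
    M = parent(v)
    rows, cols = parentindices(v)
    nz, rv = nonzeros(M), rowvals(M)
    s = zero(eltype(M))
    for j in cols                 # only visit selected columns
        for k in nzrange(M, j)    # stored entries in column j
            rv[k] in rows || continue
            s += nz[k]
        end
    end
    return s
end
```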
PS. I think a seriously missing functionality is getindex_dense(M_sparse, rows_sorted, cols_sorted), which would extract a submatrix from a sparse matrix in dense format. This oversight / missing functionality is not unique to Julia; it is imo especially glaring for CuSparse!
(You can write a short, slow bit of code for that, but you really want 10 different versions using SIMD for different CPU flavors. This is a problem that rewards non-naive code.)