Zero elements in sparse matrix

How can I efficiently find the indices of the zero elements in a sparse matrix? My matrix is 200k by 200k.

Can you instead find the indices of nonzero elements (since there are way fewer of them)?


That one is easy: just use the findnz function.
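For reference, a minimal sketch of what findnz returns, using a small made-up diagonal matrix (the matrix and the variable names are only for illustration):

```julia
using SparseArrays

# Small example matrix with three stored entries on the diagonal
X = sparse([1, 2, 3], [1, 2, 3], [1.0, 2.0, 3.0])

# findnz returns three vectors: row indices, column indices, and stored values
rows, cols, vals = findnz(X)
```

Each of the three vectors has one element per stored entry, so the memory needed scales with the number of stored entries rather than with the full matrix size.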

The issue is that your 200k*200k matrix has 40 billion entries. You’d need an unusually large amount of RAM to even store the indices of the zeros (roughly 320GB for Int32 coordinates), assuming the zeros made up a significant fraction of the matrix (which they must, or else the matrix itself would be hundreds of GB).
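The 320GB figure can be checked with a little arithmetic. This is only a back-of-the-envelope sketch: it assumes one (row, column) pair of Int32 values per entry and treats almost every entry as a zero, so it is an upper bound.

```julia
n = 200_000
entries = n^2                            # 40 billion entries in total
bytes = entries * 2 * sizeof(Int32)      # two 4-byte coordinates per entry
gigabytes = bytes / 1e9                  # about 320 GB
```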

findall(iszero, X) will find them for you, but for the reasons above it is likely to fail on a matrix this large.

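You can make an iterator to loop over the zero entries:

using SparseArrays

X = sparse(1:3, 1:3, 1.0)  # 3x3 matrix with ones on the diagonal
# lazy generator: yields each index whose entry is zero, without storing them all
itr = (I for I in eachindex(X) if iszero(X[I]))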

collect(itr) # *for demonstration purposes only - do not collect the iterator*
# 6-element Vector{CartesianIndex{2}}:
#  CartesianIndex(2, 1)
#  CartesianIndex(3, 1)
#  CartesianIndex(1, 2)
#  CartesianIndex(3, 2)
#  CartesianIndex(1, 3)
#  CartesianIndex(2, 3)

Again, do not collect the iterator for your large matrix. Your RAM is unlikely to be big enough. Just use the iterator in a for loop or whatever else it was you were planning to do. Or just loop over eachindex directly and only do the work when you find a zero.
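If the X[I] lookups in that loop turn out to be the bottleneck, one alternative is to walk the compressed column storage directly and visit the gaps between stored entries. The following is only a sketch: foreach_zero is a hypothetical helper name, not part of SparseArrays, and it assumes X is a SparseMatrixCSC whose row indices are sorted within each column (the usual invariant).

```julia
using SparseArrays

# Hypothetical helper: call f(i, j) for every position (i, j) where X is zero.
function foreach_zero(f, X::SparseMatrixCSC)
    rows = rowvals(X)      # row index of each stored entry
    vals = nonzeros(X)     # value of each stored entry
    m, n = size(X)
    for j in 1:n
        next_row = 1                       # first row of column j not yet visited
        for k in nzrange(X, j)             # stored entries of column j
            i = rows[k]
            for r in next_row:i-1          # rows with no stored entry are zeros
                f(r, j)
            end
            iszero(vals[k]) && f(i, j)     # a stored entry may still be zero
            next_row = i + 1
        end
        for r in next_row:m                # rows after the last stored entry
            f(r, j)
        end
    end
    return nothing
end

X = sparse(1:3, 1:3, 1.0)
nzeros = Ref(0)
foreach_zero((i, j) -> nzeros[] += 1, X)   # count the zeros without storing them
```

This still visits every zero, so on a 200k by 200k matrix it remains a loop over tens of billions of positions. It only removes the per-element lookup cost.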

But more fundamentally, you should see if you can revise what it is you’re trying to do. Doing something with the zeros of a sparse matrix is the opposite of how you should usually try to operate. And looping over 40 billion entries will take a decent amount of time even if you do very little work with each one.
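As one illustration of rephrasing a zero-based question in terms of the stored entries: if all that is needed is the number of zeros, it can be computed from the stored values alone. This is a sketch on a made-up 3x3 matrix, and whether it applies depends on what you actually need the zero indices for.

```julia
using SparseArrays

# Example matrix; one of the supplied values is an explicit zero
X = sparse([1, 2, 3], [1, 2, 3], [1.0, 0.0, 3.0])

# Count the entries that are truly nonzero by scanning only the stored values
true_nonzeros = count(!iszero, nonzeros(X))

# Everything else in the matrix is zero
n_zeros = length(X) - true_nonzeros
```

The work here is proportional to the number of stored entries, not to the 40 billion positions of the full matrix.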