a = sparse(b)
rowvals(a)
rowvals only gives the row indices, not the column indices.
How can I get the row and column indices as pairs, the way they are displayed when the whole matrix is printed?
Have a look at this and the body of the involved functions (use @edit):
help?> nzrange
search: nzrange

  nzrange(A::SparseMatrixCSC, col::Integer)

  Return the range of indices to the structural nonzero values of a sparse matrix column. In conjunction with nonzeros and rowvals, this allows for convenient iterating over a sparse matrix:

      A = sparse(I,J,V)
      rows = rowvals(A)
      vals = nonzeros(A)
      m, n = size(A)
      for i = 1:n
          for j in nzrange(A, i)
              row = rows[j]
              val = vals[j]
              # perform sparse wizardry...
          end
      end
Yes, I found that later. But I didn’t quite understand how it gives the i,j location pair of the element as is desired.
Loop over the columns as j, and nzrange(A, j) gives you the rows for column j which have nonzeros.
No, it gives you the indices into A.rowval, which contains the rows, e.g.

julia> A = [1 0 0;
            1 8 7;
            2 9 0] |> sparse;

julia> A.rowval[nzrange(A, 2)]
2-element Array{Int64,1}:
 2
 3

julia> A.rowval[nzrange(A, 3)]
1-element Array{Int64,1}:
 2
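To connect this back to the original question, here is a minimal sketch of how nzrange and rowvals can be combined to produce (row, column) pairs. The helper name stored_pairs is made up for illustration and is not part of SparseArrays; the `using SparseArrays` line assumes Julia 1.x, where the sparse functionality lives in that standard library.

```julia
using SparseArrays

# Hypothetical helper (not part of SparseArrays): collect the (row, col)
# location of every stored entry by walking the CSC structure column by column.
function stored_pairs(A::SparseMatrixCSC)
    rows = rowvals(A)
    out = Tuple{Int,Int}[]
    for col in 1:size(A, 2)
        # k is an index into rowvals(A) / nonzeros(A), not a row number
        for k in nzrange(A, col)
            push!(out, (rows[k], col))
        end
    end
    return out
end

A = sparse([1 0 0; 1 8 7; 2 9 0])
locations = stored_pairs(A)
```

The pairs come out in column-major order, because that is the order in which a CSC matrix stores its entries.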
I never get this right… this can really be improved. I opened an issue to see whether this can be improved, and also whether it can be standardized beyond SparseMatrixCSC.
Sparse matrices are not well suited to indexing with Cartesian indices, and if you want to apply something to all items, why not use higher-level functions like map?
Working efficiently with sparse matrices will in many cases mean working directly with the internal representation.
Maybe you can just multiply by the vector (0,0,...0,1,0,...,0)
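If I read this suggestion correctly, the idea is that multiplying by a unit vector picks out a single column. A small sketch, assuming Julia 1.x with SparseArrays and an arbitrarily chosen column 2:

```julia
using SparseArrays

A = sparse([1 0 0; 1 8 7; 2 9 0])

# Unit vector with a 1 in position 2 (illustrative choice of column)
e = zeros(Int, size(A, 2))
e[2] = 1

# The product is a dense vector holding column 2 of A
col = A * e
```

Note that this returns the values of one column, zeros included, rather than the (row, column) locations of the stored entries.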
Sorry, I don’t understand.
Who asked to index sparse matrices with a Cartesian? I specifically asked for efficient iterators to avoid looping through non-zero elements with Cartesians. Of course those iterators have to be specialized to the internal representation to be efficient, but that's well within the realm of what's possible if the iterator could return some indexing type. I am not asking for something to write matrix multiplications with; of course that needs to be really specialized to the sparse array. But I think that the ability to write a loop over the non-zero elements without having to make it a map would be helpful in a lot of cases.
Agreed. One reason is the growing use of sparse representations in ML. Performing quick iterations is necessary.
Anyway, the answer to the question in OP is:
julia> A = sprand(5, 5, 0.2);

julia> I, J, V = findnz(A);

julia> indices = collect(zip(I,J));

julia> indices
6-element Array{Tuple{Int64,Int64},1}:
 (1, 1)
 (2, 1)
 (3, 2)
 (3, 4)
 (1, 5)
 (5, 5)

julia> V
6-element Array{Float64,1}:
 0.0129232
 0.187362
 0.713473
 0.595695
 0.415127
 0.640516
Yes! That works. Thank you @kristoffer.carlsson!
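For anyone reading this on a newer Julia, here is a small deterministic variant of the same idea. It assumes Julia 1.x, where sparse and findnz need `using SparseArrays`; the variable names are my own, and the CartesianIndex line is just one possible way to package the pairs.

```julia
using SparseArrays   # in Julia 1.x, sparse and findnz come from this stdlib

A = sparse([1 0 0; 1 8 7; 2 9 0])

# findnz returns the row indices, column indices and values of the stored entries
rows, cols, vals = findnz(A)

index_pairs = collect(zip(rows, cols))   # vector of (row, col) tuples
cart = CartesianIndex.(rows, cols)       # same locations as CartesianIndex{2}
```

Each element of cart can be used directly as an index, so A[cart[k]] gives back vals[k].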
And here is an iterator that gives the cartesian indices and the values:
struct SparseMatrixCSC_StoredValuesIterator{T}
    A::T
end

Base.start(S::SparseMatrixCSC_StoredValuesIterator) = 1, 1

function Base.next(S::SparseMatrixCSC_StoredValuesIterator, state)
    i, col = state
    while i > S.A.colptr[col+1] - 1
        col += 1
    end
    return (S.A.rowval[i], col, S.A.nzval[i]), (i + 1, col)
end

function Base.done(S::SparseMatrixCSC_StoredValuesIterator, state)
    i, col = state
    return i > S.A.colptr[end] - 1
end

stored_indvals(S::SparseMatrixCSC) = SparseMatrixCSC_StoredValuesIterator(S)
Used as:
julia> A = sprand(6, 6, 0.4);

julia> full(A)
6×6 Array{Float64,2}:
 0.0  0.0        0.0      0.056908  0.0       0.0
 0.0  0.328863   0.0      0.0       0.522739  0.0
 0.0  0.0        0.0      0.0       0.559482  0.498087
 0.0  0.0360203  0.0      0.302467  0.0       0.111877
 0.0  0.489175   0.77902  0.117499  0.0       0.0
 0.0  0.513361   0.0      0.801586  0.0       0.0582955

julia> for (i, j, v) in stored_indvals(A)
           println("A[$i, $j] = $v")
       end
A[2, 2] = 0.32886263776211333
A[4, 2] = 0.03602034051160041
A[5, 2] = 0.489174826870322
A[6, 2] = 0.5133613097745331
A[5, 3] = 0.7790200708345987
A[1, 4] = 0.0569080303985654
A[4, 4] = 0.30246686231168973
A[5, 4] = 0.11749932772395755
A[6, 4] = 0.8015857984258852
A[2, 5] = 0.5227388963966788
A[3, 5] = 0.5594820109904008
A[3, 6] = 0.49808711016882445
A[4, 6] = 0.1118766615390987
A[6, 6] = 0.05829550762538971
Note: I haven't tested this one properly, so it might have an off-by-one error or fail on other edge cases.
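The iterator above uses the start/next/done protocol, which was replaced by Base.iterate in Julia 1.0. A rough sketch of the same idea on the newer protocol might look like this; the type name StoredEntries is hypothetical, and the code goes through rowvals, nonzeros and nzrange instead of the raw fields.

```julia
using SparseArrays

# Hypothetical wrapper type (not part of SparseArrays)
struct StoredEntries{Tv,Ti}
    A::SparseMatrixCSC{Tv,Ti}
end

# State is (k, col): k indexes into rowvals(A)/nonzeros(A), col is the current column
function Base.iterate(S::StoredEntries, state=(1, 1))
    k, col = state
    A = S.A
    k > nnz(A) && return nothing
    # Skip columns whose stored range ends before k; this also skips empty columns
    while k > last(nzrange(A, col))
        col += 1
    end
    return (rowvals(A)[k], col, nonzeros(A)[k]), (k + 1, col)
end

Base.length(S::StoredEntries) = nnz(S.A)
Base.eltype(::Type{StoredEntries{Tv,Ti}}) where {Tv,Ti} = Tuple{Ti,Int,Tv}

# Column 2 is empty on purpose, to exercise the column-skipping loop
A = sparse([0 0 3; 1 0 0; 0 0 4])
entries = collect(StoredEntries(A))
```

It can be used in a loop the same way as before, e.g. `for (i, j, v) in StoredEntries(A)`.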
Check out https://github.com/timholy/ArrayIteration.jl which has an awesome proof of concept.