I sometimes have the need to construct a sparse matrix from a dense matrix with a fair number of numerical zeros. Currently, I have to duplicate the matrix in as SparseMatrixCSC with sparse() and then drop the zeros with dropzeros!. For large matrices with large numbers of numerical zeros this can take a lot of memory. Has anybody tackled this?
SC
But sparse doesn’t keep the numerical zeros from the dense matrix?
It does.
julia> A=sparse([1e-1 1e-20; 2 3])
2×2 SparseMatrixCSC{Float64,Int64} with 4 stored entries:
[1, 1] = 0.1
[2, 1] = 2.0
[1, 2] = 1.0e-20
[2, 2] = 3.0
dropzeros won’t drop the zeros either, because it filters only for hard zeros. Instead I’ll have to use Base.SparseArrays.fkeep!(sparse(A),(x,y,z) -> z>eps(typeof(z))) or some such thing.
Ok, now I see what you mean. Try.
function to_sparse(A, eps)
I, J, V = Int[], Int[], Float64[]
for i in size(A, 1), j in size(A,2)
v = A[i, j]
if v > eps
push!(I, i)
push!(J, j)
push!(V, v)
end
end
return sparse(I, J, V)
end
Actually, no need to round trip via IJV
function to_sparse(A, eps)
colptr = zeros(Int, size(A, 2) + 1)
nzval = Float64[]
rowval = Int[]
colptr[1] = 1
cnt = 1
for i in 1:size(A, 1)
for j in 1:size(A,2)
v = A[i, j]
if v > eps
push!(rowval, j)
push!(nzval, v)
cnt += 1
end
end
colptr[i+1] = cnt
end
return SparseMatrixCSC(size(A, 1), size(A, 2), colptr, rowval, nzval)
end
It might be petter performance to count how many stored elements you have a priori and allocate arrays with correct size up front.
That’s good, thanks! I’ve taken that and amended it by what fkeep! does
to
function to_sparse(A, f)
colptr = zeros(Int, size(A, 2) + 1)
nzval = Float64[]
rowval = Int[]
colptr[1] = 1
cnt = 1
for i in 1:size(A, 1)
for j in 1:size(A,2)
v = A[i, j]
if f(i,j,v)
push!(rowval, j)
push!(nzval, v)
cnt += 1
end
end
colptr[i+1] = cnt
end
return SparseMatrixCSC(size(A, 1), size(A, 2), colptr, rowval, nzval)
end
in case I want to filter by other criteria during instantiation.
I made some noobie mistakes in my last post.
function tosparse(A, f)
nz = count(!equalto(0), A)
colptr = zeros(Int, size(A, 2) + 1)
nzval = Vector{Float64}(uninitialized, nz)
rowval = Vector{Int}(uninitialized, nz)
colptr[1] = 1
cnt = 1
@inbounds for j in 1:size(A,2)
for i in 1:size(A, 1)
v = A[i, j]
if f(i, j, v)
rowval[cnt] = i
nzval[cnt] = v
cnt += 1
end
end
colptr[j+1] = cnt
end
return SparseMatrixCSC(size(A, 1), size(A, 2), colptr, rowval, nzval)
end
Should be much better (traverses the matrix in the correct order)
Great use of count! Since we’re using the filter f, we need to modify count to
using Base.Cartesian
nz = 0
@nloops 2 i A begin
nz += f(i_1,i_2,A[i_1,i_2])
end