Sparse instantiation from dense with simultaneous dropzeros

question

#1

I sometimes have the need to construct a sparse matrix from a dense matrix with a fair number of numerical zeros. Currently, I have to duplicate the matrix in as SparseMatrixCSC with sparse() and then drop the zeros with dropzeros!. For large matrices with large numbers of numerical zeros this can take a lot of memory. Has anybody tackled this?

SC


#2

But sparse doesn’t keep the numerical zeros from the dense matrix?


#3

It does.

julia> A=sparse([1e-1 1e-20; 2 3])
2×2 SparseMatrixCSC{Float64,Int64} with 4 stored entries:
  [1, 1]  =  0.1
  [2, 1]  =  2.0
  [1, 2]  =  1.0e-20
  [2, 2]  =  3.0

dropzeros won’t drop the zeros either, because it filters only for hard zeros. Instead I’ll have to use Base.SparseArrays.fkeep!(sparse(A),(x,y,z) -> z>eps(typeof(z))) or some such thing.


#4

Ok, now I see what you mean. Try.

function to_sparse(A, eps)
    I, J, V = Int[], Int[], Float64[]
    for i in size(A, 1), j in size(A,2)
        v = A[i, j]
        if v > eps
            push!(I, i)
            push!(J, j)
            push!(V, v)
        end
    end
    return sparse(I, J, V)
end

#5

Actually, no need to round trip via IJV

function to_sparse(A, eps)
    colptr = zeros(Int, size(A, 2) + 1)
    nzval = Float64[]
    rowval = Int[]
    colptr[1] = 1
    cnt = 1
    for i in 1:size(A, 1)
        for j in 1:size(A,2)
            v = A[i, j]
            if v > eps
                push!(rowval, j)
                push!(nzval, v)
                cnt += 1
            end
        end
        colptr[i+1] = cnt
    end
    return SparseMatrixCSC(size(A, 1), size(A, 2), colptr, rowval, nzval)
end

It might be petter performance to count how many stored elements you have a priori and allocate arrays with correct size up front.


#6

That’s good, thanks! I’ve taken that and amended it by what fkeep! does
to

function to_sparse(A, f)
    colptr = zeros(Int, size(A, 2) + 1)
    nzval = Float64[]
    rowval = Int[]
    colptr[1] = 1
    cnt = 1
    for i in 1:size(A, 1)
        for j in 1:size(A,2)
            v = A[i, j]
            if f(i,j,v)
                push!(rowval, j)
                push!(nzval, v)
                cnt += 1
            end
        end
        colptr[i+1] = cnt
    end
    return SparseMatrixCSC(size(A, 1), size(A, 2), colptr, rowval, nzval)
end

in case I want to filter by other criteria during instantiation.


#7

I made some noobie mistakes in my last post.

function tosparse(A, f)
    nz = count(!equalto(0), A)
    colptr = zeros(Int, size(A, 2) + 1)
    nzval = Vector{Float64}(uninitialized, nz)
    rowval = Vector{Int}(uninitialized, nz)
    colptr[1] = 1
    cnt = 1
    @inbounds for j in 1:size(A,2)
        for i in 1:size(A, 1)
            v = A[i, j]
            if f(i, j, v)
                rowval[cnt] = i
                nzval[cnt] = v
                cnt += 1
            end
        end
        colptr[j+1] = cnt
    end
    return SparseMatrixCSC(size(A, 1), size(A, 2), colptr, rowval, nzval)
end

Should be much better (traverses the matrix in the correct order)


#8

Great use of count! Since we’re using the filter f, we need to modify count to

using Base.Cartesian
nz = 0
@nloops 2 i A begin
   nz += f(i_1,i_2,A[i_1,i_2])
end

#9

Oh yeah, good point.