Sparse instantiation from dense with simultaneous dropzeros

I sometimes have the need to construct a sparse matrix from a dense matrix with a fair number of numerical zeros. Currently, I have to duplicate the matrix in as SparseMatrixCSC with sparse() and then drop the zeros with dropzeros!. For large matrices with large numbers of numerical zeros this can take a lot of memory. Has anybody tackled this?

SC

But sparse doesn’t keep the numerical zeros from the dense matrix?

It does.

julia> A=sparse([1e-1 1e-20; 2 3])
2×2 SparseMatrixCSC{Float64,Int64} with 4 stored entries:
  [1, 1]  =  0.1
  [2, 1]  =  2.0
  [1, 2]  =  1.0e-20
  [2, 2]  =  3.0

dropzeros won’t drop the zeros either, because it filters only for hard zeros. Instead I’ll have to use Base.SparseArrays.fkeep!(sparse(A),(x,y,z) -> z>eps(typeof(z))) or some such thing.

Ok, now I see what you mean. Try.

function to_sparse(A, eps)
    I, J, V = Int[], Int[], Float64[]
    for i in size(A, 1), j in size(A,2)
        v = A[i, j]
        if v > eps
            push!(I, i)
            push!(J, j)
            push!(V, v)
        end
    end
    return sparse(I, J, V)
end

Actually, no need to round trip via IJV

function to_sparse(A, eps)
    colptr = zeros(Int, size(A, 2) + 1)
    nzval = Float64[]
    rowval = Int[]
    colptr[1] = 1
    cnt = 1
    for i in 1:size(A, 1)
        for j in 1:size(A,2)
            v = A[i, j]
            if v > eps
                push!(rowval, j)
                push!(nzval, v)
                cnt += 1
            end
        end
        colptr[i+1] = cnt
    end
    return SparseMatrixCSC(size(A, 1), size(A, 2), colptr, rowval, nzval)
end

It might be petter performance to count how many stored elements you have a priori and allocate arrays with correct size up front.

That’s good, thanks! I’ve taken that and amended it by what fkeep! does
to

function to_sparse(A, f)
    colptr = zeros(Int, size(A, 2) + 1)
    nzval = Float64[]
    rowval = Int[]
    colptr[1] = 1
    cnt = 1
    for i in 1:size(A, 1)
        for j in 1:size(A,2)
            v = A[i, j]
            if f(i,j,v)
                push!(rowval, j)
                push!(nzval, v)
                cnt += 1
            end
        end
        colptr[i+1] = cnt
    end
    return SparseMatrixCSC(size(A, 1), size(A, 2), colptr, rowval, nzval)
end

in case I want to filter by other criteria during instantiation.

I made some noobie mistakes in my last post.

function tosparse(A, f)
    nz = count(!equalto(0), A)
    colptr = zeros(Int, size(A, 2) + 1)
    nzval = Vector{Float64}(uninitialized, nz)
    rowval = Vector{Int}(uninitialized, nz)
    colptr[1] = 1
    cnt = 1
    @inbounds for j in 1:size(A,2)
        for i in 1:size(A, 1)
            v = A[i, j]
            if f(i, j, v)
                rowval[cnt] = i
                nzval[cnt] = v
                cnt += 1
            end
        end
        colptr[j+1] = cnt
    end
    return SparseMatrixCSC(size(A, 1), size(A, 2), colptr, rowval, nzval)
end

Should be much better (traverses the matrix in the correct order)

Great use of count! Since we’re using the filter f, we need to modify count to

using Base.Cartesian
nz = 0
@nloops 2 i A begin
   nz += f(i_1,i_2,A[i_1,i_2])
end

Oh yeah, good point.