Fastest way to fill a Sparse Matrix?

Hello guys, I am wondering which should be the right strategy to fill values in a big sparse matrix.

I have a variable L, which is an n x m matrix, and is sparse. An outer loop runs on the row index, and just a few values of the row are non zero. The non zero values are contained in a vector d, which changes size in every iterations. The pseudo code would be something like this:

L=zeros(n,m)
for i in 1:m
    for j in 1:n
         L[i,j]=d[j]
    end
end

Only 5% of the values are different from zero. If I initialize the matrix L as a Sparse Array (using spzeros), the code becomes slow (I found in another post that the best way to operate with sparse arrays is by just using the Is, Js, ans Vs vectors and build the matrix using the function sparse).

If a try to build a vector, only with the values of d (to then build the matrix L using the function sparse), since the size of d is unknown in each iteration, I have to initialize d as an empty vector that changes size in each loop, making it run slow once again.

For this particular case I can’t find a way to take advantage of the sparse arrays to make the code more efficient.

Any thoughts? Thanks

This is how I understand it. I tried to translate your problem to use sparse

using BenchmarkTools
using SparseArrays

function test(ds)
    is = Int[]
    js = Int[]
    vs = Float64[]
    for j = 1:size(ds, 2)
        d = ds[:, j]
        dis, dvs = findnz(d)
        for (i, v) in zip(dis, dvs)
            push!(is, i)
            push!(js, j)
            push!(vs, v)
        end
    end
    L = sparse(is, js, vs, size(ds, 1), size(ds, 2))
    @assert L == ds
    L
end

@btime test(ds) setup=(ds=sprand(10000, 10000, 0.05))

Does this help?

Yes, It helped a lot! Thank you so much! the key was in the function:

push!

I was using

vcat & hcat

Which seems to run much more slowly. Thank again for your support.

Yep, cat is used to concatenate Arrays and does more work. But be careful with the order of loops and accesses to your sparse matrix, because it is in (C)ompressed(S)parse(C)olumn format.

1 Like