Sparse matrix 700x slower than full

Ribeiro · February 23, 2021, 9:30pm

Hello,
I have a big matrix (20000x20000, Float64) which I build in a for loop. The for loop fills that matrix 2 non-consecutive columns at a time (e.g., in step 5 it may fill columns 3 and 5000, from row 1 to 20000). This is obviously a nightmare from a memory access point of view.
If I use a regular matrix, this operation takes less than a second, but uses up a ton of RAM. Using a sparse matrix (initialized with spzeros(20000,20000) I have great RAM reduction (only 0.3% of the matrix is non-zero, for a certain case), but the function takes 8 minutes!
Is this normal, or am I doing something very wrong? Maybe the nature of how I fill the matrix is the to blame. Not sure I can change that too easily.
Thanks a lot!

cgeoga · February 23, 2021, 9:38pm

That’s not particularly surprising. I’m not the most qualified person to explain this, but I think the way that your loop filling the dense matrix accesses memory is probably much more efficient.

If you know which entries of the matrix will be nonzero, however, there are more advanced methods for constructing the sparse matrix than a double loop, if that’s what you’re doing right now. Maybe you could share some code for people to look at?

Henrique_Becker · February 23, 2021, 9:52pm

What happens if you build the matrix in a non-sparse way, then call SparseArrays.sparse over it to build the sparse version, and throw away the non-sparse one (simply do not return it from the scope where it is created, or set the binding to nothing)?

rdeits · February 23, 2021, 10:18pm

Setting coefficients of a sparse matrix is slow because it generally requires moving around all of the data after whatever coefficient you’ve set, which can be quite expensive. That’s why it’s generally recommended to build up a vectors of row indices, column indices, and values, and then call sparse(I, J, V) to build the sparse matrix all at once, ref: Sparse Arrays · The Julia Language

Ribeiro · February 24, 2021, 6:04pm

Thanks, guys. This is all useful.
I will try @Henrique_Becker 's idea (though I would have a RAM spike I’d like to avoid) and I think implementing @rdeits ’ idea would be easy enough (it will have a lot of push! calls, which is probably not super fast, but whatever). Can the ‘I,J,V’ arrays be in any order? Sorting would cost some time, but if I’m going to do this, I should do it in a way that gets me the best performance possible. I’d guess row sorted.
Thanks again!

rdeits · February 24, 2021, 6:23pm

You should test that assumption. push! is amortized O(1) and should be quite efficient, although it does have some cost.

They can be in any order and can contain duplicates. The sparse() function will take care of all of this for you, including merging duplicate entries (adding them together by default). The whole point of the I, J, V constructor is that it allows sparse() to sort your entries and then populate the final matrix once without having to reorganize all its data.

stillyslalom · February 24, 2021, 7:50pm

You can also use sizehint! if you have a rough idea of how many elements you’ll need.

function f(n)
    out = Int[]
    for i = 1:n
        push!(out, i)
    end
    out
end

function g(n)
    out = Int[]
    sizehint!(out, n)
    for i = 1:n
        push!(out, i)
    end
    out
end

julia> @btime f(10^8);
  1.150 s (29 allocations: 888.58 MiB)

julia> @btime g(10^8);
  641.243 ms (2 allocations: 762.94 MiB)

Ribeiro · February 24, 2021, 9:12pm

Thanks a lot, guys. This is great info.
Great to know that sparse adds duplicates.

Topic		Replies	Views
Fastest way to fill a Sparse Matrix? Performance question	3	1814	April 21, 2022
Huge sparse array construction General Usage sparse	9	869	April 12, 2020
Efficient Initialization of huge sparse arrays New to Julia	6	3166	September 15, 2019
How to speed up creating a sparse matrix? Performance	2	536	May 5, 2020
How to build this sparsematrix in Julia General Usage question , linearalgebra , sparse	43	1458	July 30, 2022

Sparse matrix 700x slower than full

Related topics