# Huge sparse array construction

Hello,
I have a huge sparse matrix (1 million × 1 million) that I am trying to construct efficiently. I am using the following code:

```julia
D_matrix = sparse(I, J, V)
dropzeros!(D_matrix, trim = true)
```

Here `I`, `J`, and `V` are very long vectors. This call uses the following resources:

```
  7.964829 seconds (56 allocations: 6.505 GiB, 19.91% gc time)
```

I was wondering if there is any way to reduce the memory consumption here. I thought of something like splitting the vectors in two, assigning the first half, freeing it, calling `GC.gc()`, then assigning the other half. Is there any other way to save some memory here?

How many entries does your sparse matrix have? Is there any reason to think that Julia is using much more memory than the space required to actually store your sparse matrix?


Is `V` full of zeros? If so, delete them first.


Not really. I tried that, no big difference.

OK, then how long are `I`, `J`, and `V`?

```julia
length(V) = 217778220
length(I) = 217778220
length(J) = 217778220
```

If you can generate the non-zero entries of the sparse matrix by columns you can easily beat the `sparse` method by constructing the internal fields of `SparseMatrixCSC` directly. I do that often, but be advised that such an approach is not guaranteed to remain valid if somebody refactors the `SparseMatrixCSC` implementation at some point.
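For illustration, here is a minimal sketch of that idea. Hedged: `colptr`, `rowval`, and `nzval` are internal fields of `SparseMatrixCSC`, and this layout is not a guaranteed public API.

```julia
using SparseArrays

# Build a 3×3 matrix with entries (1,1)=>1.0, (3,1)=>2.0, (2,2)=>3.0
# directly in CSC form, skipping the (I, J, V) triplet stage entirely.
m, n = 3, 3
colptr = [1, 3, 4, 4]      # column j's entries live at colptr[j]:colptr[j+1]-1
rowval = [1, 3, 2]         # row index of each stored entry, sorted within each column
nzval  = [1.0, 2.0, 3.0]   # the stored values
A = SparseMatrixCSC(m, n, colptr, rowval, nzval)
```

You are responsible for the CSC invariants that `sparse` would otherwise enforce: `colptr` has length `n + 1`, row indices are sorted and unique within each column, and `length(rowval) == length(nzval) == colptr[end] - 1`.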


If you have 217778220 nonzero entries, then just storing the values (as 64-bit floating-point numbers) and indices (as 64-bit integers) requires

```julia
3 * 217778220 * 8 / 2^30  # ≈ 4.87
```

GiB, i.e. about 4.87 GiB. So, the memory usage of 6.5 GiB is not that surprising (that is, it indicates a 34% overhead of allocated memory in the process of constructing the sparse-matrix data structure, which is not particularly large). You could save a bit of memory (33%) by using 32-bit integers for the indices, but fundamentally you will need several GiB for this matrix (unless there is some other structure you can exploit besides sparsity).
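For instance, `sparse` accepts any integer eltype for the index vectors, so converting `I` and `J` to `Int32` (safe here, since 10^6 is far below `typemax(Int32)`) yields a matrix whose index arrays are half the size. A small sketch with stand-in triplet data:

```julia
using SparseArrays

# Stand-in data; in the real problem these vectors have ~2.2e8 entries.
I = [1, 2, 1_000_000]
J = [1, 2, 1_000_000]
V = [1.0, 2.0, 3.0]

# Int32 indices halve the memory for the two index arrays.
A = sparse(Int32.(I), Int32.(J), V, 1_000_000, 1_000_000)
```

The result is a `SparseMatrixCSC{Float64, Int32}`, so the savings persist in the stored matrix, not just during construction.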

As @pablosanjose suggests, if you generate the CSC sparse format directly instead of `(I, J, V)` data, then you can avoid allocating the sparse data twice, saving a factor of 2 in memory (or a little more, since the overhead goes away), and it will be faster.


Could you show an example of how?
If it is so useful, maybe it should become official (at least to keep the option available, even if the technique changes over time).

By the way, what about representing `I` and `J` in a more elegant way? Some kind of iterator or concatenation of ranges, since they usually have some special form.
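If the pattern is regular enough (e.g. banded), you can sometimes avoid materializing `I` and `J` altogether; a sketch using `spdiagm` from SparseArrays for a tridiagonal pattern:

```julia
using SparseArrays

n = 1_000
main = ones(n)            # main diagonal values
off  = fill(0.5, n - 1)   # off-diagonal values

# Tridiagonal sparse matrix built from diagonals; no explicit I and J vectors.
A = spdiagm(0 => main, 1 => off, -1 => off)
```

For fully general patterns, though, `sparse` still needs concrete index vectors, so this only helps when the structure maps onto diagonals or similar building blocks.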


I have some iterators in my (WIP) Quantica.jl package that I use for such a thing (in src/iterators.jl). The idea is to use a builder: construct `s = SparseMatrixBuilder{T}()`, push an element `val::T` to a given `row` with subsequent calls to `pushtocolumn!(s, row, val)`, and when a column is done, jump to the next column with `finalizecolumn!(s)`. At the end you get your sparse matrix with `sparse(s)`. Internally, this just builds the `SparseMatrixCSC` struct incrementally. A small, slightly silly example:

```julia
julia> using Quantica, SparseArrays

julia> nnonzeros = 217778220; n = 10^6; V = rand(nnonzeros);

julia> function growmatrix(V::Vector{T}, nrows) where {T}
           s = Quantica.SparseMatrixBuilder{T}()
           row = 0
           Δrow = nrows ÷ 10
           for val in V
               row + Δrow > nrows && Quantica.finalizecolumn!(s)
               row = rem(row + Δrow - 1, nrows) + 1
               Quantica.pushtocolumn!(s, row, val)
           end
           return sparse(s)
       end

julia> @time m = growmatrix(V, n);
 22.052439 seconds (68.88 k allocations: 3.637 GiB, 2.92% gc time)
```

This particular example appears quite a bit slower than your direct `sparse` call, but it uses around half the memory. I'm not sure how useful this kind of approach is for the OP's case.

EDIT: the iterator above has itself some overhead because it sorts each column after it is pushed. If the input data is already column-sorted, that overhead could be reduced.
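If the triplets are not already in column order, one way to sort them column-major beforehand is a single `sortperm` pass (a sketch; note this allocates a permutation vector, so it only pays off if per-column sorting in the builder is the bottleneck):

```julia
# Stand-in triplet data, out of column order.
I = [2, 1, 3, 1]
J = [2, 1, 1, 2]
V = [4.0, 1.0, 2.0, 3.0]

# Sort triplets column-major: by column first, then by row within a column.
# Tuples compare lexicographically, so (J, I) pairs give the CSC visit order.
p = sortperm(collect(zip(J, I)))
Is, Js, Vs = I[p], J[p], V[p]
```

After this, `Js` is non-decreasing and `Is` is sorted within each column, which is exactly the order a CSC builder wants to consume.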
