Efficient Initialization of huge sparse arrays

iamsuddhasattwa · September 12, 2019, 2:28pm

There are two ways to initialize an N×N sparse matrix whose entries are read from one or more text files. Which one is faster? I need the more efficient one, since N is large, typically 10^6.

I could store the (i,j) indices in arrays x and y and the entries in an array v, and declare
K = sparse(x,y,v);
I could declare
K = spzeros(N,N)
then read off the (i,j) coordinates and values v and insert them as
K[i,j]=v;
as they are being read.

I found no tips about this in Julia’s page on sparse arrays.

PetrKryslUCSD · September 12, 2019, 2:32pm

Don’t insert values one by one: that will be tremendously inefficient since the storage in the sparse matrix needs to be reallocated over and over again.
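As a minimal sketch of the first approach from the question (collect the triplets, then call sparse once), assuming each line of the input holds "i j v" separated by whitespace. The helper name read_triplets and the in-memory IOBuffer standing in for a file are hypothetical, chosen only for illustration:

```julia
using SparseArrays

# Hypothetical helper: reads whitespace-separated "i j v" triplets, one per line.
# The line format is an assumption; adapt the parsing to the actual files.
function read_triplets(io::IO)
    rows = Int[]
    cols = Int[]
    vals = Float64[]
    for line in eachline(io)
        isempty(strip(line)) && continue   # skip blank lines
        fields = split(line)
        push!(rows, parse(Int, fields[1]))
        push!(cols, parse(Int, fields[2]))
        push!(vals, parse(Float64, fields[3]))
    end
    return rows, cols, vals
end

N = 4
io = IOBuffer("1 1 2.0\n3 2 -1.5\n4 4 0.5\n")   # stands in for open("file.txt")
rows, cols, vals = read_triplets(io)

# Build the matrix in a single call; passing N, N fixes the size explicitly.
K = sparse(rows, cols, vals, N, N)
```

With several files, the same three vectors can be extended with append! before the single call to sparse. Note that sparse sums entries that share the same (i,j) by default.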

iamsuddhasattwa · September 12, 2019, 2:46pm

Thanks, that is exactly the information I needed.

rdeits · September 12, 2019, 3:01pm

You can also use BenchmarkTools.jl to verify this:

julia> using SparseArrays

julia> using BenchmarkTools

julia> I = rand(1:1000, 1000); J = rand(1:1000, 1000); X = rand(1000);

julia> function fill_spzeros(I, J, X)
         x = spzeros(1000, 1000)
         @assert axes(I) == axes(J) == axes(X)
         @inbounds for i in eachindex(I)
           x[I[i], J[i]] = X[i]
         end
         x
       end
fill_spzeros (generic function with 1 method)

julia> @btime sparse($I, $J, $X);
  10.713 μs (12 allocations: 55.80 KiB)

julia> @btime fill_spzeros($I, $J, $X);
  96.068 μs (22 allocations: 40.83 KiB)

pablosanjose · September 12, 2019, 6:28pm

Just for completeness, if you need to build/update sparse matrices repeatedly, also have a look at the parent method SparseArrays.sparse! (not exported).

EDIT: incidentally, you can even avoid allocating, building and processing your x, y, v arrays if you can generate your matrix nonzeros ordered by column. Then you can build a SparseMatrixCSC directly by generating its internal colptr, rowval and nzval fields efficiently. Not sure if you want to mess with such details, though I think you can gain quite a bit of efficiency this way.
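A rough sketch of that direct construction, assuming the nonzeros can be generated column by column with increasing row indices inside each column. The tridiagonal matrix used here is only an illustrative stand-in for the real data:

```julia
using SparseArrays

# Illustrative assumption: an N×N tridiagonal matrix generated column by column.
N = 5
colptr = Vector{Int}(undef, N + 1)   # colptr[j] is where column j starts in rowval/nzval
rowval = Int[]                       # row index of each stored entry
nzval  = Float64[]                   # value of each stored entry

colptr[1] = 1
for j in 1:N
    # Rows must appear in increasing order within each column.
    for i in max(1, j - 1):min(N, j + 1)
        push!(rowval, i)
        push!(nzval, i == j ? 2.0 : -1.0)
    end
    colptr[j + 1] = length(rowval) + 1   # one past the last entry of column j
end

# Wrap the three vectors directly; no (i, j, v) triplets are sorted or combined.
K = SparseMatrixCSC(N, N, colptr, rowval, nzval)
```

The constructor does not sort or merge anything, so the caller is responsible for keeping row indices sorted and free of duplicates within each column.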

iamsuddhasattwa · September 15, 2019, 9:09pm

Thank you for the painstaking effort!

iamsuddhasattwa · September 15, 2019, 9:19pm

Thanks for the tip!
