Is there a lazy sparse matrix constructor?

djsegal · March 3, 2019, 8:40am

Constructing a sparse matrix via the following is slow:

n = 200
m = 250

A = spzeros(n,m)

for i in 1:n
  for j in 1:m
    A[i,j] = rand()
  end
end

The way to speed this up is to use the following syntax:

cur_I = Int[]
cur_J = Int[]
cur_V = Number[]

for i in 1:n
  append!(cur_I, repeat([i], m))
  for j in 1:m
    push!(cur_J, j)
    push!(cur_V, rand())
  end
end

A = sparse(cur_I, cur_J, cur_V, n, m)

This begs the question, why isn’t there a constructor that can lazily put off construction of a sparse matrix until all the elements are ready?

The pseudocode for the result would be something like:

A = spzeros_lazy(n,m)

for i in 1:n
  for j in 1:m
    A[i,j] = rand()
  end
end

lazy_sparse_constructor!(A)

// the goal is not to be the most performant, just performant && easy to code

mohamed82008 · March 3, 2019, 8:56am

Simple enough to code:

using SparseArrays

struct LazySparse{T, TV <: AbstractVector{T}, TI <: Integer, TVI <: AbstractVector{TI}}
	I::TVI
	J::TVI
	V::TV
	m::TI
	n::TI
end
LazySparse{T}(m::TI, n::TI) where {T, TI} = LazySparse(TI[], TI[], T[], m, n)

function Base.setindex!(a::LazySparse, v, i, j)
	push!(a.I, i)
	push!(a.J, j)
	push!(a.V, v)
	return v
end
SparseArrays.sparse(a::LazySparse) = sparse(a.I, a.J, a.V, a.m, a.n)

Example:

m = n = 2
A = LazySparse{Float64}(m, n)

for i in 1:m
  for j in 1:n
    A[i,j] = rand()
  end
end

sparse(A)

djsegal · March 3, 2019, 8:05pm

I guess now the question is can you make it smart?

By this, I mean is there a way to remove the explicit call to sparse()?

This would require:

enforcing any non-setindex! call to first call sparse and cache it
passing all calls to the cached matrix after the initial non-setindex! call

edit: another step could be to clear the cache every time a person calls setindex! (but that maybe shows why it’s not a good idea to make it smart)

mohamed82008 · March 3, 2019, 9:54pm

You can just define the functions that you want to have materialize the sparse array which then call the Base/LinearAlgebra versions after a call to sparse. Metaprogramming can make life easier here. The second task can be achieved by memoizing the output of sparse(::LazySparse). There is Memoize.jl or you can use Cassette for that. Would be interesting to see if there are any undesirable side effects to this proposal.

Topic		Replies	Views
Sparse matrix 700x slower than full New to Julia sparse	7	1069	February 24, 2021
Fastest way to fill a Sparse Matrix? Performance question	3	1814	April 21, 2022
Huge sparse array construction General Usage sparse	9	869	April 12, 2020
Efficient Initialization of huge sparse arrays New to Julia	6	3166	September 15, 2019
Convenience type for assembling sparse matrices Internals & Design	5	866	February 21, 2023

Is there a lazy sparse matrix constructor?

Related topics