# Is there a lazy sparse matrix constructor?

Constructing a sparse matrix via the following is slow:

```julia
using SparseArrays

n = 200
m = 250

A = spzeros(n, m)

for i in 1:n
    for j in 1:m
        A[i, j] = rand()
    end
end
```
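The slowdown comes from the storage format: `spzeros` returns a matrix in compressed sparse column (CSC) form, so each `A[i, j] = …` that adds a new entry may have to shift the entries already stored. You can observe the cost directly (timings vary by machine):

```julia
using SparseArrays

n, m = 200, 250
A = spzeros(n, m)

# Each new entry is inserted into the compressed (CSC) storage,
# which may shift everything stored after it.
@time for i in 1:n, j in 1:m
    A[i, j] = rand()
end
```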

One way to speed this up is to collect the entries first and construct the matrix in a single call:

```julia
cur_I = Int[]
cur_J = Int[]
cur_V = Float64[]  # a concrete element type is much faster than Number[]

for i in 1:n
    append!(cur_I, repeat([i], m))
    for j in 1:m
        push!(cur_J, j)
        push!(cur_V, rand())
    end
end

A = sparse(cur_I, cur_J, cur_V, n, m)
```
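Since the total number of entries is known in advance here, a small variation (not from the thread) is to preallocate the three vectors, which avoids growing them repeatedly:

```julia
using SparseArrays

n, m = 200, 250
cur_I = Vector{Int}(undef, n * m)
cur_J = Vector{Int}(undef, n * m)
cur_V = Vector{Float64}(undef, n * m)

for i in 1:n, j in 1:m
    k = (i - 1) * m + j   # linear position of entry (i, j)
    cur_I[k] = i
    cur_J[k] = j
    cur_V[k] = rand()
end

A = sparse(cur_I, cur_J, cur_V, n, m)
```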

This raises the question: why isn't there a constructor that lazily defers construction of a sparse matrix until all the elements are ready?

The pseudocode for the result would be something like:

```julia
A = spzeros_lazy(n, m)

for i in 1:n
    for j in 1:m
        A[i, j] = rand()
    end
end

lazy_sparse_constructor!(A)
```

(The goal is not to be the most performant, just reasonably performant and easy to code.)


Simple enough to code:

```julia
using SparseArrays

struct LazySparse{T, TV <: AbstractVector{T}, TI <: Integer, TVI <: AbstractVector{TI}}
    I::TVI
    J::TVI
    V::TV
    m::TI
    n::TI
end
LazySparse{T}(m::TI, n::TI) where {T, TI} = LazySparse(TI[], TI[], T[], m, n)

function Base.setindex!(a::LazySparse, v, i, j)
    push!(a.I, i)
    push!(a.J, j)
    push!(a.V, v)
    return v
end

SparseArrays.sparse(a::LazySparse) = sparse(a.I, a.J, a.V, a.m, a.n)
```

Example:

```julia
m = n = 2
A = LazySparse{Float64}(m, n)

for i in 1:m
    for j in 1:n
        A[i, j] = rand()
    end
end

sparse(A)
```

I guess the question now is: can you make it smart?

By this, I mean is there a way to remove the explicit call to `sparse()`?

This would require:

1. making any non-`setindex!` call first materialize via `sparse` and cache the result
2. forwarding all subsequent calls to the cached matrix

edit: a further step could be to clear the cache every time `setindex!` is called (though that perhaps shows why making it smart is not a good idea)
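The steps above could be sketched as follows, with a hypothetical `CachingLazySparse` type (not an existing package type): any write invalidates the cache, and any read materializes the sparse matrix and caches it.

```julia
using SparseArrays

# Hypothetical caching wrapper: materializes once, invalidates on writes.
mutable struct CachingLazySparse{T}
    I::Vector{Int}
    J::Vector{Int}
    V::Vector{T}
    m::Int
    n::Int
    cache::Union{Nothing, SparseMatrixCSC{T, Int}}
end
CachingLazySparse{T}(m::Integer, n::Integer) where {T} =
    CachingLazySparse{T}(Int[], Int[], T[], m, n, nothing)

function Base.setindex!(a::CachingLazySparse, v, i, j)
    a.cache = nothing           # any write invalidates the cached matrix
    push!(a.I, i)
    push!(a.J, j)
    push!(a.V, v)
    return v
end

# Materialize (and cache) on first non-setindex! use.
function materialize!(a::CachingLazySparse)
    a.cache === nothing && (a.cache = sparse(a.I, a.J, a.V, a.m, a.n))
    return a.cache
end

Base.getindex(a::CachingLazySparse, i, j) = materialize!(a)[i, j]
Base.:*(a::CachingLazySparse, x::AbstractVector) = materialize!(a) * x
```

Only `getindex` and `*` are forwarded here; in practice each operation you want to support needs a forwarding method like these.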

You can just define the functions that you want to materialize the sparse array, having them call the `Base`/`LinearAlgebra` versions after a call to `sparse`. Metaprogramming can make life easier here. The second task can be achieved by memoizing the output of `sparse(::LazySparse)`; there is Memoize.jl, or you can use Cassette for that. It would be interesting to see whether this proposal has any undesirable side effects.
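A minimal sketch of the metaprogramming idea, assuming the `LazySparse` type from the earlier reply (restated in simplified form so the snippet is self-contained); the `@eval` loop generates one forwarding method per listed function:

```julia
using SparseArrays

# Simplified restatement of LazySparse from the earlier reply.
struct LazySparse{T}
    I::Vector{Int}
    J::Vector{Int}
    V::Vector{T}
    m::Int
    n::Int
end
LazySparse{T}(m, n) where {T} = LazySparse{T}(Int[], Int[], T[], m, n)

function Base.setindex!(a::LazySparse, v, i, j)
    push!(a.I, i)
    push!(a.J, j)
    push!(a.V, v)
    return v
end
SparseArrays.sparse(a::LazySparse) = sparse(a.I, a.J, a.V, a.m, a.n)

# Generate methods that materialize first, then call the Base version.
for f in (:sum, :maximum, :minimum)
    @eval Base.$f(a::LazySparse) = $f(sparse(a))
end
```

Adding memoization on `sparse(::LazySparse)` (e.g. via Memoize.jl) would then give the caching behavior from the list above.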