Mul! vs SparseMatrix

ctkelley · December 12, 2020, 2:44pm

I’m trying to use mul! to compute the product of a sparse matrix and a sparse diagonal matrix. My problem is that mul! allocates very litte, but takes a lot more time than a matrix multiply.

Is this what’s supposed to happen?

Herewith an example. The starting matrix L is the discrete Laplacian in two space dimension on a square with zero boundary conditions. The grid has n \times n interior nodes with n=31

julia> typeof(L)
SparseMatrixCSC{Float64,Int64}

julia> u=rand(n*n,);

julia> D=spdiagm(0=>u);

julia> B=copy(L);

julia> @btime $B .= $D*$L;
  34.406 μs (8 allocations: 171.64 KiB)

julia> @btime mul!($B, $D, $L);
  690.153 ms (6 allocations: 336 bytes)

moeddel · December 12, 2020, 10:12pm

The reason is that D*L is computed by the spmatmul method from the SparseArrays package here, whereas mul!(B,D,L) does not invoke the specialized sparse matrix-matrix multiplication, but the generic_matmatmul! from the LinearAlgebra subpackage here which does not exploit sparsity.

There is simply no specialized method mul!(B,D,L) for B, D and L being of type SparseMatrixCSC (spdiagm(0=>u) is also of type SparseMatrixCSC since there is no specialized type for a sparse diagonal matrix in SparseArrays). However, if you use a dense diagonal matrix D=Diagonal(u) (type Diagonal) then you will see what you likely expect.

n=31
L=sprandn(n*n,n*n,0.01)
u=randn(n*n)
D=Diagonal(u)
B=copy(L)

gives

@btime $B .= $D*$L
38.936 μs (5 allocations: 151.47 KiB)

and

@btime mul!($B, $D, $L)
17.935 μs (0 allocations: 0 bytes)

As a side note your minimal working example was not working, since you forgot to provide L, which is likely why it took so long for someone to respond .

ctkelley · December 12, 2020, 10:22pm

Thanks. That did it. FWIW, here’s L

Do

julia> L=Lap2d(63);

where

“”"

Lap2d(n)

returns the negative Laplacian in two space dimensions

on n x n grid.

Unit square, homogeneous Dirichlet BC

"""

function Lap2d(n)

h=1/(n+1)

maindiag=4*ones(n^2,)/(h*h)

mdiag=Pair(0,maindiag);

sxdiag=-ones(n^2-1,)/(h*h);

for iz=n:n:n^2-1

sxdiag[iz]=0.0;

end

upxdiag=Pair(1,sxdiag);

lowxdiag=Pair(-1,sxdiag);

sydiag=-ones(n^2-n,)/(h*h);

upydiag=Pair(n,sydiag);

lowydiag=Pair(-n,sydiag);

L2d=spdiagm(lowydiag, lowxdiag, mdiag, upxdiag, upydiag);

return L2d

end