I am writing code which performs a five argument mul!
operation (from LinearAlgebra
) inside a for loop where performance is critical.
When using a sparse array instead of a matrix, the operation allocates memory:
julia> using SparseArrays, LinearAlgebra
julia> A = rand(25,25); B = rand(25); C = rand(25);
julia> A_sparse = SparseMatrixCSC(A);
julia> @time mul!(C, A, B, 1, 1);
0.137919 seconds (236.09 k allocations: 16.206 MiB, 32.65% gc time, 99.98% compilation time)
julia> @time mul!(C, A, B, 1, 1);
0.000006 seconds
julia> @time mul!(C, A_sparse, B, 1, 1);
0.514963 seconds (1.77 M allocations: 121.304 MiB, 5.29% gc time, 99.98% compilation time)
julia> @time mul!(C, A_sparse, B, 1, 1);
0.000007 seconds (2 allocations: 80 bytes)
The allocation goes away when the mul!
is performed in a loop inside a function:
julia> function f(A, B, C)
for i in 1:100
mul!(C, A, B, 1, 1)
end
end
f (generic function with 1 method)
julia> @time f(A,B,C);
0.011613 seconds (3.78 k allocations: 253.445 KiB, 99.70% compilation time)
julia> @time f(A,B,C);
0.000017 seconds
julia> @time f(A_sparse,B,C);
0.004931 seconds (6.90 k allocations: 486.984 KiB, 98.89% compilation time)
julia> @time f(A_sparse,B,C);
0.000039 seconds
julia> @time f(A_sparse,B,C);
0.000039 seconds
However, there is allocation when the matrix is sparse and the vectors are views:
julia> Bview = @view B[:];
julia> Cview = @view C[:];
julia> @time f(A_sparse,Bview,Cview);
0.320725 seconds (1.36 M allocations: 91.655 MiB, 24.13% gc time, 99.98% compilation time)
julia> @time f(A_sparse,Bview,Cview);
0.000039 seconds (100 allocations: 4.688 KiB)
On the other hand, there is not allocation when the matrix is dense and the vectors are views:
julia> @time f(A,Bview,Cview);
0.092996 seconds (179.02 k allocations: 12.220 MiB, 99.96% compilation time)
julia> @time f(A,Bview,Cview);
0.000016 seconds
Could someone explain to me why the matrix being sparse and the vectors being views causes memory allocation, and how I could avoid this?
I know mul!
has warning against some of the arguments aliasing the same memory as other arguments, but that is clearly not happening here.