Memory Allocation when using mul! with sparse arrays and views

I am writing code which performs a five argument mul! operation (from LinearAlgebra) inside a for loop where performance is critical.

When using a sparse array instead of a matrix, the operation allocates memory:

julia> using SparseArrays, LinearAlgebra

julia> A = rand(25,25); B = rand(25); C = rand(25);

julia> A_sparse = SparseMatrixCSC(A);

julia> @time mul!(C, A, B, 1, 1);
  0.137919 seconds (236.09 k allocations: 16.206 MiB, 32.65% gc time, 99.98% compilation time)

julia> @time mul!(C, A, B, 1, 1);
  0.000006 seconds

julia> @time mul!(C, A_sparse, B, 1, 1);
  0.514963 seconds (1.77 M allocations: 121.304 MiB, 5.29% gc time, 99.98% compilation time)

julia> @time mul!(C, A_sparse, B, 1, 1);
  0.000007 seconds (2 allocations: 80 bytes)

The allocation goes away when the mul! is performed in a loop inside a function:

julia> function f(A, B, C)
         for i in 1:100
           mul!(C, A, B, 1, 1)
         end
       end
f (generic function with 1 method)

julia> @time f(A,B,C);
  0.011613 seconds (3.78 k allocations: 253.445 KiB, 99.70% compilation time)

julia> @time f(A,B,C);
  0.000017 seconds

julia> @time f(A_sparse,B,C);
  0.004931 seconds (6.90 k allocations: 486.984 KiB, 98.89% compilation time)

julia> @time f(A_sparse,B,C);
  0.000039 seconds

julia> @time f(A_sparse,B,C);
  0.000039 seconds

However, there is allocation when the matrix is sparse and the vectors are views:

julia> Bview = @view B[:];

julia> Cview = @view C[:];

julia> @time f(A_sparse,Bview,Cview);
  0.320725 seconds (1.36 M allocations: 91.655 MiB, 24.13% gc time, 99.98% compilation time)

julia> @time f(A_sparse,Bview,Cview);
  0.000039 seconds (100 allocations: 4.688 KiB)

On the other hand, there is not allocation when the matrix is dense and the vectors are views:

julia> @time f(A,Bview,Cview);
  0.092996 seconds (179.02 k allocations: 12.220 MiB, 99.96% compilation time)

julia> @time f(A,Bview,Cview);
  0.000016 seconds

Could someone explain to me why the matrix being sparse and the vectors being views causes memory allocation, and how I could avoid this?

I know mul! has warning against some of the arguments aliasing the same memory as other arguments, but that is clearly not happening here.

I think this might be the same as

However, note that to benchmark functions properly, you need to use BenchmarkTools.jl and interpolate variables with a dollar sign. Otherwise your allocation count will be biased by the presence of global variables.

1 Like

Related GitHub issue:

Apparently it’s fixed on 1.11 but I haven’t checked

2 Likes

I think you are right that that is the issue. My call stack looks very similar to the one they show for 1.10. I’ll roll back to 1.9 and see if that fixes is (I would rather not move forward to a version in pre-alpha to fix a bug)

One thing I discovered today which might help you. MKLSparse.jl removed the allocations for me. Also it provided a close to 2x speed up for my usecase.