Symmetric matrix operations not faster

I was expecting that if a matrix is declared symmetric, some operations, e.g., solving, would be faster. But using the example from the manual:

using LinearAlgebra, BenchmarkTools
B = [1.5 2 -4; 2 -1 -3; -4 -3 5]
sB = Symmetric(B)
x = [1, 2, 3]
@btime B \ x  #=> 307 ns
@btime sB \ x #=> 545 ns

Am I doing something wrong?

It looks to me like lu on symmetric matrices actually does extra work: because it assumes the wrapped array is not symmetric, it first has to copy every element across the diagonal, and at the same time it does not benefit from knowing about the symmetry beyond that.
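A quick way to see the cost of that copy (a sketch, not from the original posts; exact timings will vary by machine):

using LinearAlgebra, BenchmarkTools
B = rand(100, 100); B = B + B'   # symmetric by construction
sB = Symmetric(B)
@btime lu($B);    # copies B, then calls the LAPACK LU
@btime lu($sB);   # also copies, but element-by-element through the wrapper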

Surprisingly, symmetric static arrays do not work at all:

jl> using LinearAlgebra, StaticArrays

jl> B = SA[1.5 2 -4; 2 -1 -3; -4 -3 5];

jl> sB = Symmetric(B);

jl> x = SA[1, 2, 3];

jl> sB \ x
ERROR: setindex!(::SMatrix{3, 3, Float64, 9}, value, ::Int) is not defined.

Unfortunately, it seems like \ will generally fail for immutable matrices unless they have a special implementation. So \ works for SMatrix, but not for Symmetric{..., SMatrix}, etc.
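One workaround, assuming you only need the solve and not the wrapper itself, is to collapse the Symmetric wrapper back into a plain SMatrix, which does have a \ implementation (an untested sketch):

jl> SMatrix{3,3}(sB) \ x   # materialize the wrapper into a plain SMatrix first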


It is hard to tell anything about the speed of solving a linear system from a 3×3 example: the overhead of checking the inputs and dispatching is probably greater than the time spent on the actual computation.

Also, the LU decomposition and solution have been optimized and tuned over many, many years. I think it was the LINPACK benchmark for supercomputers, so it was worthwhile wringing every ounce of speed out of it. Methods for symmetric systems, like the Bunch-Kaufman decomposition or the Cholesky factorization, have not received the same level of intensive optimization.
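For instance, when the symmetric matrix is also positive definite, you can opt into Cholesky explicitly. A minimal sketch (assumes an SPD matrix; timings are machine-dependent):

using LinearAlgebra, BenchmarkTools
A = rand(1000, 20)
S = Symmetric(A'A + I)      # SPD by construction
@btime bunchkaufman($S);    # the factorization \ picks for a generic Symmetric
@btime cholesky($S);        # usable once you know S is positive definite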

This is true even for matrix multiplication. The function BLAS.gemm! performs an in-place general matrix multiplication, whereas BLAS.syrk! takes advantage of symmetry in the evaluation of a product like A'A and performs roughly half the number of operations.

julia> const A = rand(1000, 20);

julia> const C = similar(A, (20, 20));

julia> using BenchmarkTools, LinearAlgebra

julia> @btime BLAS.gemm!('T', 'N', 1.0, A, A, 0.0, C);
  7.541 μs (0 allocations: 0 bytes)

julia> Ccp = copy(C);

julia> @btime BLAS.syrk!('U', 'T', 1.0, A, 0.0, C);
  7.579 μs (0 allocations: 0 bytes)

julia> Ccp ≈ C
true

So doing half the number of operations ends up being a little bit slower, probably because gemm has been optimized more than syrk has.


Yes, but my impression from perusing the code was that backslash actually doesn’t take advantage of the symmetry, but does extra work to copy elements across the diagonal. So it’s actually strictly slower.

This assumes that I found the right method definitions, though…

I poked around with Debugger.jl and found that sB \ x ends up here: https://github.com/JuliaLang/julia/blob/3d224214ee81a5eec39711d02ba225144a31efd9/stdlib/LinearAlgebra/src/symmetric.jl#L641 and does indeed call bunchkaufman rather than lu.
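For reference, the dispatch can also be checked without a debugger, using standard introspection (a sketch):

using LinearAlgebra
sB = Symmetric([1.5 2 -4; 2 -1 -3; -4 -3 5])
@which sB \ [1, 2, 3]   # shows which method handles the solve
factorize(sB)           # returns a BunchKaufman for a general symmetric matrix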


I should have used the debugger.

using LinearAlgebra, BenchmarkTools

# setup only: build the matrix and the right-hand side
f1() = begin
    t = rand(1000, 1000)
    t += 100I   # shift the diagonal so posv! (Cholesky) succeeds
    b = ones(1000)
end
@btime f1();
# 1.111 ms (4 allocations: 15.26 MiB)

# setup plus a Cholesky-based solve; posv! reads only the 'L' triangle,
# so t is treated as if it were symmetric
f2() = begin
    t = rand(1000, 1000)
    t += 100I
    b = ones(1000)
    LAPACK.posv!('L', t, b)
end
@btime f2();
# 4.723 ms (5 allocations: 15.27 MiB)

t = rand(1000, 1000)
@btime t \ ones(1000)
# 6.188 ms (5 allocations: 7.65 MiB)

Subtracting the ~1.1 ms of setup, the posv! solve takes about 3.6 ms versus 6.2 ms for the LU-based \, so about half the time is saved.
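For comparison, the same experiment through the high-level API might look like this (a sketch assuming a genuinely symmetric positive definite matrix; timings will differ):

using LinearAlgebra, BenchmarkTools
t = rand(1000, 1000)
t = t't + 100I                         # genuinely SPD, not just shifted
b = ones(1000)
@btime cholesky(Symmetric($t)) \ $b;   # Cholesky-based solve
@btime lu($t) \ $b;                    # LU-based solve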