Fast computation of the Gramian of a matrix

For some model identification problems, I have to compute the Gramian of data matrices, i.e. compute A = X'*X where X is a complex rectangular matrix.

Is there a fast implementation of this specific operation in Julia, given that the output matrix is Hermitian?

A first rough assessment suggests this could halve both the computation time and the memory requirements. It could also reduce the numerical error propagation that otherwise leads to a non-Hermitian numerical output.

Try searching for “kernel matrices julia” and you may find something useful. For example, I found KernelMatrices.jl by @cgeoga

You may also find something if you search for “quadratic form julia”, I don’t know.


Some quick googling shows that the Julia benevolent overlords already had you covered almost 10 years ago:

using BenchmarkTools
a = rand(100,100); b = rand(100,100);

@benchmark $a' * $a
# median time:      110.190 μs (0.00% GC)

@benchmark $b' * $a
# median time:      836.717 μs (0.00% GC)
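To spell out what those benchmarks show for the complex/Hermitian case: `X' * X` already dispatches to the rank-k update BLAS routine (`herk` for complex inputs), which computes only one triangle. You can also call the BLAS wrapper directly and tag the result as `Hermitian` so that downstream factorizations exploit the structure. A minimal sketch (the matrix sizes are just for illustration):

```julia
using LinearAlgebra

X = rand(ComplexF64, 200, 50)

# X' * X already hits the Hermitian rank-k update (zherk) under the hood,
# but the result comes back as a plain dense matrix.
A = X' * X

# Calling herk directly fills only the requested triangle; wrapping the
# result in Hermitian tells later factorizations to read that triangle only.
Ah = Hermitian(BLAS.herk('U', 'C', 1.0, X), :U)

A ≈ Ah  # same Gramian, structure made explicit
```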

Just a remark: it is often advantageous from a numerical viewpoint to use algorithms that do not require forming the Gramian and instead work with its square-root (Cholesky) factor.
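A concrete illustration of that idea: if X = QR is a thin QR factorization, then X'*X = R'*Q'*Q*R = R'*R, so the triangular factor R already plays the role of a Cholesky factor of the Gramian and is obtained without forming X'*X, avoiding the squaring of the condition number. A quick sketch:

```julia
using LinearAlgebra

X = rand(ComplexF64, 200, 50)

# R from the thin QR factorization satisfies R' * R == X' * X
# (Q has orthonormal columns), so it is a square-root factor of the Gramian.
R = qr(X).R

R' * R ≈ X' * X
```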

@zdenek_hurak could you expand on your comment? What type of algorithms are you referring to?

@FMeirinhos thanks, sometimes it’s just a matter of keywords!

I meant something along the lines of the square-root balancing used for model order reduction (https://doi.org/10.1016/S1474-6670(17)49168-3 or https://doi.org/10.1109/CDC.1991.261486, and certainly described elsewhere). Perhaps it is relevant for your problem, maybe not. Just in case, the author has been actively developing the MatrixEquations.jl package. Have a look at it; perhaps you may find something relevant there.

Typically you should look at the minimum time rather than the median, since timing noise is always positive. (I usually use @btime, which only reports the minimum.) When you do that, the conclusion is the opposite on my computer:

julia> @btime $a' * $a;
  49.545 μs (2 allocations: 78.20 KiB)

julia> @btime $b' * $a;
  30.643 μs (2 allocations: 78.20 KiB)

Similarly for a larger matrix:

julia> @btime $A' * $A;
  342.402 μs (2 allocations: 78.20 KiB)

julia> @btime $B' * $A;
  246.269 μs (2 allocations: 78.20 KiB)

It looks like the culprit is threading: it seems that dgemm is better parallelized than dsyrk in OpenBLAS, probably because the computation in dgemm is more regular. If I change to single-threaded BLAS, then dsyrk is faster in both cases:

julia> LinearAlgebra.BLAS.set_num_threads(1)

julia> @btime $A' * $A;
  487.042 μs (2 allocations: 78.20 KiB)

julia> @btime $B' * $A;
  602.739 μs (2 allocations: 78.20 KiB)

julia> @btime $a' * $a;
  58.816 μs (2 allocations: 78.20 KiB)

julia> @btime $b' * $a;
  61.733 μs (2 allocations: 78.20 KiB)

Thanks, I am not sure I fully understand the whole process yet, but it may be useful later.


To step back a little bit: do you actually need to compute that matrix at all? If you just need a few entries, or to apply it to vectors, something like this pseudo-Julia might be helpful:

using LinearAlgebra: dot

struct GramMatrix{T}
  X::Matrix{T} # Or perhaps a QR factorization of X?
end

# Entry (j, k) is the small inner product of the j-th and k-th column slices
Base.getindex(G::GramMatrix{T}, j::Int, k::Int) where{T} = dot(view(G.X, :, j), view(G.X, :, k))
# Manual two-part matrix-vector multiplication, never forming X'*X
Base.:*(G::GramMatrix{T}, x::Vector{T}) where{T} = G.X' * (G.X * x)
Base.:\(G::GramMatrix{T}, x::Vector{T}) where{T} = ... # perhaps using IterativeSolvers.jl with your fast matvec

# ....

Of course, if you need the whole matrix then this won’t be enough, but particularly for very rectangular X you could potentially save a lot with just a few lines of code like this, provided you only need something simple.

Sorry for the (very) late reply!

Actually, I am computing some reduced normal equations from covariance and cross-covariance matrices.
So I have CovX = X'*X, CrcovXY = X'*Y, and CovY = Y'*Y matrices, which are then assembled as
Z = CovY - CrcovXY'*(CovX\CrcovXY).

Of course, this gets me into a lot of numerical trouble, and there seems to be significant error propagation.
For instance, although Z should theoretically be Hermitian, numerically it is not.
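If the asymmetry is only at rounding-error level, a common workaround is to symmetrize explicitly by averaging with the adjoint and then tagging the result as `Hermitian`. A sketch assuming the Schur-complement form of the reduced normal equations (the sizes here are arbitrary):

```julia
using LinearAlgebra

X = rand(ComplexF64, 100, 8); Y = rand(ComplexF64, 100, 4)
CovX = X' * X; CrcovXY = X' * Y; CovY = Y' * Y

# In exact arithmetic this Schur complement is Hermitian; rounding can break that.
Z = CovY - CrcovXY' * (CovX \ CrcovXY)

# Averaging with the adjoint removes the rounding-level asymmetry (the diagonal
# becomes exactly real), and the Hermitian tag lets factorizations such as
# cholesky rely on the structure.
Zh = Hermitian((Z + Z') / 2)

ishermitian(Zh)  # true by construction
```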

I read the articles you mentioned more thoroughly. It seems that the available algorithms for computing the Cholesky factor of the Gramian require a realization of a system (whether in continuous or discrete time). Because of that, I do not yet see how to take advantage of this (otherwise promising) approach.