Exploring Quadratic forms with Julia

Hi friends,
I have published a blog post on Exploring Quadratic forms with Julia
Just tried to make my understanding of quadratic forms clearer using Julia.
Please share your feedback on improvements, more efficient code, etc.


A few comments:
For convenience, I copy and paste the code from Forem here:

using LinearAlgebra
function getDefiniteness(A)

    @assert isequal(A,A') "Matrix is not symmetric"

    if true == isposdef(A)
        return "Matrix is positive definite"
    end

    x,y = eigmin(A),eigmax(A)

    if iszero(x)
        return "Matrix is positive semi definite"
    end

    if y > 0.0 && x < 0.0
        return "Matrix is indefinite"
    end

    if iszero(y)
        return "Matrix is negative semi definite"
    end

    if all([x, y] .< 0)
        return "Matrix is negative definite"
    end

end

A = rand(5,5)
A = (A+A')/2

A = [-1 0;0 -1.0]
getDefiniteness(A)

First, you do not have to write

if true == isposdef(A)

Instead, you can write directly

if isposdef(A)

Second, since you compute eigenvalues for your matrix in order to distinguish between definiteness and indefiniteness, you can skip the isposdef() computation altogether to save some computational effort (should that be an issue in a particular situation). Once you have the eigenvalues, you can use them to check indefiniteness too.
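For illustration, here is one hedged sketch of that idea, classifying everything from a single eigvals call (classify_by_eigs is a hypothetical helper name, and the matrix is assumed real symmetric; the exact-zero comparisons keep the roundoff caveat discussed further down the thread):

```julia
using LinearAlgebra

# Sketch: compute all eigenvalues once and classify from them.
function classify_by_eigs(A)
    λ = eigvals(Symmetric(A))       # single eigenvalue computation
    λmin, λmax = first(λ), last(λ)  # eigvals returns sorted values for Symmetric
    λmin > 0 && return "Matrix is positive definite"
    λmax < 0 && return "Matrix is negative definite"
    λmin < 0 < λmax && return "Matrix is indefinite"
    return λmin == 0 ? "Matrix is positive semi definite" :
                       "Matrix is negative semi definite"
end
```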

Third, in finite-precision arithmetic, it rarely makes sense to check whether something is exactly zero, as you do with

if iszero(x)

Consider the following example

A = [1e-100 0;0 1.0]

The matrix is classified as positive definite by your function

julia> getDefiniteness(A)
"Matrix is positive definite"

while it is effectively/practically singular (positive semidefinite) for all intents and purposes.
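One hedged way around this is to compare eigenvalues against a tolerance scaled by the largest-magnitude eigenvalue instead of testing iszero(). A small sketch (isnumzero is a hypothetical helper, and sqrt(eps()) is just one common heuristic choice of tolerance, not a universal rule):

```julia
using LinearAlgebra

A = [1e-100 0; 0 1.0]
λ = eigvals(Symmetric(A))          # sorted ascending: [1e-100, 1.0]
scale = maximum(abs, λ)
tol = sqrt(eps(Float64)) * scale   # heuristic relative tolerance
isnumzero(x) = abs(x) <= tol       # "numerically zero" at this scale

isnumzero(λ[1])   # true: 1e-100 counts as zero relative to 1.0
isnumzero(λ[2])   # false
```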

Finally, although I am not familiar with the implementation details of eigmin, measuring the execution times and seeing that they are about the same as those for eigvals makes me suspect that all eigenvalues are computed even if you only request the smallest one. I conclude that instead of calling eigmin(A) and then eigmax(A), it is perhaps more efficient to call eigvals just once…


You can just use ishermitian(A). And after that I would set H = Hermitian(A) and use H instead of A (so that Julia will know to use more efficient algorithms).

And normally you would not use @assert for this — you should assume that @assert is something that might be removed in production code, and it should only be used to check for bugs. Throw an exception instead, e.g.

ishermitian(A) || throw(ArgumentError("matrix must be Hermitian"))

It is more idiomatic to simply do if isposdef(A) (or isposdef(H) if you use the Hermitian wrapper).

This allocates two arrays ([x, y] is one array, and [x, y] .< 0 is another array), just to check if two numbers are negative. Putting everything into an array, even if it is just two scalars, is kind of a Matlab anti-pattern. Just check if x < 0 && y < 0.

I would do the opposite: if you want to check whether a matrix is negative definite, it’s probably more efficient to call isposdef(-A) or isposdef(-H) than to look at eigenvalues. The reason is that you can check positive-definiteness using only a Cholesky factorization, which is much cheaper than computing eigenvalues.

For example, with a 1000×1000 negative-definite matrix:

julia> A = randn(1000,1000); A = -A'A; H = Hermitian(A);

julia> @btime isposdef($H) || isposdef(-$H);
  5.998 ms (6 allocations: 22.89 MiB)

julia> @btime eigvals($H);
  42.406 ms (11 allocations: 7.99 MiB)

julia> @btime (eigmin($H), eigmax($H));
  67.561 ms (22 allocations: 15.96 MiB)

julia> @btime eigmin($H);
  31.952 ms (11 allocations: 7.98 MiB)

Notice that checking both isposdef(H) and isposdef(-H) is 7x faster than computing eigvals(H).
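A minimal sketch of this Cholesky-only approach (definiteness_cholesky is a hypothetical name; strict definiteness only, with everything else treated as indefinite):

```julia
using LinearAlgebra

# Classify strict definiteness with two Cholesky attempts and no
# eigenvalue computation at all.
function definiteness_cholesky(H::Hermitian)
    isposdef(H)  && return "positive definite"
    isposdef(-H) && return "negative definite"
    return "indefinite"   # semidefinite cases can't be told apart reliably
end
```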

No (at least not for a Hermitian matrix). Notice that for my example above, eigmin(H) is about 25% faster than eigvals(H). However, it is still slower to call both eigmin and eigmax, since they don’t share any computations.

In principle, it should be possible to compute both eigmin and eigmax more efficiently than computing all the eigenvalues, but you need to share the Hessenberg/tridiagonal reduction calculation (since that is the first step of any eigenvalue calculation, and is a big part of the cost):

function eigminmax(H::Hermitian)
    T = hessenberg(H).H # real SymTridiagonal reduction
    return eigmin(T), eigmax(T)
end

which gives

julia> @btime eigminmax($H);
  32.574 ms (65 allocations: 8.27 MiB)

which is almost twice as fast as calling (eigmin(H), eigmax(H)), about the same speed as eigmin(H), and 25% faster than eigvals(H). However, it is still much slower than isposdef(H) (which computes no eigenvalues at all).


I perfectly agree that the Cholesky-decomposition-based isposdef(H) is the way to go for checking positive definiteness of H (and negative definiteness of -H). But in the OP’s code from Forem that I pasted here, it is not only positive definiteness and negative definiteness that must be determined, but also positive semidefiniteness, negative semidefiniteness and indefiniteness. I think this imposes the need to compute eigenvalues, doesn’t it? I do not know how to distinguish between positive semidefiniteness and indefiniteness when detecting the failure to compute the Cholesky decomposition. My recommendation was then simply: once you are forced to compute the eigenvalues, you can use them to check for pd-ness and nd-ness too.


You don’t need eigenvalues. If it’s not positive-definite or negative-definite, you should assume it’s indefinite.

You can’t check for semi-definiteness numerically, because roundoff errors usually make it impossible to distinguish a zero eigenvalue from one that is slightly positive or slightly negative.
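To see why, consider a matrix that is exactly rank-deficient in exact arithmetic but whose computed smallest eigenvalue is generally not exactly zero (a small sketch; the exact value of the residual is machine-dependent):

```julia
using LinearAlgebra

# The sum of two rank-1 outer products gives a 3×3 matrix of rank at most 2,
# so one eigenvalue is exactly zero in theory.
x = [1.0, 2.0, 3.0]; y = [4.0, 5.0, 6.0]
G = x*x' + y*y'
λmin = eigmin(Symmetric(G))
# In floating point, λmin typically comes out as a tiny nonzero number
# (on the order of ±1e-15), so an exact iszero(λmin) test is unreliable.
```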


In addition, there are some pathological cases where ishermitian is much faster than an explicit H == H' comparison, e.g.

julia> using LinearAlgebra, SparseArrays, BenchmarkTools

julia> let A = sprandn(1000, 1000, 0.01);
           H = A + A'
           @btime ishermitian($H)
           @btime $H == $H'
       end;
  24.727 μs (1 allocation: 8.00 KiB)
  520.155 μs (0 allocations: 0 bytes)

I’ve been bitten by this before with really large, very sparse matrices, where I ended up spending the majority of my computation time on what I thought was a very cheap verification that my array was Hermitian.

julia> let A = sprandn(100000, 100000, 0.001);
           H = A + A'
           @btime ishermitian($H)
           @btime $H == $H'
       end;
  97.176 ms (2 allocations: 781.36 KiB)
  1.974 s (0 allocations: 0 bytes)

I understand the argument, but isn’t it the same situation as in determining the rank of a matrix? Say, we want to determine whether a matrix is singular or not. Strictly speaking, we would (effectively) never be able to detect a singular matrix numerically, would we?

For example, is this matrix

julia> H = [-1e-20 0; 0 1]
2×2 Matrix{Float64}:
 -1.0e-20  0.0
  0.0      1.0

singular?

julia> rank(H)
1

For practical purposes, shall we regard it as indefinite or positive semi-definite?

julia> @btime a,b = eigmin(A),eigmax(A)
  3.900 μs (23 allocations: 5.25 KiB)
(-1.4340659161787122, 5.636707875939123)

julia> @btime a = eigvals(A)
  1.880 μs (10 allocations: 2.59 KiB)
5-element Vector{Float64}:
 -1.4340659161787122
 -0.4391186490192085
  0.15889223870067884
  1.5551498036607247
  5.636707875939123

So eigvals is more efficient even for a 5×5 matrix.

eigmin and eigmax seem to call eigvals internally, so this makes sense (and should be true regardless of the size of the matrix).

Right. You can only compute the “numerical rank” to some tolerance instead. See this discussion and references therein.

In numerical computations, it’s more natural to talk about the condition number of a matrix, which indicates “how singular” a matrix is (and generalizes to non-square cases). Even if a matrix is non-singular, if it is close enough to singular then you have to approach many calculations with care.

cond(H) ≈ 1e20, so for practical purposes you regard it as ill-conditioned.
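The tolerance view of rank is already built into the stdlib; a small sketch with the matrix above (the atol value is just an illustrative choice):

```julia
using LinearAlgebra

# rank() treats tiny singular values as zero by default, and the atol/rtol
# keywords control the cutoff.
H = [-1e-20 0; 0 1.0]
rank(H)                 # 1 with the default relative tolerance
rank(H, atol = 1e-30)   # 2 if the absolute tolerance is tightened
cond(H)                 # ≈ 1e20: severely ill-conditioned
```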


They call eigvals but (for Hermitian/real-symmetric matrices) pass the optional irange argument to compute only a subset of the eigenvalues. This saves some computation time, although the bulk of the time is spent on the Hessenberg/tridiagonal reduction as I commented above (so you ideally want to re-use that if you want both eigmin and eigmax).
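A quick sketch of the irange mechanism for a real-symmetric matrix (the 6×6 size is arbitrary):

```julia
using LinearAlgebra

# For Symmetric/Hermitian matrices, eigvals accepts an index range so that
# only the requested eigenvalues are computed.
A = Symmetric(randn(6, 6))
λmin = eigvals(A, 1:1)[1]   # only the smallest eigenvalue
λmax = eigvals(A, 6:6)[1]   # only the largest eigenvalue
```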

LAPACK isn’t very well optimized for tiny matrices like 5×5, so such benchmarks are mostly measuring overheads; in principle you can do much better for tiny matrices, e.g. using StaticArrays if the sizes are fixed.


To some extent you can determine if a Hermitian matrix is close to semidefinite by computing the Cholesky decomposition with diagonal pivoting P^T H P = L L^T and using the largest magnitude pivot on the diagonal in each step. If your largest available pivot at step r is small, the matrix is correspondingly close to a semidefinite matrix of rank r. There is an analysis of the procedure in this paper. The bounds on what you have to truncate aren’t as strong as with eigenvalues, but it works well in practice. However, I don’t know of an optimized implementation and I would guess that doing a lazy implementation would be slower than computing eigenvalues using LAPACK. As with unpivoted Cholesky, if you have a negative pivot, the matrix is close to an indefinite matrix.
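The stdlib does expose a diagonally pivoted Cholesky that reveals a numerical rank, which can serve as a starting point for this idea. A sketch (RowMaximum() requires Julia ≥ 1.8; older versions use Val(true), and the rank-1 example matrix is my own illustration):

```julia
using LinearAlgebra

# Pivoted Cholesky on an exactly rank-1 positive-semidefinite matrix:
# the pivoting stops once no positive pivot remains, exposing the rank.
x = [1.0, 2.0, 3.0]
H = Hermitian(x*x')                          # rank 1, positive semidefinite
F = cholesky(H, RowMaximum(); check = false) # don't throw on rank deficiency
F.rank                                       # numerical rank found by pivoting
```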
