Hi friends,
I have published a blog post on Exploring Quadratic forms with Julia.
I just tried to make my understanding of quadratic forms clearer using Julia.
Please give your feedback on improvements, more efficient code, etc.
A few comments:
For convenience, I copy&paste the code from Forem here:
using LinearAlgebra

function getDefiniteness(A)
    @assert isequal(A,A') "Matrix is not symmetric"
    if true == isposdef(A)
        return "Matrix is positive definite"
    end
    x,y = eigmin(A),eigmax(A)
    if iszero(x)
        return "Matrix is positive semi definite"
    end
    if y > 0.0 && x < 0.0
        return "Matrix is indefinite"
    end
    if iszero(y)
        return "Matrix is negative semi definite"
    end
    if all([x, y] .< 0)
        return "Matrix is negative definite"
    end
end

A = rand(5,5)
A = (A+A')/2
A = [-1 0;0 -1.0]
getDefiniteness(A)
First, you do not have to write
if true == isposdef(A)
Instead, you can write directly
if isposdef(A)
Second, since you compute eigenvalues for your matrix in order to distinguish between definiteness and indefiniteness, you can skip the isposdef() computation altogether to save some computational effort (should that be an issue in a particular situation). Once you have the eigenvalues, you can use them for checking indefiniteness too.
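For illustration, here is a minimal sketch of that idea (the function name is mine, and the exact-zero tests are kept from your code just to stay close to it; see the next point about why they are problematic):

using LinearAlgebra

# Sketch: one eigvals call, then every case is read off the smallest and
# largest eigenvalues (eigvals returns them sorted in ascending order for
# symmetric/Hermitian input).
function getDefinitenessFromEigvals(A)
    λ = eigvals(Symmetric(A))
    x, y = first(λ), last(λ)
    x > 0     && return "Matrix is positive definite"
    iszero(x) && return "Matrix is positive semi definite"
    y < 0     && return "Matrix is negative definite"
    iszero(y) && return "Matrix is negative semi definite"
    return "Matrix is indefinite"
end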
Third, in finite-precision arithmetic, it rarely makes sense to check whether something is exactly zero, as you do with
if iszero(x)
Consider the following example
A = [1e-100 0;0 1.0]
The matrix is classified as positive definite by your function
julia> getDefiniteness(A)
"Matrix is positive definite"
while for all practical purposes it is singular (positive semidefinite).
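A common workaround (the sqrt(eps()) factor below is only an illustrative choice, not a universal constant) is to compare against a tolerance that scales with the matrix:

using LinearAlgebra

A = [1e-100 0; 0 1.0]
x, y = eigmin(A), eigmax(A)
# Illustrative relative tolerance: treat an eigenvalue as "numerically zero"
# when it is negligible compared to the largest-magnitude eigenvalue.
tol = sqrt(eps(Float64)) * max(abs(x), abs(y))
abs(x) <= tol   # true here, so the matrix is better reported as semidefinite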
Finally, although I am not familiar with the implementation details of eigmin, by measuring the execution times and seeing that they are about the same as those for eigvals, I suspect that all eigenvalues are computed even if you only request the smallest one. I conclude that instead of calling eigmin(A) and then eigmax(A), it is perhaps more efficient to call eigvals just once…
You can just use ishermitian(A). And after that I would set H = Hermitian(A) and use H instead of A (so that Julia will know to use more efficient algorithms).
And normally you would not use @assert for this — you should assume that @assert is something that might be removed in production code, and it should only be used to check for bugs. Throw an exception instead, e.g.

ishermitian(A) || throw(ArgumentError("matrix must be Hermitian"))
It is more idiomatic to simply do if isposdef(A) (or isposdef(H) if you use the Hermitian wrapper).
This allocates two arrays ([x, y] is one array, and [x, y] .< 0 is another array), just to check if two numbers are negative. Putting everything into an array, even if it is just two scalars, is kind of a Matlab anti-pattern. Just check if x < 0 && y < 0.
I would do the opposite: if you want to check whether a matrix is negative definite, it's probably more efficient to call isposdef(-A) or isposdef(-H) than to look at eigenvalues. The reason is that you can check positive-definiteness using only a Cholesky factorization, which is much cheaper than computing eigenvalues.
For example, with a 1000 × 1000 negative-definite matrix:
julia> A = randn(1000,1000); A = -A'A; H = Hermitian(A);
julia> @btime isposdef($H) || isposdef(-$H);
5.998 ms (6 allocations: 22.89 MiB)
julia> @btime eigvals($H);
42.406 ms (11 allocations: 7.99 MiB)
julia> @btime (eigmin($H), eigmax($H));
67.561 ms (22 allocations: 15.96 MiB)
julia> @btime eigmin($H);
31.952 ms (11 allocations: 7.98 MiB)
Notice that checking both isposdef(H) and isposdef(-H) is 7x faster than computing eigvals(H).
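A minimal sketch of how the two checks could be combined (the function name is mine, and the semidefinite cases are deliberately not distinguished, for the reasons discussed below):

using LinearAlgebra

# Sketch: classification using only Cholesky-based checks; no eigenvalues.
function quickdefiniteness(H::Hermitian)
    isposdef(H)  && return "Matrix is positive definite"
    isposdef(-H) && return "Matrix is negative definite"
    return "Matrix is indefinite (or numerically semidefinite)"
end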
No (at least not for a Hermitian matrix). Notice that for my example above, eigmin(H) is about 25% faster than eigvals(H). However, it is still slower to call both eigmin and eigmax, since they don't share any computations.
In principle, it should be possible to compute both eigmin and eigmax more efficiently than computing all the eigenvalues, but you need to share the Hessenberg/tridiagonal reduction calculation (since that is the first step of any eigenvalue calculation, and is a big part of the cost):
function eigminmax(H::Hermitian)
    T = hessenberg(H).H # real SymTridiagonal reduction
    return eigmin(T), eigmax(T)
end
which gives
julia> @btime eigminmax($H);
32.574 ms (65 allocations: 8.27 MiB)
which is almost twice as fast as calling (eigmin(H), eigmax(H)), about the same speed as eigmin(H), and 25% faster than eigvals(H). However, it is still much slower than isposdef(H) (which computes no eigenvalues at all).
I perfectly agree that Cholesky-decomposition-based isposdef(H) is the way to go for checking positive definiteness of H (and negative definiteness of -H). But in the OP's code from Forem that I pasted here, it is not only positive definiteness and negative definiteness, but also positive semidefiniteness, negative semidefiniteness, and indefiniteness that must be determined. I think this imposes the need to compute eigenvalues, doesn't it? I do not know how I can distinguish between positive semidefiniteness and indefiniteness when detecting the failure to compute the Cholesky decomposition. Then my recommendation was simply: once you are forced to compute the eigenvalues, you can use them to check for pd-ness and nd-ness too.
You don't need eigenvalues. If it's not positive-definite or negative-definite, you should assume it's indefinite.
You can't check for semi-definiteness numerically, because roundoff errors usually make it impossible to distinguish a zero eigenvalue from one that is slightly positive or slightly negative.
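For example (the exact value varies from run to run), a matrix that is exactly rank-deficient in exact arithmetic typically has a computed smallest eigenvalue that is a tiny nonzero number of either sign:

using LinearAlgebra

A = randn(3, 2)
B = Hermitian(A * A')   # rank 2, positive semidefinite in exact arithmetic
eigmin(B)               # typically on the order of ±1e-16, not exactly zero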
In addition, there are some pathological cases where ishermitian is much faster than isequal, e.g.
julia> using LinearAlgebra, SparseArrays, BenchmarkTools

julia> let A = sprandn(1000, 1000, 0.01);
           H = A + A'
           @btime ishermitian($H)
           @btime $H == $H'
       end;
24.727 μs (1 allocation: 8.00 KiB)
520.155 μs (0 allocations: 0 bytes)
I've been bitten by this before with really large, very sparse matrices where I ended up spending the majority of my computation time doing what I thought was a very cheap verification that my array was Hermitian.
julia> let A = sprandn(100000, 100000, 0.001);
           H = A + A'
           @btime ishermitian($H)
           @btime $H == $H'
       end;
97.176 ms (2 allocations: 781.36 KiB)
1.974 s (0 allocations: 0 bytes)
You can't check for semi-definiteness numerically, because roundoff errors usually make it impossible to distinguish a zero eigenvalue from one that is slightly positive or slightly negative.
I understand the argument, but isn't it the same situation as in determining the rank of a matrix? Say, if we want to determine if a matrix is singular or not. Strictly speaking, we would (effectively) never be able to detect a singular matrix numerically, wouldn't we?
For example, is this matrix
julia> H = [-1e-20 0; 0 1]
2×2 Matrix{Float64}:
-1.0e-20 0.0
0.0 1.0
singular?
julia> rank(H)
1
For practical purposes, shall we regard it as indefinite or positive semi-definite?
julia> @btime a,b = eigmin(A),eigmax(A)
3.900 μs (23 allocations: 5.25 KiB)
(-1.4340659161787122, 5.636707875939123)
julia> @btime a = eigvals(A)
1.880 μs (10 allocations: 2.59 KiB)
5-element Vector{Float64}:
-1.4340659161787122
-0.4391186490192085
0.15889223870067884
1.5551498036607247
5.636707875939123
julia> @btime a,b = eigmin(A),eigmax(A)
3.913 μs (23 allocations: 5.25 KiB)
(-1.4340659161787122, 5.636707875939123)
eigvals is efficient for a 5x5 matrix.
eigmin and eigmax seem to call eigvals internally, so this makes sense (and should be true regardless of the size of the matrix).
I understand the argument, but isn't it the same situation as in determining the rank of a matrix? Say, if we want to determine if a matrix is singular or not. Strictly speaking, we would (effectively) never be able to detect a singular matrix numerically, wouldn't we?
Right. You can only compute the "numerical rank" to some tolerance instead. See this discussion and references therein.
In numerical computations, it's more natural to talk about the condition number of a matrix, which indicates "how singular" a matrix is (and generalizes to non-square cases). Even if a matrix is non-singular, if it is close enough to singular then you have to approach many calculations with care.
H = [-1e-20 0; 0 1]
[…] For practical purposes, shall we regard it as indefinite or positive semi-definite?
cond(H) ≈ 1e20, so for practical purposes you regard it as ill-conditioned.
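For example (the rtol value below is purely illustrative), the answer you get from rank depends on the tolerance you are willing to accept:

using LinearAlgebra

H = [-1e-20 0; 0 1]
cond(H)                # ≈ 1e20, i.e. numerically singular in double precision
rank(H)                # 1 with the default tolerance
rank(H; rtol = 1e-25)  # 2 if you insist on a much smaller tolerance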
eigmin and eigmax seem to call eigvals internally, so this makes sense (and should be true regardless of the size of the matrix).
They call eigvals but (for Hermitian/real-symmetric matrices) pass the optional irange argument to compute only a subset of the eigenvalues. This saves some computation time, although the bulk of the time is spent on the Hessenberg/tridiagonal reduction as I commented above (so you ideally want to re-use that if you want both eigmin and eigmax).
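For example (sizes are illustrative), for a Hermitian matrix you can request just the smallest or largest eigenvalue through the index-range method, which is essentially what eigmin and eigmax do:

using LinearAlgebra

A = randn(1000, 1000)
H = Hermitian(A + A')
eigvals(H, 1:1)         # only the smallest eigenvalue (what eigmin computes)
eigvals(H, 1000:1000)   # only the largest eigenvalue (what eigmax computes)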
LAPACK isn't very well optimized for tiny matrices like 5x5, so benchmarks are mostly measuring overheads; in principle you can do much better for tiny matrices, e.g. using StaticArrays if the sizes are fixed.
To some extent you can determine if a Hermitian matrix is close to semidefinite by computing the Cholesky decomposition with diagonal pivoting, P^T H P = L L^T, using the largest-magnitude pivot on the diagonal in each step. If your largest available pivot at step r is small, the matrix is correspondingly close to a semidefinite matrix of rank r. There is an analysis of the procedure in this paper. The bounds on what you have to truncate aren't as strong as with eigenvalues, but it works well in practice. However, I don't know of an optimized implementation, and I would guess that doing a lazy implementation would be slower than computing eigenvalues using LAPACK. As with unpivoted Cholesky, if you have a negative pivot, the matrix is close to an indefinite matrix.
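For completeness, a rough sketch using the pivoted Cholesky that ships with LinearAlgebra (RowMaximum pivoting, Julia 1.8+; the tolerance and test matrix are illustrative) rather than the optimized routine from the paper:

using LinearAlgebra

A = randn(5, 3)
H = Hermitian(A * A')   # positive semidefinite of rank 3 (up to roundoff)
# The factorization stops once the largest remaining diagonal pivot drops
# below tol, so rank(F) gives a numerical-rank estimate for the PSD part.
F = cholesky(H, RowMaximum(); tol = 1e-10, check = false)
rank(F)                 # expected: 3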