I agree that 2 and 3 will be less confusing than the current state, but that doesn’t make them more correct. And yes any of the solutions from 2 to 6 above would have saved me a few hours of debugging.
I have never seen (2) anywhere, do you have a reputable reference for this? (1) is what I know as the textbook definition.
Just to clarify: I think symmetry should NOT be part of the definition of positivity.
For non-Hermitian arguments to `isposdef` you enumerate:
In principle I prefer the less restricted definition (on the grounds of semantic or pedantic coherence, as well as your openness) and therefore (5). For the record, checking symmetry has the same cost as extracting the symmetric part, so efficiency does not figure here. (The methods for (1) must either make a copy or destroy the input.) Anyone who needs it could implement such a function.
But as we’ve seen, there is disagreement even among experts about the definition of “positive definite”. So IMHO the developers can pick the definition they like and document it clearly; with that caveat I don’t object to (1). (Good thing, since it’s nearly cast in magnetic domains: PR here.)
Regarding the above: note that exact symmetry is easy to enforce, so I'd say it's fair to use it to determine a choice of algorithm and the types of results (e.g. for eigenvalues), and therefore also the result here, if the restrictive definition of P.D. is used.
On a related note, I will be interested to learn if wrappers like `Hermitian(A)` are practically universal replacements for `A = (A + ctranspose(A))/2`, which I have used many times in the past.
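For what it's worth, here is a minimal sketch of the difference in current Julia (the matrix is made up, and `A'` is the modern spelling of `ctranspose(A)`): the wrapper mirrors one triangle, while the explicit expression averages both, so they only agree when `A` is already Hermitian.

```julia
using LinearAlgebra

# A deliberately non-Hermitian matrix with a real diagonal
# (the Hermitian constructor requires the diagonal to be real).
A = [1.0+0.0im  2.0+1.0im;
     0.0+0.0im  3.0+0.0im]

# Explicit symmetrization: allocates a new matrix equal to the
# Hermitian part of A.
H1 = (A + A') / 2

# Lazy wrapper: no copy; only the upper triangle of A is read, and the
# lower triangle is implied by conjugate symmetry.
H2 = Hermitian(A)

H1 == H2   # false here: H1[1,2] == 1.0 + 0.5im, but H2[1,2] == 2.0 + 1.0im
```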
I’m happy to be told otherwise, but I actually doubt that experts disagree. Rather, I think some lecturers just for the sake of convenience considered only the hermitian case.
AFAIK the wrappers in LinearAlgebra are lazy wrappers that use either the upper or lower triangle. I'm not sure how the internal algorithms later on are implemented; I'm just noting that this means we have no cache-friendly way of iterating over a column of a wrapped matrix, which may or may not be expensive.
I have not seen (2) either. I stated it only because I interpreted it as the definition that some people wanted in the earlier posts, and I wanted to point out that it is in violation of `x'*A*x > 0` for all non-zero `x`, which is the only definition I ever came across.
I do still think that, if Julia wants to use the definition `x'*A*x > 0`, then it should not assume `x` is real just because `A` happens to be real. Julia functions tend to be generic, and in that case `A` being hermitian or symmetric automatically follows from the definition by including complex `x`, irrespective of `eltype(A)`. So I certainly don't agree with

> Rather, I think some lecturers just for the sake of convenience considered only the hermitian case.
If the extension to the non-symmetric/non-hermitian case is wanted, then the definition in the docs should change to `real(x'*A*x) > 0`. But that definitely seems to be a more specific/lesser-known definition.
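To make that distinction concrete, here is a small sketch (the matrix and vectors are made up): for a real non-symmetric `A`, the quadratic form is positive for every real `x`, but already complex-valued for complex `x`, so `x'*A*x > 0` stops being well-defined without taking the real part.

```julia
# A real, non-symmetric matrix whose symmetric part (A + A')/2 is the
# identity, hence positive definite in the "symmetric part" sense.
A = [1.0  1.0;
    -1.0  1.0]

# For real x the skew-symmetric part cancels, so x'*A*x > 0 for every x ≠ 0:
x = [1.0, 2.0]
x' * A * x   # 5.0

# For complex x the form picks up an imaginary part, so "> 0" no longer applies:
z = [1.0 + 0.0im, 0.0 + 1.0im]
z' * A * z   # 2.0 + 2.0im   (' is the conjugate transpose)
```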
```julia
a = Complex(3);
a > 0
# ERROR: MethodError: no method matching isless(::Int64, ::Complex{Int64})
```
Complex numbers are never positive in Julia, as is the case e.g. for `x'*A*x` with complex `x` or `A`. So Julia is definitely not using positivity of `x'*A*x`, despite all claims in the documentation.
I would consider it a pretty brutal footgun if `Base.isless(x::Complex, y::Complex) = iszero(x.im) && iszero(y.im) && isless(x.re, y.re)`. This is essentially the current behavior of `isposdef`.
edit: Literally.
```julia
using LinearAlgebra

A = Matrix{ComplexF32}(undef, 1, 1)   # Complex64 was renamed ComplexF32 in Julia 0.7
A[1, 1] = 1.0
isposdef(A)
# true
A = A .+ Complex(0.0, 2.0^(-1000))    # promotes to ComplexF64; adds a tiny imaginary part
isposdef(A)
# false
```
Unfortunately, `isposdef` has also been defined for complex numbers, in this way:

```julia
isposdef(x::Number) = imag(x)==0 && real(x) > 0
```

which is in line with the definition for matrices. So there is no consistent relation between `<`/`isless` for real numbers and `isposdef` for numbers in general and for matrices. For me, that is a non-mathematical approach, as is the current implementation of `isposdef`. I fear we are flogging a dead horse here, however. Accidents with footguns seem to be daily life in certain civilized countries, though.
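A quick illustration of that inconsistency (behavior as of recent Julia versions, as far as I can tell):

```julia
using LinearAlgebra

isposdef(Complex(3, 0))       # true:  zero imaginary part, positive real part
isposdef(Complex(3, 1e-300))  # false: any nonzero imaginary part disqualifies it
Complex(3, 0) > 0             # ERROR: MethodError: no method matching isless(...)
```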
Argh! And it is crazy, because Julia is otherwise pretty consistent:

```julia
# Complex64 is the pre-0.7 name of ComplexF32
convert(Real, Complex64(3,0)) == 3.0f0   # true
convert(Real, Complex64(3.0,2.0^(-20)))
# ERROR: InexactError: Real(Real, 3.0f0 + 9.536743f-7im)
```
Isn’t it that “complex numbers are never positive”, period (full stop)? I guess that explains taking the real component before checking for positivity of the quadratic form (which returns a complex number) in the Wolfram definition. I hope I am not rusty on basic math, so correct me if I am missing something.
Is `Complex(3,0)` a complex number? Is it a real number?

If you consider the reals as a subset of the complex numbers, then it is both, and larger than zero. If you have a type system (like Julia), then it is a complex and not a real number; it is neither larger-than-zero nor not-larger-than-zero; its larger-than-zero-ness is a `MethodError`.
Most mathematicians consider `Complex(3,0)` as both real and complex. It is larger than zero. Generally, complex numbers can be larger-than-zero, not-larger-than-zero, or not-applicable (`MethodError`); that is, they compare like `convert(Real, x)` and throw an `InexactError` if there is a remaining imaginary part (only that we would need to give a proof that this never happens).
Julia takes a pretty inconsistent approach: a complex number is never larger-than-zero (comparison is not well-defined modulo `isequal`; `isequal` is DWIM; meh, OK). If the number happens to have zero imaginary part and the real part is positive, then it is `isposdef`. OK, makes sense: `isposdef(x) = isposdef(convert(Real, x))`. If it happens to have a non-vanishing imaginary part then… drumroll… it is not-`isposdef`, instead of an `InexactError`. Argh!
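For concreteness, a sketch of the comparison semantics being advocated here (`complex_isless` is a made-up name; this is NOT current Julia behavior):

```julia
# Hypothetical: compare a complex number as if it were real, and throw
# InexactError when a genuine imaginary part makes the comparison
# meaningless, instead of silently answering "not positive".
complex_isless(x::Complex, y::Real) = isless(convert(Real, x), y)

complex_isless(Complex(3, 0), 0)   # false: 3 is not less than 0
complex_isless(Complex(3, 2), 0)   # ERROR: InexactError, instead of an answer
```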
Yes, this is another pivot point. I guess what I was trying to say is that if we commit to allowing complex numbers in `x` and `A`, then the restricted definition's quadratic form will be a complex number which happens to have a zero imaginary component because `A` is Hermitian. This means that an equivalent (less ambiguous) definition would be "`A` is Hermitian and real(x^*Ax) > 0". This naturally generalizes to the Wolfram definition when the first condition is dropped, much like the two definitions for real matrices.

The ambiguity comes from `>` not being defined for general complex numbers, especially since computational error can easily introduce a small imaginary part. So only matrices whose quadratic form's imaginary part is computed in exact arithmetic can be positive definite; or, equivalently in the Julia case with the restricted definition, matrices have to be perfectly Hermitian, which again seems easy to break: even a small perturbation can make `isposdef` return false. My comment is still missing a valid reference to support it, but I am just sharing some thoughts on Wolfram's definition.
Well, my latest find is R. Bhatia’s book Positive Definite Matrices. Bhatia uses the definition x^{*}Ax > 0\, \forall\, x\ne 0,\, x\in \mathbb{C}^n. The implication that A is Hermitian is asserted on the first page. He seems to assume silently that non-real complex numbers are simply not positive.
Amusingly, I was led to Bhatia by Golub & van Loan, who concentrate on \mathbb{R}, leading to the alternative implication, which @mohamed82008 and I prefer, that only the symmetric part of A counts, and a logical extension to \mathbb{C} is the right half plane.
More amusingly, Prof. Bhatia was apparently an editor for the SIAM journal which published the papers I mentioned on “non-Hermitian positive definite matrices”.
Here are a bunch more references. The strongest statement on non-Hermitian matrices is Householder's.
- Ayres, F. Jr. *Schaum's Outline of Theory and Problems of Matrices*. New York: Schaum, p. 134, 1962.
  - "A real non-singular quadratic form q = X'AX, |A| \neq 0, in n variables is called positive definite if its rank and index are equal."
  - "The matrix A of a real quadratic form q = X'AX is called definite or semi-definite according as the quadratic form is definite or semi-definite."
- Marcus, M. and Minc, H. *Introduction to Linear Algebra*. New York: Dover, p. 182, 1965.
  - "A hermitian matrix, or a hermitian transformation, is called positive definite if all its characteristic roots are positive. It is said to be positive semidefinite if all its characteristic roots are nonnegative. A positive semidefinite hermitian matrix or transformation is also called nonnegative hermitian."
- Householder, Alston S. *The Theory of Matrices in Numerical Analysis*. Blaisdell Publishing Company, 1964.
  - "A Hermitian matrix A is said to be positive definite in case x \neq 0 \rightarrow x^HAx > 0; it is nonnegative semidefinite in case x^HAx \geq 0 for every x. For non-Hermitian matrices the concept is not defined."
I think Householder would have voted for throwing an error then!
Not at all, I find it informative. I am just curious to see a real-world computational (not theoretical) example where one would test for positive definiteness, but not use the resulting factorization.
Usually, PD is not an end in itself, but a condition for using a factorization that reveals something useful about the structure of a matrix. At the same time, checking for PD itself usually involves a factorization, so the two are intertwined. This approach is embodied in e.g. … and the packages that use it.
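As one concrete pattern (a minimal sketch; the matrix and right-hand side are made up): in current Julia, `isposdef` essentially attempts a Cholesky factorization and discards it, so doing the factorization explicitly lets you both test for PD and reuse the result.

```julia
using LinearAlgebra

A = [4.0 1.0; 1.0 3.0]
b = [1.0, 2.0]

# isposdef(A) internally attempts a Cholesky factorization and throws the
# result away; doing the factorization yourself lets you keep it.
F = cholesky(Hermitian(A); check = false)
if issuccess(F)
    x = F \ b          # positive definite: reuse the factorization to solve A*x = b
else
    # not positive definite: fall back, perturb, or report
end
```

This is also why calling `isposdef` right before `cholesky` does the factorization work twice.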