Gram-Schmidt decomposition

Hi! Is there any specific module in Julia for performing linear algebra operations like Gram-Schmidt orthogonalization for a given set of vectors?

1 Like

qr in the LinearAlgebra standard library does this.

(Technically, Gram–Schmidt is just one possible algorithm for QR factorization; in practice linear-algebra libraries tend to use different algorithms instead.)

5 Likes

Hi! Thanks for the reply. However, could you elaborate a bit more?
Specifically, how exactly do I obtain the orthonormalized vectors from the given set of vectors?

P.S. ~ I knew nothing about QR decomposition until this point.

1 Like

Do using LinearAlgebra to load the LinearAlgebra module, then do Q = Matrix(qr(A).Q) where A is a matrix whose columns are the vectors you want to orthonormalize. This yields a matrix Q whose columns are the orthonormalized columns of A, equivalent (modulo roundoff errors) to Gram–Schmidt on the columns of A (also called the “thin” QR factor).
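
For example, something along these lines (the 5×3 matrix here is just an illustration):

using LinearAlgebra
A = rand(5, 3)        # three vectors of length 5, stored as columns
Q = Matrix(qr(A).Q)   # 5×3 matrix whose columns are orthonormal
Q'Q ≈ I               # true: the columns are orthonormal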

9 Likes

Thanks for clarifying! I’m new to Julia and didn’t know this stuff before now.

1 Like

No problem. This is not specific to Julia, but a lot of introductory linear-algebra courses don’t emphasize the connection of Gram–Schmidt to QR factorization.

Introductory linear-algebra classes often emphasize simple (perhaps simplistic) algorithms for hand calculation, whereas practical numerical linear algebra emphasizes matrix factorizations: for example, LU factorization (instead of Gaussian elimination), QR factorization (instead of Gram–Schmidt), and diagonalization (instead of finding roots of characteristic polynomials).
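
In Julia all of these correspond to factorization functions in LinearAlgebra; a quick sketch (the matrix is arbitrary):

using LinearAlgebra
A = rand(4, 4)
lu(A)      # LU factorization (what Gaussian elimination computes)
qr(A)      # QR factorization (what Gram–Schmidt computes)
eigen(A)   # diagonalization: eigenvalues and eigenvectors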

Moreover, one often doesn’t work with the factors directly as matrices, but instead treats them as linear operators. For example, when working with QR factors (i.e. orthogonalized bases), typically one doesn’t need to look at the elements of Q explicitly, but instead one only needs to multiply by Q or Q^T. Julia’s qr(A).Q is actually an object that doesn’t literally store the elements of Q, but can be multiplied quickly by vectors and matrices. That’s why Matrix(qr(A).Q) is required if you want an explicit matrix of orthogonal vectors — in large-scale problems, you typically try to avoid this and instead work with Q implicitly as an operator.
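
For instance, a small sketch of using Q as an implicit operator (sizes here are arbitrary):

using LinearAlgebra
A = rand(1000, 10)
F = qr(A)             # F.Q is stored implicitly via Householder reflectors
b = rand(1000)
c = F.Q' * b          # fast product with Qᵀ; no 1000×1000 matrix is ever formed
Qthin = Matrix(F.Q)   # explicit 1000×10 “thin” Q, only if you really need it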

26 Likes

Here is an implementation: Gram–Schmidt Orthogonalization with Julia - Julia Community 🟣

1 Like

Note that this is “classical” Gram–Schmidt (CGS), which is numerically unstable (unless you do it twice). There is a simple variant called “modified Gram–Schmidt” (MGS) that fixes this with a one-line change. Julia implementations of both CGS and MGS, along with an illustration of the roundoff errors, are given in my MIT course notebook.

As explained above and mentioned in the notebook, however, you are normally going to be much better off using qr(A), which computes the same result (accurately) via the Householder algorithm and is much faster than a textbook Gram–Schmidt implementation (whether classical or modified).
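
If you just want to see the structure of the algorithm, here is a minimal MGS sketch (my own illustration, not the notebook’s code); the difference from CGS is that each inner product uses the partially orthogonalized column rather than the original one:

using LinearAlgebra
function mgs(A)
    Q = float.(A)             # work on a floating-point copy of the columns
    n = size(Q, 2)
    R = zeros(eltype(Q), n, n)
    for j in 1:n
        R[j, j] = norm(Q[:, j])
        Q[:, j] ./= R[j, j]   # normalize column j
        for k in j+1:n        # immediately orthogonalize the remaining columns against it
            R[j, k] = dot(Q[:, j], Q[:, k])
            Q[:, k] .-= R[j, k] .* Q[:, j]
        end
    end
    return Q, R               # Q has orthonormal columns, R is upper triangular, A ≈ Q*R
end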

7 Likes

This has come up before. You might look at this and this.

There is also a fast orthonormalization via the Cholesky decomposition. One must be a bit careful, though: it’s not as numerically stable as the QR method.

using LinearAlgebra
A = rand(5,3)  # column vectors of length 5
U = cholesky(A'A).U
B = A*inv(U)   # orthonormal column vectors spanning the same subspace.

Alternatively,

L = cholesky(A'A).L
B = (L\A')'

Note that this is not faster in the complexity sense — all of these algorithms are \Theta(mn^2) for an m \times n matrix A. But it is probably faster in actual time (i.e. a better “constant coefficient” of the mn^2) for m \gg n, similar to solving least-squares problems by the normal equations as discussed in Efficient way of doing linear regression - #33 by stevengj … at the expense of accuracy as you point out.

Technically, there is no such thing as “not as stable”. It’s a boolean choice: an algorithm is either numerically stable or not. It’s a question of whether the (backward) error goes to zero (at worst) linearly with the precision \varepsilon or not, i.e. whether the backward error is O(\varepsilon).

This algorithm (which squares the condition number of A) is definitely much less accurate than Householder QR or modified Gram–Schmidt. I don’t know offhand whether it is numerically unstable (i.e., its backward errors decrease more slowly than linearly), but my guess is that it is probably unstable.

Aside: One should get out of the habit of using matrix inverses for numerical calculations. One should instinctively do A / U rather than A * inv(U). In this case, A / U exploits the fact that U is upper triangular to avoid the O(n^3) complexity of inversion.
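
For instance, with the same A and U as in the code above (a sketch):

using LinearAlgebra
A = rand(5, 3)
U = cholesky(A'A).U
B = A / U    # solves B*U = A by back-substitution; same result as A*inv(U)
B'B ≈ I      # true (up to roundoff): the columns of B are orthonormal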

6 Likes

I don’t think there is much written about the stability of the Cholesky approach, perhaps because it isn’t taken very seriously as a useful algorithm. There is a throwaway comment in a fairly well-cited technical report suggesting that it is better than classical Gram–Schmidt without reorthogonalization, but that seems to be stated mostly to disparage Gram–Schmidt without reorthogonalization.

To summarize the usual error bounds, the backward factorization errors \|A - QR\| / \|A\| are nicely bounded as O(u) for unit roundoff u in classical Gram–Schmidt, modified Gram–Schmidt, Cholesky, and Householder and Givens QR. The difference is in the numerical orthogonality of the computed Q. For that, classical Gram–Schmidt satisfies no useful bound. For modified Gram–Schmidt it is O(u) \kappa(A). And for Householder and Givens it’s just O(u) with some factors depending on matrix size. Based on a quick-and-dirty error analysis, I’m pretty sure that for Cholesky it is O(u) \kappa^2(A), but I’ve never seen that written down anywhere. So I think it’s somewhere between classical and modified Gram–Schmidt. And you don’t have the option for reorthogonalization that you have with CGS. As far as I know, it’s an algorithm without any compelling use.
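
A quick numerical experiment (my own illustration, not from that report) is consistent with those orthogonality bounds:

using LinearAlgebra
m, n = 100, 10
# build A with singular values spaced from 1 down to 1e-6, so κ(A) ≈ 1e6
Uo = Matrix(qr(randn(m, n)).Q)
Vo = Matrix(qr(randn(n, n)).Q)
A = Uo * Diagonal(exp10.(range(0, -6; length=n))) * Vo'

Qh = Matrix(qr(A).Q)                  # Householder QR
Qc = A / cholesky(Hermitian(A'A)).U   # Cholesky-based orthonormalization

norm(Qh'Qh - I)   # on the order of machine precision, i.e. O(u)
norm(Qc'Qc - I)   # many orders of magnitude larger, consistent with O(u) κ(A)^2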

perhaps because it isn’t taken very seriously as a useful algorithm

This isn’t quite correct. Cholesky is great where applicable (only for matrices that you know are already posdef, and with relatively small condition numbers). It’s less stable than QR, but it will be ~3x faster.

2 Likes

Here you are applying it to A'A, so it’s always applicable (in theory), but it squares the condition number. (Unlike applying Cholesky to solve Ax=b where A is SPD.)

Fair enough. I probably should have qualified that to be more about generally applicable algorithms.

You don’t need to worry about positive definiteness in the well-conditioned case. A^T A is always going to be mathematically positive definite, although Cholesky might still fail if A is ill-conditioned.