QR decomposition with Julia. How to?

Even if you want the QR factors too, you can take the factorization object and use \ on it to get the least-squares solution:

julia> using LinearAlgebra

julia> X = rand(100,3); y = rand(100);

julia> F = qr(X);

julia> F \ y ≈ X \ y ≈ F.R \ (Matrix(F.Q)' * y) ≈ F.R \ (F.Q' * y)[1:size(X,2)]
true

One of the tricky things about the QR factorization is that there are two variants when the m \times n matrix X is “tall” (m > n, as is typical in least-squares problems): the “thin” variant X = \hat{Q}\hat{R}, in which \hat{Q} is the same “tall” m \times n size as X and \hat{R} is square, and the “full” variant X = QR, in which Q is a square unitary m \times m matrix and R is “tall” m \times n. If you do F = qr(X) then:

  • F.Q represents the square unitary matrix Q from the “full” QR factorization. (This is also what you get from Q, R = qr(X).) Technically, it doesn’t actually store this matrix explicitly — instead, it has a compact representation of the linear operator Qx so that F.Q * x and/or F.Q' * y (=Q^* y) can be computed quickly. (Essentially, this is what the underlying numerical algorithms “natively” give you.)
  • Matrix(F.Q) gives you \hat{Q}, the “thin” QR factor, as an explicit matrix. Computing this is not free, however (it basically works by multiplying F.Q by the first n columns of the identity matrix I), so you should generally avoid constructing it if you can.
  • F.R always gives you the square upper-triangular matrix \hat{R} from the thin QR. (This is also the R you get from Q, R = qr(X).) A quick size check is sketched just after this list.
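
To make the two variants concrete, here is a quick size check (a sketch continuing the 100×3 example above, reusing the F and y defined there):

julia> size(Matrix(F.Q)), size(F.R)   # thin factors: Q̂ is 100×3, R̂ is 3×3
((100, 3), (3, 3))

julia> length(F.Q' * y)   # F.Q acts as the full 100×100 Q, so Q'*y has length 100
100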

When you find the least-squares solution from QR, what you want is \hat{R}^{-1} \hat{Q}^* y using the thin QR factors. In particular, this means that if you want to compute \hat{Q}^* y you cannot simply use F.Q' * y, and instead need to either use something like Matrix(F.Q)' * y or, more efficiently, (F.Q' * y)[1:n] (the product F.Q' * y is the length-m vector Q^* y, whose first n entries are exactly \hat{Q}^* y).
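
For illustration, here is a minimal sketch of that manual least-squares solve using the thin factors (X, y, and F are from the example above; n and β are just names chosen here):

julia> n = size(X, 2);

julia> β = UpperTriangular(F.R) \ (F.Q' * y)[1:n];   # R̂ \ (Q̂' y), solved by backsubstitution

julia> β ≈ X \ y
true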

It’s generally better to use F \ y or, even better, X \ y (which uses a pivoted QR factorization automatically) to solve least-squares problems:

  • F \ y will deal with the size of the Q matrix for you efficiently. It will also do the upper-triangular solve \hat{R}^{-1} \text{(vector)} efficiently — since this is an upper-triangular system you can solve it in O(n^2) operations using backsubstitution. (Get out of the habit of inverting matrices! inv(A) * b is almost never going to be the best way to solve a linear system.)
  • X \ y is even better, because it defaults to a pivoted QR factorization (corresponding to F = qr(X, ColumnNorm())), which deals better with badly conditioned X (e.g. nearly parallel columns). That’s not what qr(X) gives you by default because it involves an extra permutation matrix compared to ordinary QR, i.e. it is a factorization XP = QR where P is a permutation; a sketch is shown after this list. See also Efficient way of doing linear regression - #33 by stevengj
  • Even if you want to explicitly compute \hat{R}^{-1} z for some z, it makes my eyes water to see people write inv(R) * z (in Julia) or inv(R).dot(z) (in Python). Get out of the habit of inverting matrices. At the very least, do R \ z (in Julia) or scipy.linalg.solve(R, z) (in Python). Better yet, take advantage of the fact that R is upper-triangular by doing UpperTriangular(R) \ z (in Julia) or scipy.linalg.solve_triangular(R, z) (in Python).
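
For illustration, a sketch of the pivoted factorization mentioned above (same X and y as before; Fp is just a name chosen here):

julia> Fp = qr(X, ColumnNorm());

julia> Fp \ y ≈ X \ y   # the pivoted factorization object also gives the least-squares solution
true

julia> X[:, Fp.p] ≈ Matrix(Fp.Q) * Fp.R   # Fp.p is the column-permutation vector, i.e. X P = Q R
true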