Even if you want the QR factors too, you can take the factorization object and use `\` for the least-squares solution:
```julia
julia> using LinearAlgebra

julia> X = rand(100,3); y = rand(100);

julia> F = qr(X);

julia> F \ y ≈ X \ y ≈ F.R \ (Matrix(F.Q)' * y) ≈ F.R \ (F.Q' * y)[1:size(X,2)]
true
```
One of the tricky things about the QR factorization is that there are two variants when the m \times n matrix X is “tall” (m > n), as is typical in least-squares problems: the “thin” variant X = \hat{Q}\hat{R}, in which \hat{Q} is the same “tall” m \times n size as X and \hat{R} is square, and the “full” variant X = QR, in which Q is a square unitary m \times m matrix and R is “tall” m \times n. If you do `F = qr(X)`, then (a quick shape check follows this list):
- `F.Q` represents the square unitary matrix Q from the “full” QR factorization. (This is also what you get from `Q, R = qr(X)`.) Technically, it doesn’t actually store this matrix explicitly — instead, it has a compact representation of the linear operator Qx so that `F.Q * x` and/or `F.Q' * y` (= Q^* y) can be computed quickly. (Essentially, this is what the underlying numerical algorithms “natively” give you.)
- `Matrix(F.Q)` gives you \hat{Q}, the “thin” QR factor, as an explicit matrix. Computing this is not free, however — it basically works by multiplying `F.Q` by the first n columns of the identity matrix I — so you generally avoid constructing it if you can.
- `F.R` always gives you the square upper-triangular matrix \hat{R} from the thin QR. (This is also what you get from `Q, R = qr(X)`.)
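To make the shapes concrete, here is a quick check in the same spirit as the session above (same 100×3 `X`; an illustrative snippet, so exact printing may differ across Julia versions):

```julia
julia> Qhat = Matrix(F.Q);      # explicitly form the "thin" Q̂ (not free!)

julia> size(Qhat), size(F.R)    # Q̂ is the same 100×3 size as X, R̂ is square 3×3
((100, 3), (3, 3))

julia> Qhat * F.R ≈ X           # the thin factors reconstruct X
true
```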
When you find the least-squares solution from QR, what you want is \hat{R}^{-1} \hat{Q}^* y using the thin QR factors. In particular, this means that if you want to compute \hat{Q}^* y you cannot simply use `F.Q' * y`, and instead need to either use something like `Matrix(F.Q)' * y` or, more efficiently, `(F.Q' * y)[1:n]`.
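For instance, continuing the illustrative session from above, the size mismatch is easy to see:

```julia
julia> n = size(X, 2);

julia> length(F.Q' * y)            # F.Q acts like the full 100×100 Q, so Q'y has 100 entries
100

julia> length(Matrix(F.Q)' * y)    # the thin Q̂'y has only n = 3 entries
3

julia> F.R \ (F.Q' * y)[1:n] ≈ X \ y   # keep only the first n entries before the triangular solve
true
```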
It’s generally better to use `F \ y` or, even better, `X \ y` (which uses a pivoted QR factorization automatically) to solve least-squares problems (see the sketch after this list):
- `F \ y` will deal with the size of the Q matrix for you efficiently. It will also do the upper-triangular solve \hat{R}^{-1} \text{(vector)} efficiently — since this is an upper-triangular system, you can solve it in O(n^2) operations using backsubstitution. (Get out of the habit of inverting matrices! `inv(A) * b` is almost never going to be the best way to solve a linear system.)
- `X \ y` is even better, because it defaults to a pivoted QR factorization — corresponding to `F = qr(X, ColumnNorm())` — which deals better with badly conditioned `X` (e.g. nearly parallel columns). That’s not what `qr(X)` gives you by default because it involves an extra permutation matrix compared to ordinary QR, i.e. it is a factorization XP = QR where P is a permutation. See also Efficient way of doing linear regression - #33 by stevengj
- Even if you want to explicitly compute \hat{R}^{-1} z for some z, it makes my eyes water to see people write `inv(R) * z` (in Julia) or `inv(R).dot(z)` (in Python). Get out of the habit of inverting matrices. At the very least, do `R \ z` (in Julia) or `scipy.linalg.solve(R, z)` (in Python). Better yet, take advantage of the fact that `R` is upper-triangular by doing `UpperTriangular(R) \ z` (in Julia) or `scipy.linalg.solve_triangular(R, z)` (in Python).
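Putting those recommendations together, here is a small sketch of the preferred calls (again using the random 100×3 `X` and `y` from above; `z` is just an arbitrary right-hand side made up for illustration):

```julia
julia> Fp = qr(X, ColumnNorm());       # column-pivoted QR, i.e. a factorization XP = QR

julia> Fp \ y ≈ X \ y                  # X \ y uses this pivoted factorization for you
true

julia> z = rand(size(X, 2));

julia> UpperTriangular(F.R) \ z ≈ inv(F.R) * z   # same result, but the triangular solve is O(n^2) and avoids inv
true
```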