QR decomposition with Julia. How to?

Hello, good morning.

How do I proceed with QR decomposition in Julia? I tried to port some Python code, but I'm having trouble with the qr function from Julia's LinearAlgebra module. I'm running into issues with matrix sizes and dimensions.

X = df.values[:, :-1].flatten()
X1 = np.vstack([X, np.ones(len(X))]).transpose()
Y = df.values[:, -1]

Q, R = qr(X1)
b = inv(R).dot(Q.transpose()).dot(Y)
print(b)

What have you tried? The Python code you’ve posted isn’t very helpful as it can’t be run (it uses an undefined variable df).

The help for qr includes an example:

  julia> A = [3.0 -6.0; 4.0 -8.0; 0.0 1.0]
  3×2 Matrix{Float64}:
   3.0  -6.0
   4.0  -8.0
   0.0   1.0

  julia> F = qr(A)
  LinearAlgebra.QRCompactWY{Float64, Matrix{Float64}, Matrix{Float64}}
  Q factor:
  3×3 LinearAlgebra.QRCompactWYQ{Float64, Matrix{Float64}, Matrix{Float64}}:
   -0.6   0.0   0.8
   -0.8   0.0  -0.6
    0.0  -1.0   0.0
  R factor:
  2×2 Matrix{Float64}:
   -5.0  10.0
    0.0  -1.0
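
For the Q, R = qr(X1) line in your Python snippet, note that the returned factorization object can be destructured the same way, or you can access the factors by property. A minimal sketch (nothing here is specific to your data; the matrix is just the one from the help example):

  julia> using LinearAlgebra

  julia> A = [3.0 -6.0; 4.0 -8.0; 0.0 1.0];

  julia> Q, R = qr(A);        # destructuring works: Q and R are the two factors

  julia> F = qr(A); F.Q, F.R; # equivalently, access them as properties of the factorization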

Hello, thanks for your reply. I tested it, but I'm still having trouble porting my code. I'll post the entire code and its output here.

import pandas as pd
from numpy.linalg import qr, inv
import numpy as np

df = pd.read_csv("https://bit.ly/3goOAnt", delimiter=",")
X = df.values[:, :-1].flatten()
X1 = np.vstack([X, np.ones(len(X))]).transpose()
Y = df.values[:, -1]

Q, R = qr(X1)
b = inv(R).dot(Q.transpose()).dot(Y)
print(b)
[1.93939394 4.73333333]

Python did some things behind the scenes that I can't figure out how to reproduce in the code I'm porting. The transpose may not be necessary, but it's hard to tell. I'm debugging both versions at the same time here; I'm almost sure there is a transpose in a place where it shouldn't be.

It looks like you are trying to solve the linear system Ay=b. If so, the \ operator is probably what you are looking for. Unless of course you need the QR factors again later on.

julia> X = rand(2,2)
2×2 Matrix{Float64}:
 0.466648   0.0506922
 0.0524715  0.170719

julia> y = rand(2)
2-element Vector{Float64}:
 0.28054948348570896
 0.34844414575884997

julia> rank(X)
2

julia> Q,R = qr(X)
LinearAlgebra.QRCompactWY{Float64, Matrix{Float64}, Matrix{Float64}}
Q factor:
2×2 LinearAlgebra.QRCompactWYQ{Float64, Matrix{Float64}, Matrix{Float64}}:
 -0.993738  -0.111739
 -0.111739   0.993738
R factor:
2×2 Matrix{Float64}:
 -0.469589  -0.0694508
  0.0        0.163986

julia> Q*R ≈ X
true

julia> inv(R)*transpose(Q)*y ≈ X\y
true

For solving large systems, LinearSolve.jl should be considered.
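
For illustration, a minimal sketch of what that could look like (assuming LinearSolve.jl's LinearProblem/solve interface; untested here):

julia> using LinearSolve

julia> A = rand(1000, 1000); b = rand(1000);

julia> prob = LinearProblem(A, b);  # set up the square system A*x = b

julia> sol = solve(prob);           # LinearSolve picks a suitable factorization/solver

julia> sol.u;                       # the solution vector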

One recurrent source of confusion is that the Q factor is a special type:

julia> a = randn(3,2);
julia> qr(a).Q
3×3 LinearAlgebra.QRCompactWYQ{Float64, Matrix{Float64}, Matrix{Float64}}:
 -0.458928   0.868731  -0.186255
 -0.840989  -0.357132   0.40644
  0.286569   0.343166   0.894492

julia> Matrix(qr(a).Q)
3×2 Matrix{Float64}:
 -0.458928   0.868731
 -0.840989  -0.357132
  0.286569   0.343166

This originates from the two notions of the QR decomposition of a rectangular matrix (the “full-size” and the “economy-size” factorizations). I didn’t look at the snippet you posted, but that’s something to keep in mind when comparing with other languages (I’m not sure what Python does).
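
A quick way to see the two sizes side by side (a minimal sketch; random 3×2 input, so only the deterministic checks are shown):

julia> using LinearAlgebra

julia> a = randn(3, 2);

julia> F = qr(a);

julia> Qthin = Matrix(F.Q);    # the 3×2 "economy-size" factor

julia> size(Qthin), size(F.R)
((3, 2), (2, 2))

julia> Qthin * F.R ≈ a         # the thin factors reproduce a
true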

In Python, I need to transpose the stacked matrix to get a 10×2 one:

X1 = np.vstack([X, np.ones(len(X))]).transpose()

Julia doesn't need that, since hcat (i.e. [X ones(length(X))]) already creates it with the right shape. That is one issue I found here by checking the values one by one. When I finish, I'll post the code back here.

But it's interesting how NumPy handles matrices; one needs to be aware of these caveats.
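
For example, with a length-10 X (a small sketch with made-up values, not my real data):

julia> X = collect(1.0:10.0);

julia> X1 = [X ones(length(X))];   # horizontal concatenation: already 10×2, no transpose

julia> size(X1)
(10, 2)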

Solved it here:

using LinearAlgebra
# df is the DataFrame read from https://bit.ly/3goOAnt (e.g. via CSV.jl + DataFrames.jl)

X, Y = df.x, df.y;
X1 = [X ones(length(X))];
QR = qr(X1);

(inv(QR.R) * QR.Q') * Y
 1.9393939393939394
 4.733333333333333

Even if you want the QR factors too, you can take the factorization object and use \ for the least-square solution:

julia> using LinearAlgebra

julia> X = rand(100,3); y = rand(100);

julia> F = qr(X);

julia> F \ y ≈ X \ y ≈ F.R \ (Matrix(F.Q)' * y) ≈ F.R \ (F.Q' * y)[1:size(X,2)]
true

One of the tricky things about the QR factorization is that there are two variants when the m \times n matrix X is “tall” (m > n) (as is typical in least-square problems): the “thin” variant X = \hat{Q}\hat{R} in which \hat{Q} is the same “tall” m \times n size as X and \hat{R} is square, and the “full” variant X = QR in which Q is a square unitary m \times m matrix and R is “tall” m \times n. If you do F = qr(X) then:

  • F.Q represents the square unitary matrix Q from the “full” QR factorization. (This is also what you get from Q, R = qr(X).) Technically, it doesn’t actually store this matrix explicitly — instead, it has a compact representation of the linear operator Qx so that F.Q * x and/or F.Q' * y (=Q^* y) can be computed quickly. (Essentially, this is what the underlying numerical algorithms “natively” give you.)
  • Matrix(F.Q) gives you \hat{Q}, the “thin” QR factor, as an explicit matrix. Computing this is not free, however — it basically works by multiplying F.Q by the first n columns of the identity matrix I — so you generally avoid constructing it if you can.
  • F.R always gives you the square upper-triangular matrix \hat{R} from the thin QR. (This is also what you get from Q, R = qr(X).)

When you find the least-square solution from QR, what you want is \hat{R}^{-1} \hat{Q}^* y using the thin QR factors. In particular, this means that if you want to compute \hat{Q}^* y you cannot simply use F.Q' * y, and instead need to either use something like Matrix(F.Q)' * y or, more efficiently, (F.Q' * y)[1:n].
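
Concretely, a small sketch of the thin-QR least-square solve (same random setup as the snippet above; n is just size(X, 2)):

julia> using LinearAlgebra

julia> X = rand(100, 3); y = rand(100); n = size(X, 2);

julia> F = qr(X);

julia> Matrix(F.Q)' * y ≈ (F.Q' * y)[1:n]   # both give thin-Q' * y; the second avoids forming thin Q
true

julia> F.R \ (F.Q' * y)[1:n] ≈ X \ y        # thin-QR least-square solution
true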

It’s generally better to use F \ y or even better X \ y (which uses a pivoted QR factorization automatically) to solve least-square problems:

  • F \ y will deal with the size of the Q matrix for you efficiently. It will also do the upper-triangular solve \hat{R}^{-1} \text{(vector)} efficiently — since this is an upper-triangular system you can solve it in O(n^2) operations using backsubstitution. (Get out of the habit of inverting matrices! inv(A) * b is almost never going to be the best way to solve a linear system.)
  • X \ y is even better, because it defaults to a pivoted QR factorization — corresponding to F = qr(X, ColumnNorm()) — which deals better with badly conditioned X (e.g. nearly parallel columns). That’s not what qr(X) gives you by default because it involves an extra permutation matrix compared to ordinary QR, i.e. it is a factorization XP = QR where P is a permutation. See also Efficient way of doing linear regression - #33 by stevengj
  • Even if you want to explicitly compute \hat{R}^{-1} z for some z, it makes my eyes water to see people write inv(R) * z (in Julia) or inv(R).dot(z) (in Python). Get out of the habit of inverting matrices. At the very least, do R \ z (in Julia) or scipy.linalg.solve(R, z) (in Python). Better yet, take advantage of the fact that R is upper-triangular by doing UpperTriangular(R) \ z (in Julia) or scipy.linalg.solve_triangular(R, z) (in Python). (See the sketch just after this list.)
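
A minimal sketch combining the pivoted factorization and the triangular solve (random data; only the checks are shown):

julia> using LinearAlgebra

julia> X = rand(100, 3); y = rand(100); z = rand(3);

julia> Fp = qr(X, ColumnNorm());              # column-pivoted QR, what X \ y uses for least squares

julia> Fp \ y ≈ X \ y
true

julia> UpperTriangular(Fp.R) \ z ≈ Fp.R \ z   # O(n^2) back-substitution, no inv(R) needed
true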

This is even worse than just inverting R (which you should also generally avoid) — by putting the parentheses like this, you are doing a matrix–matrix multiplication followed by a matrix–vector multiplication rather than two matrix–vector multiplications. (AB)x is much slower than A(Bx) when A and B are matrices and x is a vector.

If your matrices are small and you don’t care about performance, then maybe you don’t care, but you really want to get into the habit of thinking more carefully about this kind of thing in case your matrices ever get big. In any case, there are better ways to solve least-square problems as I explained above.
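
A generic illustration of the ordering point (just a sketch with dense random matrices, not specific to QR):

A = rand(500, 500); B = rand(500, 500); x = rand(500);

# (A*B)*x forms an O(n^3) matrix–matrix product first,
# while A*(B*x) needs only two O(n^2) matrix–vector products.
(A * B) * x ≈ A * (B * x)   # same result up to rounding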

PS. That being said, inv(R) * Q' does automagically pick the right size of Q for you — it ends up corresponding to \hat{R}^{-1} \hat{Q}^* from the thin QR. But it’s even easier to do QR \ Y or X \ Y, which do the right thing.
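
Reusing the names QR, X1, and Y from the solved snippet above, all three should agree (a sketch, not re-run here):

# All three give the same least-square coefficients:
QR \ Y ≈ X1 \ Y ≈ (inv(QR.R) * QR.Q') * Y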

The code is just for learning Julia. I'm porting a lot of exercises I have in Python, but your tips about performance are much appreciated. I will do it the way you have demonstrated, because in the real world we will be doing linear regressions on large samples. Thank you.