I’m not a numerical analyst, but won’t the condition number of `(X' * X) \ (X' * Y)` be the square of that of `X \ Y`?

It is, but that’s usually not a problem in statistical applications. If `X` is almost rank deficient, then you’ll have a lot of uncertainty about the parameter estimates, and the numerical noise amplified by the squared condition number will be negligible compared to that statistical uncertainty. In some applications, though, you expect a perfect fit, and in that case it can matter whether you use QR or Cholesky.
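
For reference, here is a short sketch of why the condition number squares. It assumes `X` has full column rank and uses the 2-norm condition number; the symbols $U$, $\Sigma$, $V$, $\sigma_i$, $p$ and $\kappa_2$ are my notation, not from the posts above. Writing the thin SVD of `X` and forming the normal-equations matrix gives:

$$
X = U \Sigma V^\top, \qquad \sigma_1 \ge \dots \ge \sigma_p > 0, \qquad \kappa_2(X) = \frac{\sigma_1}{\sigma_p}
$$

$$
X^\top X = V \Sigma^2 V^\top \quad\Longrightarrow\quad \kappa_2(X^\top X) = \frac{\sigma_1^2}{\sigma_p^2} = \kappa_2(X)^2
$$

As a rough rule of thumb, with $\kappa_2(X) \approx 10^6$ in double precision (machine epsilon about $2.2 \times 10^{-16}$), solving via `X \ Y` can keep around 10 significant digits, while going through `X' * X` with $\kappa_2 \approx 10^{12}$ leaves only about 4. That loss only matters when the fit is expected to be more accurate than that, which is the perfect-fit case mentioned above.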