Why is the lu() factorization not reducing the execution time?

I am using lu() to reduce the computation time, but it did not turn out as expected. Any idea why that is?

julia> A = [1 3 3 2;2 6 9 7; -1 -3 3 4]
3×4 Matrix{Int64}:
  1   3  3  2
  2   6  9  7
 -1  -3  3  4

julia> lu(A,check=false)
Failed factorization of type LU{Float64, Matrix{Float64}}

julia> b = [1,2,3]
3-element Vector{Int64}:
 1
 2
 3

julia> @btime A\b
  3.388 μs (37 allocations: 70.50 KiB)
4-element Vector{Float64}:
 -0.09523809523809525
 -0.28571428571428575
  0.21904761904761905
  0.3142857142857143

julia> @btime U\(L\b[p])
  4.029 μs (39 allocations: 71.08 KiB)
4-element Vector{Float64}:
 -3.333333333333332
  1.999999999999999
 -1.6666666666666656
  1.6666666666666674

First of all, you’re benchmarking in global scope with non-const variables. Second, the results are not the same?

What are U, L, and p?

First, the right way to use an LU factorization is to call \ with the factorization object, not to apply the L and U factors individually yourself:

julia> A = randn(1000,1000); b = randn(1000);

julia> LU = lu(A);

julia> @btime $A \ $b;
  9.904 ms (4 allocations: 7.64 MiB)

julia> @btime $LU \ $b;
  241.441 μs (1 allocation: 7.94 KiB)
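The factorization really pays off when you solve with several right-hand sides, since the expensive part is done only once. A minimal sketch (the matrix size and loop count here are arbitrary, not from the original posts):

```julia
using LinearAlgebra

A = randn(1000, 1000)
F = lu(A)                # O(n^3) factorization, done once

# Each subsequent solve reuses the stored factors,
# costing only O(n^2) for the two triangular solves.
for _ in 1:10
    b = randn(1000)
    x = F \ b
    @assert A * x ≈ b    # verify the solution
end
```
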

Second, your matrix is too small:

For a 3x3 matrix, the overhead of LAPACK etcetera is enormous. StaticArrays will be much faster for such small problems if the size is fixed:

julia> A = randn(3,3); b = randn(3);

julia> LU = lu(A);

julia> sA = SMatrix{3,3}(A); sb = SVector{3}(b);

julia> @btime $A \ $b;
  417.249 ns (3 allocations: 384 bytes)

julia> @btime $LU \ $b;
  167.488 ns (1 allocation: 112 bytes)

julia> @btime $sA \ $sb;
  6.537 ns (0 allocations: 0 bytes)

Your matrix is rank-deficient, so the decomposition fails - unless you’re tied to a particular decomposition, you can instead use factorize(A), which dispatches to a polyalgorithm according to the properties of A (see this chart, or the docs). For example,

julia> f = factorize(A)
QRPivoted{Float64, Matrix{Float64}}
Q factor:
3×3 LinearAlgebra.QRPackedQ{Float64, Matrix{Float64}}:
 -0.301511  -0.275241  -0.912871
 -0.904534  -0.220193   0.365148
 -0.301511   0.935819  -0.182574
R factor:
3×4 Matrix{Float64}:
 -9.94987  -5.4272   -8.14081      -1.80907
  0.0      -4.95434   1.65145      -1.65145
  0.0       0.0       1.50772e-15   2.36998e-16
permutation:
4-element Vector{Int64}:
 3
 2
 4
 1

julia> f\b
4-element Vector{Float64}:
 -0.09523809523809522
 -0.2857142857142857
  0.2190476190476191
  0.3142857142857143

Julia’s speed requires that the compiler can figure out the machine representation (the type) of all the variables participating in a calculation. With non-constant global variables, you can redefine their types in the middle of a calculation, or between function calls, which makes it impossible for the compiler to optimize functions that depend on such variables. There’s little point to benchmarking a function if it’s hobbled by global variables - we know it’ll be slow.
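Concretely, either interpolate the globals with $ in @btime, or put the work inside a function. A sketch (the matrix size is arbitrary):

```julia
using BenchmarkTools, LinearAlgebra

A = randn(100, 100)
b = randn(100)

# Bad: A and b are non-const globals, so the benchmark
# includes dynamic dispatch on their (unknown) types.
@btime A \ b;

# Good: $-interpolation passes the *values* into the benchmark,
# so it measures the solve itself.
@btime $A \ $b;

# Also good: real code should put the computation in a function,
# where the compiler can specialize on the argument types.
solve(A, b) = A \ b
@btime solve($A, $b);
```
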

The error message (“matrix is not square”) tells you exactly what is wrong — you can’t do an LU factorization of a non-square (3x4) matrix.

(In real applications, usually you know what kind of problem you are solving in advance, so you don’t need to resort to multiple factorization branches at runtime.)

See the \ documentation (type ?\ in the REPL):

For rectangular A the result is the minimum-norm least squares solution computed by a pivoted QR factorization of A and a rank estimate of A based on the R factor.

In your case, you have a 3x4 matrix A, so Ax=b is an under-determined system of equations with infinitely many solutions, and as explained above it returns the solution with the smallest possible norm.

inv(A) does not exist for a non-square matrix. However, A \ b is equivalent (numerically, not computationally) to pinv(A) * b, where pinv is the pseudo-inverse. Computationally, one rarely needs to compute a matrix inverse (or pseudo-inverse) explicitly.
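You can check this equivalence numerically, e.g. with the 3x4 matrix from the question:

```julia
using LinearAlgebra

A = [1 3 3 2; 2 6 9 7; -1 -3 3 4]
b = [1, 2, 3]

x1 = A \ b        # minimum-norm least-squares solution via pivoted QR
x2 = pinv(A) * b  # same solution, but via the (more expensive) pseudo-inverse

@assert x1 ≈ x2
```
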

Well, you could type ==(size(A)...) if you want something more compact. There is also LinearAlgebra.checksquare, but that throws an error if the matrix is non-square (it’s intended for use in functions that require a square matrix).
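For example, a quick sketch of both squareness checks:

```julia
using LinearAlgebra

A = randn(3, 4)

@show ==(size(A)...)    # false for a 3×4 matrix

try
    LinearAlgebra.checksquare(A)   # throws for non-square input
catch err
    @show typeof(err)              # DimensionMismatch
end
```
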

What is your application that you don’t know ahead of time if you want an exact solution or a least-square solution (whether your problem is over/underdetermined or square/non-singular)?

You can do qr(A, ColumnNorm()) \ b (or qr(A, Val(true)) \ b on Julia versions before 1.7) if you know that you always want the QR minimum-norm/least-squares solution (which gives the exact solution if A is square and invertible).
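A sketch showing that the same call handles both cases (ColumnNorm() requires Julia ≥ 1.7; the matrices here are arbitrary random examples):

```julia
using LinearAlgebra

# Rectangular (underdetermined, full row rank):
# returns the minimum-norm least-squares solution.
A = randn(3, 4); b = randn(3)
x = qr(A, ColumnNorm()) \ b
@assert A * x ≈ b

# Square and invertible: returns the exact solution.
B = randn(4, 4); c = randn(4)
y = qr(B, ColumnNorm()) \ c
@assert B * y ≈ c
```
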

Please take a look at the documentation, which should answer most of your questions.

It is generally faster and more numerically accurate to do A\B rather than inv(A)*B. More information is available in any modern linear algebra class or book that covers computation; if the book uses Matlab, it will explicitly use \.

Thank you for your guidance.

when the matrix is sparse…

Yes, because computing inv(A) internally performs an LU factorization and then does extra work on top of it, so inv(A)*b is always going to be slower than A\b, which reuses the same factorization with just two triangular solves.
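A small sketch of the comparison (the matrix size is arbitrary; both paths give the same answer, but inv does strictly more work):

```julia
using LinearAlgebra

A = randn(500, 500)
b = randn(500)

x1 = A \ b        # LU factorization + two triangular solves
x2 = inv(A) * b   # LU factorization + solves for all 500 columns
                  # of the identity + a matrix-vector product

@assert x1 ≈ x2   # numerically equivalent results
```
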