Why is lu() factorization not reducing the execution time?

I am using lu() to reduce the computation time, but it did not turn out as expected. Any idea why that is?

julia> A = [1 3 3 2;2 6 9 7; -1 -3 3 4]
3×4 Matrix{Int64}:
  1   3  3  2
  2   6  9  7
 -1  -3  3  4

julia> lu(A,check=false)
Failed factorization of type LU{Float64, Matrix{Float64}}

julia> b = [1,2,3]
3-element Vector{Int64}:
 1
 2
 3

julia> @btime A\b
  3.388 μs (37 allocations: 70.50 KiB)
4-element Vector{Float64}:
 -0.09523809523809525
 -0.28571428571428575
  0.21904761904761905
  0.3142857142857143

julia> @btime U\(L\b[p])
  4.029 μs (39 allocations: 71.08 KiB)
4-element Vector{Float64}:
 -3.333333333333332
  1.999999999999999
 -1.6666666666666656
  1.6666666666666674

First of all, you’re benchmarking in global scope with non-const variables; second, the results are not the same?

This is in local scope. As for the second point, yes, you are right about the results, but I guess this is due to the pseudo-inverse algorithm.

julia> @btime $A\$b
  3.800 μs (37 allocations: 70.50 KiB)
4-element Vector{Float64}:
 -0.09523809523809525
 -0.28571428571428575
  0.21904761904761905
  0.3142857142857143

julia> @btime $U\($L\$b[p])
  4.486 μs (39 allocations: 71.08 KiB)
4-element Vector{Float64}:
 -3.333333333333332
  1.999999999999999
 -1.6666666666666656
  1.6666666666666674

What are U, L, and p?

The LU factorization of the matrix, i.e., the results of the lu() function.
By the way, based on the results shown above, it seems there is no difference between benchmarking in global scope and in local scope with non-const variables, right? In other words, why do we have to put $ in front of non-const variables after @btime?

First, the right way to use an LU factorization is to call \ with the factorization object, not to individually apply the L and U factors yourself:

julia> A = randn(1000,1000); b = randn(1000);

julia> LU = lu(A);

julia> @btime $A \ $b;
  9.904 ms (4 allocations: 7.64 MiB)

julia> @btime $LU \ $b;
  241.441 μs (1 allocation: 7.94 KiB)

Second, your matrix is too small:

For a 3x3 matrix, the overhead of LAPACK etcetera is enormous. StaticArrays will be much faster for such small problems if the size is fixed:

julia> A = randn(3,3); b = randn(3);

julia> LU = lu(A);

julia> sA = SMatrix{3,3}(A); sb = SVector{3}(b);

julia> @btime $A \ $b;
  417.249 ns (3 allocations: 384 bytes)

julia> @btime $LU \ $b;
  167.488 ns (1 allocation: 112 bytes)

julia> @btime $sA \ $sb;
  6.537 ns (0 allocations: 0 bytes)
14 Likes

Thank you very much, @stevengj. So, how do I obtain the factorization object for the matrix below so that I can use it with \?

julia> A = [1 3 3 2;2 6 9 7; -1 -3 3 4];

julia> LU = lu(A,check=false);

julia> b = [1,2,3];

julia> @btime $A \ $b;
  2.812 μs (37 allocations: 70.50 KiB)

julia> @btime $LU \ $b;
ERROR: DimensionMismatch("matrix is not square: dimensions are (3, 4)")
Stacktrace:
  [1] checksquare
    @ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\LinearAlgebra\src\LinearAlgebra.jl:224 [inlined]     
  [2] getrs!(trans::Char, A::Matrix{Float64}, ipiv::Vector{Int64}, B::Vector{Float64})
    @ LinearAlgebra.LAPACK C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\LinearAlgebra\src\lapack.jl:1014
  [3] ldiv!
    @ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\LinearAlgebra\src\lu.jl:391 [inlined]
  [4] \(F::LinearAlgebra.LU{Float64, Matrix{Float64}}, B::Vector{Int64})
    @ LinearAlgebra C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\LinearAlgebra\src\factorization.jl:104
  [5] var"##core#275"(LU#273::LinearAlgebra.LU{Float64, Matrix{Float64}}, b#274::Vector{Int64})
    @ Main C:\Users\amroa\.julia\packages\BenchmarkTools\O3UL3\src\execution.jl:479
  [6] var"##sample#276"(__params::BenchmarkTools.Parameters)
    @ Main C:\Users\amroa\.julia\packages\BenchmarkTools\O3UL3\src\execution.jl:485
  [7] _run(b::BenchmarkTools.Benchmark, p::BenchmarkTools.Parameters; verbose::Bool, pad::String, kwargs::Base.Iterators.Pairs{Symbol, Integer, NTuple{4, Symbol}, NamedTuple{(:samples, :evals, :gctrial, :gcsample), Tuple{Int64, Int64, Bool, Bool}}})
    @ BenchmarkTools C:\Users\amroa\.julia\packages\BenchmarkTools\O3UL3\src\execution.jl:98
  [8] #invokelatest#2
    @ .\essentials.jl:710 [inlined]
  [9] #run_result#45
    @ C:\Users\amroa\.julia\packages\BenchmarkTools\O3UL3\src\execution.jl:33 [inlined]
 [10] run(b::BenchmarkTools.Benchmark, p::BenchmarkTools.Parameters; progressid::Nothing, nleaves::Float64, ndone::Float64, kwargs::Base.Iterators.Pairs{Symbol, Integer, NTuple{5, Symbol}, NamedTuple{(:verbose, :samples, :evals, :gctrial, :gcsample), Tuple{Bool, Int64, Int64, Bool, Bool}}})
    @ BenchmarkTools C:\Users\amroa\.julia\packages\BenchmarkTools\O3UL3\src\execution.jl:116
 [11] #warmup#54
    @ C:\Users\amroa\.julia\packages\BenchmarkTools\O3UL3\src\execution.jl:168 [inlined]
 [12] warmup(item::BenchmarkTools.Benchmark)
    @ BenchmarkTools C:\Users\amroa\.julia\packages\BenchmarkTools\O3UL3\src\execution.jl:168
 [13] top-level scope
    @ C:\Users\amroa\.julia\packages\BenchmarkTools\O3UL3\src\execution.jl:565

Another question: why does benchmarking have to be done in local scope rather than in global scope with non-const variables?

@stevengj
I tried to continue with this MWE and run julia> @btime $sLU \ $sb; as shown below,

sLU = SMatrix{size(LU,1),size(LU,2)}(LU);
ERROR: ArgumentError: Size mismatch in Static Array parameters. Got size Tuple{3, 3}, dimension 2 and length 1.
Stacktrace:
 [1] macro expansion
   @ C:\Users\amroa\.julia\packages\StaticArrays\vxjOO\src\util.jl:0 [inlined]
 [2] check_array_parameters(#unused#::Type{Tuple{3, 3}}, #unused#::Type{LinearAlgebra.LU{Float64, Matrix{Float64}}}, #unused#::Type{Val{2}}, #unused#::Type{Val{1}})
   @ StaticArrays C:\Users\amroa\.julia\packages\StaticArrays\vxjOO\src\util.jl:57
 [3] SMatrix{3, 3, LinearAlgebra.LU{Float64, Matrix{Float64}}, 1}(x::Tuple{LinearAlgebra.LU{Float64, Matrix{Float64}}})
   @ StaticArrays C:\Users\amroa\.julia\packages\StaticArrays\vxjOO\src\SArray.jl:22
 [4] macro expansion
   @ C:\Users\amroa\.julia\packages\StaticArrays\vxjOO\src\SMatrix.jl:36 [inlined]
 [5] SArray
   @ C:\Users\amroa\.julia\packages\StaticArrays\vxjOO\src\SMatrix.jl:32 [inlined]
 [6] (SMatrix{3, 3, T, L} where {T, L})(x::LinearAlgebra.LU{Float64, Matrix{Float64}})
   @ StaticArrays C:\Users\amroa\.julia\packages\StaticArrays\vxjOO\src\convert.jl:4
 [7] top-level scope
   @ REPL[28]:1

# Below is what I wanted to do:
#julia> @btime $sLU \ $sb;

Your matrix is rank-deficient, so the decomposition fails - unless you’re tied to a particular decomposition, you can instead use factorize(A), which dispatches to a polyalgorithm according to the properties of A (see this chart, or the docs). For example,

julia> f = factorize(A)
QRPivoted{Float64, Matrix{Float64}}
Q factor:
3×3 LinearAlgebra.QRPackedQ{Float64, Matrix{Float64}}:
 -0.301511  -0.275241  -0.912871
 -0.904534  -0.220193   0.365148
 -0.301511   0.935819  -0.182574
R factor:
3×4 Matrix{Float64}:
 -9.94987  -5.4272   -8.14081      -1.80907
  0.0      -4.95434   1.65145      -1.65145
  0.0       0.0       1.50772e-15   2.36998e-16
permutation:
4-element Vector{Int64}:
 3
 2
 4
 1

julia> f\b
4-element Vector{Float64}:
 -0.09523809523809522
 -0.2857142857142857
  0.2190476190476191
  0.3142857142857143

Julia’s speed requires that the compiler can figure out the machine representation (the type) of all the variables participating in a calculation. With non-constant global variables, you can redefine their types in the middle of a calculation, or between function calls, which makes it impossible for the compiler to optimize functions that depend on such variables. There’s little point to benchmarking a function if it’s hobbled by global variables - we know it’ll be slow.
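For concreteness, here is a rough sketch of what the $ interpolation changes (the variable names and sizes are just illustrative, and the exact timings will vary by machine):

using BenchmarkTools, LinearAlgebra

A = randn(3, 3)     # non-const globals: their type can change at any time,
b = randn(3)        # so code that reads them cannot be fully optimized

@btime A \ b;       # measured through the untyped globals (extra dispatch overhead)
@btime $A \ $b;     # $ splices the values in, as if A and b were typed local variables

const Ac = A        # alternatively, const globals have a fixed type
const bc = b
@btime Ac \ bc;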

2 Likes

@stillyslalom, thank you for your explanation. So, putting $ in front of the variables in @btime fixes their type during the benchmarking, right?

Can you tell me what I should do in my MWE to overcome the error?

using SparseArrays;
using LinearAlgebra;
using BenchmarkTools
using StaticArrays
#A = [1 3 3 2;2 6 9 7; -1 -3 3 4]; #"Rank-deficient and Decomposition fails"
A = [1 3 3 2;2 6 9 7; 0 -3 3 4];   #Rank is fine and decomposition succeeds
b = [1,2,3];

if rank(A) == size(A,1)
  println("Rank is fine and decomposition succeeds")
  LU = lu(A);
  @btime $A \ $b
  @btime $LU \ $b
else
  println("Rank-deficient and Decomposition fails")
  f = factorize(A);
  @btime $A \ $b
  @btime $f \ $b
end

ERROR: LoadError: DimensionMismatch("matrix is not square: dimensions are (3, 4)")
Stacktrace:
  [1] checksquare
    @ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\LinearAlgebra\src\LinearAlgebra.jl:224 [inlined]     
  [2] getrs!(trans::Char, A::Matrix{Float64}, ipiv::Vector{Int64}, B::Vector{Float64})
    @ LinearAlgebra.LAPACK C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\LinearAlgebra\src\lapack.jl:1014
  [3] ldiv!
    @ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\LinearAlgebra\src\lu.jl:391 [inlined]
  [4] \(F::LinearAlgebra.LU{Float64, Matrix{Float64}}, B::Vector{Int64})
    @ LinearAlgebra C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\LinearAlgebra\src\factorization.jl:104 
  [5] var"##core#289"(LU#269::LinearAlgebra.LU{Float64, Matrix{Float64}}, b#270::Vector{Int64})
    @ Main C:\Users\amroa\.julia\packages\BenchmarkTools\O3UL3\src\execution.jl:479
  [6] var"##sample#290"(__params::BenchmarkTools.Parameters)
    @ Main C:\Users\amroa\.julia\packages\BenchmarkTools\O3UL3\src\execution.jl:485
  [7] _run(b::BenchmarkTools.Benchmark, p::BenchmarkTools.Parameters; verbose::Bool, pad::String, kwargs::Base.Iterators.Pairs{Symbol, Integer, NTuple{4, Symbol}, NamedTuple{(:samples, :evals, :gctrial, :gcsample), Tuple{Int64, Int64, Bool, Bool}}})
    @ BenchmarkTools C:\Users\amroa\.julia\packages\BenchmarkTools\O3UL3\src\execution.jl:98
  [8] #invokelatest#2
    @ .\essentials.jl:710 [inlined]
  [9] #run_result#45
    @ C:\Users\amroa\.julia\packages\BenchmarkTools\O3UL3\src\execution.jl:33 [inlined]
 [10] run(b::BenchmarkTools.Benchmark, p::BenchmarkTools.Parameters; progressid::Nothing, nleaves::Float64, ndone::Float64, kwargs::Base.Iterators.Pairs{Symbol, Integer, NTuple{5, Symbol}, NamedTuple{(:verbose, :samples, :evals, :gctrial, :gcsample), Tuple{Bool, Int64, Int64, Bool, Bool}}})
    @ BenchmarkTools C:\Users\amroa\.julia\packages\BenchmarkTools\O3UL3\src\execution.jl:116
 [11] #warmup#54
    @ C:\Users\amroa\.julia\packages\BenchmarkTools\O3UL3\src\execution.jl:168 [inlined]
 [12] warmup(item::BenchmarkTools.Benchmark)
    @ BenchmarkTools C:\Users\amroa\.julia\packages\BenchmarkTools\O3UL3\src\execution.jl:168
 [13] macro expansion
    @ C:\Users\amroa\.julia\packages\BenchmarkTools\O3UL3\src\execution.jl:565 [inlined]
 [14] top-level scope
    @ c:\Users\amroa\OneDrive - polymtl.ca\AmroAlsabbagh\Code\Julia\MANAOob\3Phase\usingLU.jl:16
in expression starting at c:\Users\amroa\OneDrive - polymtl.ca\AmroAlsabbagh\Code\Julia\MANAOob\3Phase\usingLU.jl:11

The error message (“matrix is not square”) tells you exactly what is wrong — you can’t do an LU factorization of a non-square (3x4) matrix.

(In real applications, usually you know what kind of problem you are solving in advance, so you don’t need to resort to multiple factorization branches at runtime.)
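(If you really do need to branch at runtime, one possible sketch, using only the shape of A, is below; solve is just a hypothetical helper name, and it simply mirrors what factorize falls back to for rectangular input.)

using LinearAlgebra

# Hypothetical helper: pick a factorization based on the shape of A.
function solve(A::AbstractMatrix, b::AbstractVector)
    if size(A, 1) == size(A, 2)
        F = lu(A, check=false)
        issuccess(F) && return F \ b    # square and nonsingular: use the LU
    end
    return qr(A, Val(true)) \ b         # rectangular or singular: pivoted QR (least squares)
end

A = [1 3 3 2; 2 6 9 7; -1 -3 3 4]
b = [1, 2, 3]
solve(A, b)    # 3x4 input, so this takes the pivoted-QR branch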

1 Like

@stevengj
Got it. It also seems I have a misunderstanding about the following. What does it mean when we type A \ b as below? It is not inv(A)*b, right?

julia> A = [1 3 3 2;2 6 9 7; -1 -3 3 4];
julia> b = [1,2,3];
julia> A \ b
4-element Vector{Float64}:
 -0.09523809523809525
 -0.28571428571428575
  0.21904761904761905
  0.3142857142857143

What is the function in Julia to check whether a matrix is square, rather than typing size(A,1) == size(A,2)?

See the \ documentation (type ?\ in the REPL):

For rectangular A the result is the minimum-norm least squares solution computed by a pivoted QR factorization of A and a rank estimate of A based on the R factor.

In your case, you have a 3x4 matrix A, so Ax=b has more unknowns than equations (and this particular A is also rank-deficient); there are infinitely many x that minimize the residual norm, and as explained above \ returns the minimizer with the smallest norm.

inv(A) does not exist for a non-square matrix. However, A \ b is mathematically equivalent to pinv(A) * b, where pinv is the pseudo-inverse (though \ does not actually compute the pseudo-inverse). Computationally, one rarely needs to form a matrix inverse (or pseudo-inverse) explicitly.
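For example, a small check along those lines (not run above, just a sketch with the same A and b):

using LinearAlgebra

A = [1 3 3 2; 2 6 9 7; -1 -3 3 4]
b = [1, 2, 3]

x = A \ b          # minimum-norm least-squares solution (pivoted QR under the hood)
y = pinv(A) * b    # pseudo-inverse route: mathematically the same vector
x ≈ y              # should hold up to rounding

# Adding a nullspace component leaves the residual unchanged but increases the norm:
N = nullspace(A)                          # A is rank-deficient, so N is nonempty
v = N * ones(size(N, 2))
norm(A * (x + v) - b) ≈ norm(A * x - b)   # same residual
norm(x + v) > norm(x)                     # but a larger norm than x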

Well, you could type ==(size(A)...) if you want something more compact. There is also LinearAlgebra.checksquare, but that throws an error if the matrix is non-square (it’s intended for use in functions that require a square matrix).
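For example (with the 3x4 A from above):

using LinearAlgebra

A = [1 3 3 2; 2 6 9 7; -1 -3 3 4]

size(A, 1) == size(A, 2)        # false
==(size(A)...)                  # false; the same check written more compactly
LinearAlgebra.checksquare(A)    # throws DimensionMismatch because A is 3x4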

What is your application, such that you don’t know ahead of time whether you want an exact solution or a least-squares solution (i.e., whether your problem is over/under-determined or square and non-singular)?

You can do qr(A, Val(true)) \ b if you know that you always want the QR minimum-norm/least-squares solution (which gives the exact solution if A is square and invertible).
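For example, with the same A and b as above, the factorization object can be created once and reused (a small sketch):

using LinearAlgebra

F = qr(A, Val(true))   # pivoted QR (spelled ColumnNorm() in newer Julia versions)
x = F \ b              # same minimum-norm least-squares solution that A \ b returns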

4 Likes

Got it. Thank you very much for your explanation.
It seems that Mat1\Mat2 is much faster than using inv(Mat1)*Mat2 (given that Mat1 is square). Do you agree?

julia> s = [1 2; 3 4];

julia> b= [1;2];

julia> @btime $s \ $b;
  365.534 ns (3 allocations: 304 bytes)

julia> @btime inv($s)*$b;
  640.476 ns (7 allocations: 1.66 KiB)

Can we draw the following conclusions based on the results below? LU factorization should always give faster execution than no factorization. Using a sparse matrix is only helpful when the matrix is large.

julia> A = [1 3 3; 6 9 7; -3 3 4];

julia> As = sparse([1 3 3; 6 9 7; -3 3 4]);

julia> b = [1,2,3];

julia> LU = lu(A);

julia> LUs = lu(As);

julia> @btime $A \ $b;
  451.269 ns (3 allocations: 384 bytes)

julia> @btime $As \ $b;
  5.283 μs (70 allocations: 18.06 KiB)

julia> @btime $LU \ $b;
  130.135 ns (1 allocation: 112 bytes)

julia> @btime $LUs \ $b;
  315.063 ns (4 allocations: 400 bytes)

Please take a look at the documentation, which should answer most of your questions.

It is generally faster and numerically more accurate to do A\B rather than inv(A)*B. More information is available in any modern linear algebra class or book that covers the computational side. If the book uses Matlab, it will explicitly use \.
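As a quick illustration of the numerical point (the matrix here is just an ill-conditioned example, not from this thread):

using LinearAlgebra

n = 12
A = [1 / (i + j - 1) for i in 1:n, j in 1:n]   # Hilbert-like, badly conditioned
b = A * ones(n)

res_backslash = norm(A * (A \ b) - b)          # residual of the direct solve
res_inv       = norm(A * (inv(A) * b) - b)     # residual via the explicit inverse

# The backslash solve is backward stable, so res_backslash is typically tiny;
# forming inv(A) explicitly usually leaves a noticeably larger residual when A
# is ill-conditioned (and it is also slower, as the timings above suggest).
(res_backslash, res_inv)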

2 Likes

Thank you for the guidance.

…a sparse matrix is helpful when the matrix is actually sparse (mostly zeros), not merely when it is large.
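For example, a sketch of where SparseArrays really pays off (sizes chosen only for illustration; timings omitted since they depend on the machine):

using LinearAlgebra, SparseArrays, BenchmarkTools

n = 2_000
# A large tridiagonal system: only about 3n nonzeros out of n^2 entries.
As = spdiagm(-1 => fill(-1.0, n - 1), 0 => fill(2.0, n), 1 => fill(-1.0, n - 1))
Ad = Matrix(As)          # the same matrix stored densely
b  = rand(n)

@btime $As \ $b;         # sparse solve: works only on the nonzero structure
@btime $Ad \ $b;         # dense solve: O(n^3) work and O(n^2) memory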

1 Like

Yes, because doing A and then B is always going to be slower than doing just A; in this case, inv() needs an lu() internally anyway, so inv(A)*b performs the factorization plus extra work.

1 Like