Documentation of cholesky() and allocations

Hi,

Today I found out that after using C=cholesky(A), C.L allocates, while C.U does not. After changing a literally one line of my program from C.L to C.U, (which was using suspiciously high ram) I got a very, very significant reduction in allocations:

julia> @btime out_struct = beavar($model_type, $set_struct, $hyp_struct, $data_struct);
  161.937 s (111517 allocations: 39.60 GiB)

julia> @btime out_struct = beavar($model_type, $set_struct, $hyp_struct, $data_struct);
  151.357 s (105517 allocations: 502.98 MiB)

I checked ?cholesky and the docs and couldn’t find a mention of this. In the documentation there is a line

Iterating the decomposition produces the components L and U.

which to me always suggested they are both there.

Am I missing something obvious? Wouldn’t it be helpful if there was one line before the examples section in the documentation that warns about this? I asked AI and it said it is implied somewhere but I couldn’t find that either.

I really struggled where to put this topic, so decided to put it in offtopic. Feel free to move if there is a better place.

It’s not so simple: that is only true if getfield(C, :uplo) == 'U'. In some cases, you might have getfield(C, :uplo) == 'L', in which case C.U allocates and C.L does not. For example, if you are passing a Symmetric or Hermitian array, it depends on whether the input array stored the upper or lower triangle:

julia> A = rand(3,3); A = A'A;

julia> C = cholesky(hermitianpart(A, :U))
Cholesky{Float64, Matrix{Float64}}
U factor:
3Ă—3 UpperTriangular{Float64, Matrix{Float64}}:
 1.16081  0.726865  1.33298
  â‹…       0.551859  0.0797002
  â‹…        â‹…        0.271936

julia> getfield(C, :uplo)
'U': ASCII/Unicode U+0055 (category Lu: Letter, uppercase)

julia> C = cholesky(hermitianpart(A, :L))
Cholesky{Float64, Matrix{Float64}}
L factor:
3Ă—3 LowerTriangular{Float64, Matrix{Float64}}:
 1.16081    â‹…          â‹… 
 0.726865  0.551859    â‹… 
 1.33298   0.0797002  0.271936

julia> getfield(C, :uplo)
'L': ASCII/Unicode U+004C (category Lu: Letter, uppercase)

Out of curiosity, what are you doing with C that you need to work with C.L or C.U directly, instead of using the C object?

But doesn’t that mean, that the default behavior is in fact that C.L allocates and C.U does not unless you either specify it the other way around with an additional keyword or pass an input array that you have specified its triangle part? Or is it possible that my code switches without me knowing?

If that is the default behaviour, stupid people like me could benefit from having that information somewhere…

Regarding your curiosity, the answer might be a bit disappointing. I am not doing anything interesting and do not have a special reason to not use the cholesky object. I just didn’t know any better. To be honest I didn’t even know that was possible until today, when I saw another thread here that you can do that.

I simply wrote a code how I would write it in matlab due to my inexperience with Julia (honestly probably too much experience in Matlab that hinders me to learn other things). I needed the lower triangular Cholesky decomposition and it’s transpose and I thought that since the cholesky object has both already, I can just call them instead of transposing :smiley:

To answer the question my code comes from the two equations on page 6 here., where I have a large variance covariance matrix \mathbf{K_A} and an expression that evaluates to a vector, call it \mathbf{b} and the goal is to evaluate \mathbf{\hat{\beta}} = \mathbf{K_A}^{-1}\ \mathbf{b}. To avoid inverting \mathbf{K_A} the paper calculates the following quantity:

\mathbf{\hat{\beta}} = \mathbf{C_{K_A}'} \backslash (\mathbf{C_{K_A}} \backslash \mathbf{b} ),

where \mathbf{C_{K_A}} is the lower triangular cholesky matrix. I translated that to:

 ldiv!(cholK_β.U,ldiv!((cholK_β.L),prior_mean))

The inner part allocates 20MB this way

ldiv!((cholK_β.L),prior_mean)

and none that way

ldiv!((cholK_β.U'),prior_mean)

I saw another thread today where they do use the cholesky object with a three-argument ldiv! and I translated it to my code but didn’t see much difference between me using the C.L directly and using the object.

julia> @btime ldiv!($beta_hat,$cholK_β,$prior_mean);
  1.754 ms (0 allocations: 0 bytes)

julia> @btime ldiv!(($cholK_β.U'),$prior_mean);
  341.000 ÎĽs (0 allocations: 0 bytes)

Using the object with C \setminus b is equivalent to U \setminus L \setminus b, not to just L \setminus b. That being said, there is an issue (hopefully to be resolved soon) where the Cholesky-object ldiv! is currently slower than it needs to be — see Ldiv for Cholesky is slower than two substitutions - #6 by danielwe and Triangular solve and inversions have poor performance · Issue #1405 · JuliaLang/LinearAlgebra.jl · GitHub

PS. I would generally wrap C.U in an UpperTriangular object to make sure that Julia exploits its structure, although in practice ldiv! checks for this.