Turns out it’s faster to avoid the extra preparation anyway. You can use an adjoint wrapper instead of materializing L
, and the resulting computations are even faster:
julia> function myldiv(chol, x)
if chol.uplo == 'U'
return ldiv!(chol.U, chol.U' \ x)
elseif chol.uplo == 'L'
return ldiv!(chol.L', chol.L \ x)
else
error("unreachable")
end
end
myldiv (generic function with 1 method)
julia> @benchmark myldiv($Achol, $b)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
Range (min … max): 120.500 μs … 276.959 μs ┊ GC (min … max): 0.00% … 0.00%
Time (median): 129.959 μs ┊ GC (median): 0.00%
Time (mean ± σ): 130.370 μs ± 8.761 μs ┊ GC (mean ± σ): 0.00% ± 0.00%
▂▁ █
██▆▄▅▅▅▄▄▄▃▃▃▃▃▃▃▄▄██▄▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂▂▂▂▂▂▂▂▂ ▃
120 μs Histogram: frequency by time 166 μs <
Memory estimate: 8.00 KiB, allocs estimate: 1.
Compare this to my timings above for the U \ (L \ $b)
version on the same computer:
Range (min … max): 132.750 μs … 227.459 μs ┊ GC (min … max): 0.00% … 0.00%
Time (median): 145.292 μs ┊ GC (median): 0.00%
Time (mean ± σ): 145.427 μs ± 5.723 μs ┊ GC (mean ± σ): 0.00% ± 0.00%
▁▁▄▄▄▄▄▄▃▂▄▄▄▄▆▅█▆▅▄▆▄▄▅▅▂▃
▁▁▁▁▂▃▄▅▄▇▇███████████████████████████▇▇▅▆▄▃▃▃▂▂▂▂▂▂▂▂▂▂▁▁▁▁▁ ▅
133 μs Histogram: frequency by time 162 μs <
Memory estimate: 16.00 KiB, allocs estimate: 2.
(Note that I used ldiv!
to avoid an extra allocation in the new version, but that doesn’t significantly affect the timings, so the comparison is still valid. The O(N) allocation cost is dwarfed by the O(N^2) computation cost at this size.)