Hello, this is about a recursive implementation of the sqrtm
algorithm. The idea is that, instead of finding the solution X
to X^2 = A
on a pointwise basis, calculating each entry starting with the diagonal and working outwards, A
is split up into [A11 A12; 0 A22]
, smaller solutions X11, X22
are found as before, and X12
is the result of the Sylvester equation. This follows Deadman Higham Ralha and should produce a sizeable speedup. … But it’s not as fast as I would like it to be.
Any improvements that can be made to recsqrtm
? Also in memory usage
function recsqrtm{T}(A::AbstractArray{T})
S = schurfact(A)
S[:vectors] * _recsqrtm(S[:Schur]) * S[:vectors]'
end
function _recsqrtm{T}(A::AbstractArray{T})
n = Base.LinAlg.checksquare(A)
t = typeof(sqrt(zero(T)))
R = zeros(t,n,n)
@inbounds if n == 1
R[1,1] = sqrt(A[1,1])
elseif n == 2
R[1,1] = r11 = sqrt(A[1,1])
R[2,2] = r22 = sqrt(A[2,2])
R[1,2] = A[1,2] / (r11 + r22)
else
n1 = Int(ceil(n/2))
r1 = 1:n1
r2 = (n1+1):n
R[r1,r1] = _recsqrtm(A[r1,r1])
R[r2,r2] = _recsqrtm(A[r2,r2])
R[r1,r2] = sylvester(R[r1,r1],R[r2,r2],-A[r1,r2])::Matrix{t}
end
R
end
From ProfileView
most of the time is spent in trsyl!
.
How does one calibrate this kind of thing? See below for block-wise recursive code. How does one choose an appropriate block size? The block size (the point at which one reverts to the pointwise sqrtm
method) doesn’t seem to matter much, whether 1
(as above, in recsqrtm
) or 64
as below. Is there a standard set of matrices/matrix types to test against, or systems to test on? Looks to me like recursive performance is an improvement from matrices of dimension 1000 - 2000, so should that be hard-coded? What about parallel computation?
function _blksqrtm{T}(A::AbstractArray{T})
n = Base.LinAlg.checksquare(A)
@inbounds if n < 65
return sqrtm(UpperTriangular(A),Val{true})
else
t = typeof(sqrt(zero(T)))
R = zeros(t,n,n)
n1 = Int(ceil(n/2))
if !iszero(A[n1+1,n1]) # hit a 2x2-block on diagonal, avoid
n1 -= 1
end
r1 = 1:n1
r2 = (n1+1):n
R[r1,r1] = _blksqrtm(A[r1,r1])
R[r2,r2] = _blksqrtm(A[r2,r2])
R[r1,r2] = sylvester(R[r1,r1],R[r2,r2],-A[r1,r2])::Matrix{t}
end
R
end